Hi, I am trying to use STAR-2.7.7a to align my libraries to my genome assembly, but I received these messages:
Jan 28 23:33:02 ..... started STAR run
Jan 28 23:33:02 ... starting to generate Genome files
EXITING because of INPUT ERROR: could not open genomeFastaFile: /scratch/slin023/scratch/star/asm.contigs.filtered.fasta
Jan 28 23:33:02 ...... FATAL ERROR, exiting
EXITING because of fatal input ERROR: could not open readFilesIn=/home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz,
Jan 28 23:33:02 ...... FATAL ERROR, exiting
Here is my script:
#!/bin/bash
#SBATCH --qos pq_mdegenna
#SBATCH --account iacc_mdegenna
#SBATCH --partition IB_16C_96G
#SBATCH -n 16
#SBATCH -N 1
#SBATCH --output=log
export PATH=$PATH:/home/slin023/STAR-2.7.7a/bin/Linux_x86_64/
##build Genome indices
/home/slin023/STAR-2.7.7a/bin/Linux_x86_64/STAR --runThreadN 16 --runMode genomeGenerate --genomeDir /scratch/mdegenna/slin023/star/ --genomeFastaFiles /scratch/slin023/scratch/star/asm.contigs.filtered.fasta
##mapping to genome indices
/home/slin023/STAR-2.7.7a/bin/Linux_x86_64/STAR --runThreadN 16 --genomeDir /scratch/mdegenna/slin023/star/ --readFilesIn /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz, /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_2P_second.fastq.gz, /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-male-Phormia_Filtered_1P_second.fastq.gz, /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-male-Phormia_Filtered_2P_second.fastq.gz
I checked the space and it does not seem to be a problem. I also checked whether the input read files exist:
ls -l /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz
-rw-r--r-- 1 slin023 hpc_mdegenna 13251209254 Jan 20 17:55 /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz
Any advice is welcome. Thank you for your time.
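Two things stand out from the log and the script. The `readFilesIn` error prints the file name with a trailing comma, which suggests that the space after each comma makes STAR treat the comma as part of the path; STAR expects comma-separated lists without spaces. Also, the FASTA path (/scratch/slin023/scratch/star/...) does not follow the same layout as the genome directory (/scratch/mdegenna/slin023/star/), so it may be worth double-checking for a typo. A minimal sketch of a pre-flight check that could be run on the compute node before calling STAR; `check_inputs` is a hypothetical helper, not part of STAR:

```shell
#!/bin/bash
# Hypothetical helper: report every path that is missing or unreadable.
# Returns non-zero if at least one input cannot be read.
check_inputs() {
    local status=0
    local f
    for f in "$@"; do
        if [ -r "$f" ]; then
            echo "OK       $f"
        else
            echo "MISSING  $f"
            status=1
        fi
    done
    return "$status"
}

# Example usage (paths copied from the script above):
# check_inputs /scratch/slin023/scratch/star/asm.contigs.filtered.fasta \
#     /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz
```

A path passed with a stray trailing comma would show up as MISSING here, which is the same thing STAR is complaining about.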
Hello, thank you for answering. I have tried it, but some input still can't be found:
Here is my script:
Not sure if we made progress but it looks like the program ran for 40 min or so. Did it produce any alignment files? You will need to figure out why gzip can't find those sequence files. Are those actually gzipped?
What do you get if you do
file /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz
?
Also take this warning into consideration, if you have not already done so.
When I typed what you suggested, I got this:
And I did get an output alignment file, but it's empty:
So does that mean I have to add
--genomeSAindexNbases 13
to scale it down?
Are you working on Windows? The file seems to be present in that location, which is good. Can you show us the output of
zcat /home/data/FLAG/PhormiatranscriptsSep2020/Blowfly-female-Phormia_Filtered_1P_second.fastq.gz | head -4
. Just want to make sure the file is compressed using gzip.
And you are correct in that the alignment file is empty. Can you check
Log*
files to see if there is any diagnostic information in them?
The zcat command shows:
I am not sure if I typed it right.
And no, I am working on a Mac.
The data file looks good and should be readable by STAR.
I was asking you to open those Log files (you can use TextEdit on a Mac) and see if there are additional error messages in there that may help us.
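If reading through the whole file is tedious, grepping for errors and warnings is a quicker way to pull out the relevant lines. A minimal sketch, assuming the logs sit in one directory under STAR's default names (Log.out, Log.final.out); `scan_star_logs` is a made-up helper name:

```shell
#!/bin/bash
# Hypothetical helper: print lines that mention errors or warnings
# from the STAR logs in the given directory (default: current directory).
scan_star_logs() {
    local dir="${1:-.}"
    local log
    for log in "$dir"/Log.out "$dir"/Log.final.out; do
        # Skip logs that were never written.
        [ -f "$log" ] || continue
        # -H: prefix with file name, -n: line numbers, -i: case-insensitive.
        grep -H -n -i -E 'error|warning' "$log" || true
    done
    return 0
}

# Example usage: scan_star_logs /path/to/star/output
```

No output means neither log mentions an error or warning, in which case the mapping statistics in Log.final.out are the next place to look.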
This is the "Log.final.out" file:
Is it worth uncompressing all the files and running it again?
This is the "Log.out" file:
No. Your alignment is working.
You have not provided a crucial option
--readFilesIn
right before where you provide those fastq file names. Can you add that? You also need to separate the list of R1 and R2 files by a space.
Hello, I actually took someone's advice and gunzipped the files, and I received these: recommended --genomeSAindexNbases 13
How do I fix the string length? Just in case, here is my script, in which I separated the R1 and R2 files by a space:
Looks like your fastq files are messed up in some way. What have you done to them in terms of trimming etc? I suggest you go back to original and let STAR handle soft-clipping parts of reads that don't map.
Hello, just letting you know that I figured it out. Trimmomatic messed up the files, and STAR will auto-trim anyway, so I used the original library, created an unsorted BAM file, and sorted the output with samtools. Thank you for your help!
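For anyone landing here later, a rough sketch of what the working setup might look like. It assumes the untrimmed FASTQ files are gzipped and that the R1 and R2 lists are in the same sample order. The file names, the output name Aligned.sorted.bam, and the helpers `join_by_comma` and `run_alignment` are placeholders for illustration; `gunzip -c` is used as the read command (on Linux `zcat` also works).

```shell
#!/bin/bash
# Hypothetical helper: join arguments with commas and no spaces,
# which is the list format STAR expects for --readFilesIn.
join_by_comma() {
    local IFS=,
    echo "$*"
}

# Placeholder file names; replace with the original (untrimmed) libraries.
R1=$(join_by_comma female_R1.fastq.gz male_R1.fastq.gz)
R2=$(join_by_comma female_R2.fastq.gz male_R2.fastq.gz)

# Hypothetical wrapper: align with STAR, then coordinate-sort with samtools.
# The R1 list and the R2 list are separated by a single space.
run_alignment() {
    STAR --runThreadN 16 \
         --genomeDir /scratch/mdegenna/slin023/star/ \
         --readFilesIn "$R1" "$R2" \
         --readFilesCommand gunzip -c \
         --outSAMtype BAM Unsorted
    # With no --outFileNamePrefix, STAR writes Aligned.out.bam in the working directory.
    samtools sort -@ 16 -o Aligned.sorted.bam Aligned.out.bam
}

# Call run_alignment from the job script once STAR and samtools are on the PATH.
```

Building the lists with a helper avoids the stray space after each comma that caused the original "could not open readFilesIn" error.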