Hi Bioinformaticians,
I ran STAR with this command:
STAR --runThreadN ${NSLOTS} \
--runMode genomeGenerate \
--genomeDir /home/doan/hg38/hg38_index_new \
--genomeFastaFiles /home/doan/hg38/Homo_sapiens.GRCh38.dna_sm.prima$
--sjdbGTFfile /home/doan/hg38/Homo_sapiens.GRCh38.107.gtf \
--sjdbOverhang 99
I got this output:
chrLength.txt geneInfo.tab sjdbInfo.txt
chrNameLength.txt Genome sjdbList.fromGTF.out.tab
chrName.txt genomeParameters.txt sjdbList.out.tab
chrStart.txt Log.out transcriptInfo.tab
exonGeTrInfo.tab SA
exonInfo.tab SAindex
The total size of these genome indices is 28 GB. Then I ran the alignment, but all of the output files have a size of 0. Would anyone please tell me what is wrong?
Creating the genome indices took less than 1 hour and produced the output I listed above, but the alignment took less than 1 minute, so as you said, something is wrong here. I submitted it as a job on the server; a job disappears from the status list when it finishes.
Are you sure the genome generate job completed successfully i.e. there were no errors? Can you show a listing of the files above so we can see their sizes? e.g.
ls -lh *
I run STAR by submitting a script, so an error, if there is one, would not show up the way it does when running directly from the shell.
-rw-r-----. 1 doanc2 doanc2 1.2K Aug 4 14:54 chrLength.txt
-rw-r-----. 1 doanc2 doanc2 3.1K Aug 4 14:54 chrNameLength.txt
-rw-r-----. 1 doanc2 doanc2 1.9K Aug 4 14:54 chrName.txt
-rw-r-----. 1 doanc2 doanc2 2.1K Aug 4 14:54 chrStart.txt
-rw-r-----. 1 doanc2 doanc2 56M Aug 4 14:53 exonGeTrInfo.tab
-rw-r-----. 1 doanc2 doanc2 23M Aug 4 14:54 exonInfo.tab
-rw-r-----. 1 doanc2 doanc2 2.4M Aug 4 14:53 geneInfo.tab
-rw-r-----. 1 doanc2 doanc2 3.0G Aug 4 15:38 Genome
-rw-r-----. 1 doanc2 doanc2 844 Aug 4 15:38 genomeParameters.txt
-rw-r-----. 1 doanc2 doanc2 34K Aug 4 15:38 Log.out
-rw-r-----. 1 doanc2 doanc2 24G Aug 4 15:38 SA
-rw-r-----. 1 doanc2 doanc2 1.5G Aug 4 15:38 SAindex
-rw-r-----. 1 doanc2 doanc2 12M Aug 4 15:34 sjdbInfo.txt
-rw-r-----. 1 doanc2 doanc2 12M Aug 4 14:54 sjdbList.fromGTF.out.tab
-rw-r-----. 1 doanc2 doanc2 8.8M Aug 4 15:34 sjdbList.out.tab
-rw-r-----. 1 doanc2 doanc2 16M Aug 4 14:54 transcriptInfo.tab
I am not sure whether this error is related, but this is the content of a file named star.e101042:
/usr/global/sge/default/spool/fenn03/job_scripts/101042: line 62: let: TOTAL=1659388592 - : syntax error: operand expected (error token is "- ")
(standard_in) 2: syntax error
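For reference, the `let` message above has the shape bash produces when the second operand of a subtraction expands to an empty string. The sketch below shows one way to guard against that. `TOTAL` is taken from the error message; `START` and `END` are hypothetical names, since line 62 of the job script is not shown in this thread.

```shell
END=1659388592
START=""    # hypothetical: the start timestamp was never set

# Default an empty START to END so the subtraction always has two operands
START=${START:-$END}
TOTAL=$(( END - START ))
echo "elapsed seconds: $TOTAL"
```

This only guards the timing arithmetic. It would not by itself explain empty alignment output.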
These files look to be about the right size when I compare them (qualitatively), so I am going to hazard a guess that the index is good.
What do you see when you do
tail -n 6 Log.out
in the directory above? It should show something like the following if the index is complete.
If you copied and pasted the genome generate command from something like a PDF, it is possible that the hyphens were converted to "smart hyphens" (are you on macOS by chance?).
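A quick way to test the "smart hyphen" idea is to search the script for any non-ASCII bytes. This is a minimal sketch: `demo_cmd.sh` and the helper name `find_non_ascii` are made up for illustration, and the demo file deliberately contains an en dash on its first line.

```shell
# Hypothetical script: line 1 uses an en dash (U+2013) where "--" was intended
printf 'STAR \342\200\223runMode genomeGenerate\nSTAR --runMode genomeGenerate\n' > demo_cmd.sh

# Print (with line numbers) every line containing a byte that is neither
# printable ASCII nor whitespace; LC_ALL=C makes grep work byte by byte
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}

find_non_ascii demo_cmd.sh
```

If this prints nothing for the real job script, typographic dashes are probably not the problem.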
Yes, I am on macOS, and I was surprised that when I typed a period here, it was converted to a question mark.
tail -n 6 Log.out
Number of fastq files for each mate = 1
EXITING because of fatal input ERROR: could not open readFilesIn=Read1
I was asking you for the tail output of the Log.out file for index creation. It looks like you probably ran the alignment in the same directory, so that output must have been overwritten by the one for the alignment.
It looks like your input file is not in the same directory, does not have the proper path, or does not have the correct name. Which of the three is the issue?
Sorry for misunderstanding your request. Here is the output of the Log.out file for index creation.
Looks like your index is OK. So the issue you are having with alignments should not be related to the index.
Thank you so much for your help! Would you have any recommendations for me to fix the alignment issue?
Is this error the same referred to in your prior thread?
While it is not recommended, you could simply type the STAR command out at the login/head node prompt and see if the job starts (running it interactively). Be ready to kill the job (Ctrl+C) so it does not actually continue running. Once you know the command works (i.e. it does not generate any errors), you can simply copy/paste it into your job submission script. This will help you debug issues with file paths etc.
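Since the fatal error above is about opening the read file, a small pre-flight check in the job script can surface path problems before STAR starts. This is only a sketch: `check_inputs` is a made-up helper and the paths are placeholders, not the poster's actual files. Absolute paths are used on the assumption that a batch job may not start in the directory it was submitted from.

```shell
# Report every argument that is not a readable file; return non-zero if any is missing
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Usage with placeholder paths
check_inputs /path/to/sample_R1.fastq.gz /path/to/sample_R2.fastq.gz \
    || echo "fix the input paths before submitting"
```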
You answered the title question. Yes, it is. The output files after alignment have a size of 0, so converting a bad SAM file to a BAM file doesn't make any sense. Do I need to create a new thread?
Did you try running the STAR command directly on the terminal prompt as I suggested above?
Aug 09 12:43:41 ..... started STAR run
Aug 09 12:43:41 ..... loading genome
Aug 09 12:45:05 ..... started mapping
OK, whatever this command line is, it seems to be working. Kill this job and then copy this command into your job submission script.
GenoMax, I opened the file created by the submitted job and got this:
cat: /tmp/101168.1.all.q/machines: No such file or directory
EXITING because of fatal input ERROR: could not open readFilesIn=Read1
Aug 09 17:53:27 ...... FATAL ERROR, exiting
/usr/global/sge/default/spool/fenn03/job_scripts/101168: line 76: --runMode: command not found
/usr/global/sge/default/spool/fenn03/job_scripts/101168: line 77: --genomeDir: command not found
/usr/global/sge/default/spool/fenn03/job_scripts/101168: line 78: --readFilesIn: command not found
/usr/global/sge/default/spool/fenn03/job_scripts/101168: line 79: --outSAMtype: command not found
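The `command not found` messages for `--runMode`, `--genomeDir` and the other options mean the shell treated each option line as a separate command. That typically happens when the line continuation before it is broken, for example a backslash followed by trailing whitespace (or Windows line endings), or a missing backslash. STAR would then have run without those options, which would be consistent with it failing to open `Read1`. The sketch below lists lines whose backslash is followed by whitespace; `demo_job.sh` and `find_broken_continuations` are made-up names for illustration.

```shell
# Hypothetical job script: line 1 has a space after the backslash, line 2 does not
printf 'STAR --runThreadN 4 \\ \n--runMode alignReads \\\n--genomeDir idx\n' > demo_job.sh

# Print (with line numbers) lines where a backslash is followed by trailing
# whitespace, which stops the shell from joining the next line to the command
find_broken_continuations() {
    grep -nE '\\[[:space:]]+$' "$1"
}

find_broken_continuations demo_job.sh
```

A line with no backslash at all will not be caught by this check, so the lines just before the ones named in the error messages (75 to 78 here) are worth inspecting by eye as well.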
The error when running STAR:
line 69: let: TOTAL=1660095741 - : syntax error: operand expected (error token is "- ")
(standard_in) 2: syntax error