Dear All,
I am currently analyzing mouse RNA-seq samples. I used TopHat 2.1.1 and Cufflinks 2.2.1 for my analysis, and I downloaded the latest version of mm10 (genome and GTF) from iGenomes. I am facing an issue at the following steps:
Cuffmerge step:
I noticed the following error messages in the cuffmerge log file, although the merged file (merged.gtf) was still generated in the output directory.
Error (GFaSeqGet): end coordinate (117274415) cannot be larger than sequence length 115169878
Error (GFaSeqGet): end coordinate (117981028) cannot be larger than sequence length 115169878
...
Error (GFaSeqGet): end coordinate (85529519) cannot be larger than sequence length 59373566
Cuffdiff step:
All the output files in the cuffdiff directory are empty. When I checked the cuffdiff log file, I noticed the following error messages.
Error (GFaSeqGet): end coordinate (117135884) cannot be larger than sequence length 115169878
.....
Error (GFaSeqGet): end coordinate (61176309) cannot be larger than sequence length 59128983
Error (GFaSeqGet): end coordinate (61228418) cannot be larger than sequence length 59128983
This contig will not be bias corrected.
Warning: couldn't find fasta record for 'chrUn_JH584304'!
This contig will not be bias corrected.
GffObj::getSpliced() error: improper genomic coordinate 3078823 on chrX for TCONS_00034613
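The GFaSeqGet messages say that an annotation record ends beyond the end of the corresponding FASTA sequence, which is what one would expect if the GTF and the genome FASTA do not come from the same build. A minimal sketch of a sanity check follows, assuming uncompressed FASTA and GTF files; the function names (`fasta_lengths`, `gtf_problems`, `check_files`) are made up for illustration and are not part of TopHat or Cufflinks.

```python
def fasta_lengths(lines):
    """Return {sequence_name: length} from an iterable of FASTA lines."""
    lengths = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def gtf_problems(lines, lengths):
    """Return (seqname, end, issue) for GTF records that do not fit the FASTA."""
    problems = []
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        # GTF columns: seqname, source, feature, start, end, ...
        seqname, end = fields[0], int(fields[4])
        if seqname not in lengths:
            problems.append((seqname, end, "missing from FASTA"))
        elif end > lengths[seqname]:
            issue = "end beyond sequence length %d" % lengths[seqname]
            problems.append((seqname, end, issue))
    return problems


def check_files(fasta_path, gtf_path):
    """Print every GTF record that is inconsistent with the FASTA file."""
    with open(fasta_path) as fasta, open(gtf_path) as gtf:
        for seqname, end, issue in gtf_problems(gtf, fasta_lengths(fasta)):
            print(seqname, end, issue, sep="\t")
```

Calling `check_files("genome.fa", "genes.gtf")` (both paths are placeholders) on the genome and GTF actually passed to cuffmerge and cuffdiff should list the same contigs that appear in the log, which helps confirm whether the two files are mismatched.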
Dear Devon, thanks for getting back to me. I really appreciate it.
Sure, I will check the file as you mentioned. I used the following commands in my analysis:
Tophat command:
Cufflinks command:
Cuffmerge command:
Dear Devon,
When I used the Ensembl mouse GTF file, I didn't encounter this error. But with the UCSC mouse GTF file, I got the same kind of error in a different mouse project.
A colleague on my team did the RNA-seq analysis using the UCSC mouse GTF file and generated some output files. His gene_exp.diff output has 1400 significant genes, but when I redid the analysis using the Ensembl mouse GTF file, I got only 600 genes. I can't check with him now because he has moved to a different place. Is the difference in the number of genes due to the Ensembl GTF file?
Possibly, but it's impossible to say without knowing exactly what was done before.
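If both gene_exp.diff files are still available, one rough way to see where the two runs disagree is to compare the sets of genes flagged as significant. The sketch below assumes the standard tab-separated cuffdiff output with `gene` and `significant` columns; the helper name `significant_genes` is made up for illustration, and comparing by gene name is only approximate because UCSC and Ensembl annotations do not always name genes identically.

```python
import csv


def significant_genes(lines):
    """Return the set of gene names marked significant in gene_exp.diff lines."""
    genes = set()
    for row in csv.DictReader(lines, delimiter="\t"):
        # cuffdiff writes "yes" or "no" in the "significant" column.
        if row["significant"] == "yes" and row["gene"] != "-":
            # One row may list several comma-separated gene names.
            genes.update(row["gene"].split(","))
    return genes


def compare_runs(path_a, path_b):
    """Print how many significant genes are shared or unique to each run."""
    with open(path_a, newline="") as a, open(path_b, newline="") as b:
        genes_a, genes_b = significant_genes(a), significant_genes(b)
    print("shared:", len(genes_a & genes_b))
    print("only in first:", len(genes_a - genes_b))
    print("only in second:", len(genes_b - genes_a))
```

Calling `compare_runs("ucsc/gene_exp.diff", "ensembl/gene_exp.diff")` (placeholder paths) will not explain the difference on its own, but it shows whether the 600 genes are mostly a subset of the 1400 or a largely different list.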