6.7 years ago
sinumolg
Hi,
I am doing RNA-seq analysis of two bacterial strains. When running TopHat on one of the files, it shows the following error: "qual length (30) differs from seq length (81) for fastq record SRR3194957.20971384.2!"
Can anybody help us fix this? Thanks in advance.
You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or bbmap, or pseudo-alignment using salmon.
It appears that you have a corrupt FASTQ file in which, for at least one record, the length of the quality string does not match the number of bases. You may need to re-download the file. I suggest you get the FASTQ directly from EBI-ENA.
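If you want to confirm how many records are affected before re-downloading, something like the following might help. It is only a rough sketch, not part of TopHat or any other tool: the function names `open_fastq` and `find_bad_records` are made up for illustration, and it assumes a plain four-line-per-record FASTQ (no wrapped sequence lines), optionally gzipped.

```python
import gzip
import sys


def open_fastq(path):
    """Open a FASTQ file as text; a '.gz' suffix is assumed to mean gzip."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def find_bad_records(handle):
    """Yield (record_number, read_id, seq_len, qual_len) for every record
    whose quality string length differs from its sequence length.

    Assumes exactly four lines per record: header, sequence, '+', quality.
    A truncated final record shows up as a mismatch because the missing
    lines are treated as empty strings.
    """
    lines = (line.rstrip("\r\n") for line in handle)
    record_number = 0
    while True:
        header = next(lines, None)
        if header is None:  # clean end of file
            break
        seq = next(lines, "")
        next(lines, "")  # the '+' separator line, not checked here
        qual = next(lines, "")
        record_number += 1
        if len(seq) != len(qual):
            read_id = header[1:].split(" ")[0]
            yield record_number, read_id, len(seq), len(qual)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_fastq.py <file.fastq or file.fastq.gz>
    with open_fastq(sys.argv[1]) as fh:
        for number, read_id, seq_len, qual_len in find_bad_records(fh):
            print(f"record {number} ({read_id}): seq={seq_len} qual={qual_len}")
```

If this reports a single mismatch followed by many more, the file was most likely truncated or damaged in transfer, and a fresh download is the simpler fix than trying to repair it.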