STAR aligner: FATAL ERROR about quality string length appears incorrect


2.9 years ago

Mike ▴ 20

Hi, I downloaded a set of RNA-seq reads (mostly single-end) and tried to align them to a reference genome with the STAR aligner, without any trimming.

Initially I got this error:

EXITING because of FATAL ERROR in reads input: quality string length is not equal to sequence length
@SRR9434783.1              
              TGGGAAATGACCCTCC..............

So I checked my fastq.gz file to see whether the read had a sequence whose length does not match its quality string.

Running zgrep -B4 -A8 "@SRR9434783.1" SRR9434783_1.fastq.gz returned a lot of matches. Here is the first one:

[Screenshot of the zgrep output; image not included]

The sequence and the quality string have the same length (701).
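Checking one record by eye can miss a problem elsewhere in the file. Note also that the zgrep pattern is unanchored and "." matches any character, so "@SRR9434783.1" also matches headers such as @SRR9434783.10 or @SRR9434783.100, which is probably why so many results came back. Below is a minimal sketch of a whole-file check. It assumes strict four-line FASTQ records (no wrapped sequence lines); the function name find_length_mismatches is made up for illustration and is not part of STAR or any other tool.

```python
import gzip
from itertools import islice


def find_length_mismatches(path, limit=10):
    """Return up to `limit` tuples (record_number, header, seq_len, qual_len)
    for records whose sequence and quality strings differ in length.

    Assumption: every record occupies exactly four lines.
    """
    mismatches = []
    # Open gzipped or plain-text FASTQ transparently, based on the extension.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        record_number = 0
        while True:
            record = list(islice(handle, 4))
            if not record:
                break  # clean end of file
            record_number += 1
            if len(record) < 4:
                # Truncated final record: report it with unknown lengths.
                mismatches.append(
                    (record_number, record[0].rstrip("\r\n"), None, None)
                )
                break
            header, seq, _plus, qual = (line.rstrip("\r\n") for line in record)
            if len(seq) != len(qual):
                mismatches.append((record_number, header, len(seq), len(qual)))
                if len(mismatches) >= limit:
                    break
    return mismatches


# Example call (file name taken from the post above):
# for hit in find_length_mismatches("SRR9434783_1.fastq.gz"):
#     print(hit)
```

An empty result means every record read has matching lengths under the four-line assumption, in which case the error is more likely to come from how the aligner parses the file than from the file itself.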

I downloaded the SRX entries from NCBI. Is there a solution?

Thanks a lot.

fastq star • 1.1k views

ADD COMMENT • link updated 18 months ago by Ram 44k • written 2.9 years ago by Mike ▴ 20


Your FASTQ data look unusual. The quality characters do not appear to follow the widely used Phred+33 encoding, so STAR may not be parsing the quality line correctly. Also, as far as I know, a 700 bp read is too long for a typical next-generation sequencing platform such as Illumina. I suspect this is not typical NGS data, which would explain why STAR cannot handle it.
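Both points can be checked directly on the file. The sketch below is a rough heuristic rather than a definitive test: it reads the first records of a four-line FASTQ file and reports the longest read and the range of quality-character codes. The function name summarize_fastq and the max_records default are assumptions chosen for illustration. The encoding guess relies only on the fact that character codes below 59 cannot occur in the older +64 encodings.

```python
import gzip
from itertools import islice


def summarize_fastq(path, max_records=10000):
    """Summarize read length and quality-character codes for the first
    `max_records` records of a four-line FASTQ file (plain or gzipped)."""
    opener = gzip.open if str(path).endswith(".gz") else open
    records = 0
    max_read_length = 0
    min_code = None
    max_code = None
    with opener(path, "rt") as handle:
        while records < max_records:
            record = list(islice(handle, 4))
            if len(record) < 4:
                break  # end of file or truncated final record
            seq = record[1].rstrip("\r\n")
            qual = record[3].rstrip("\r\n")
            records += 1
            max_read_length = max(max_read_length, len(seq))
            for char in qual:
                code = ord(char)
                min_code = code if min_code is None else min(min_code, code)
                max_code = code if max_code is None else max(max_code, code)

    # Heuristic classification of the quality encoding.
    if min_code is None:
        guess = "unknown (no quality characters read)"
    elif min_code < 33 or max_code > 126:
        guess = "invalid (characters outside printable ASCII 33-126)"
    elif min_code < 59:
        guess = "Phred+33"
    else:
        guess = "ambiguous (Phred+64 or high-quality Phred+33)"

    return {
        "records": records,
        "max_read_length": max_read_length,
        "min_quality_code": min_code,
        "max_quality_code": max_code,
        "encoding_guess": guess,
    }


# Example call (file name taken from the question):
# print(summarize_fastq("SRR9434783_1.fastq.gz"))
```

A maximum read length in the hundreds of bases points to long-read data, whatever the encoding guess says.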

ADD REPLY • link 2.9 years ago by shiyang_bio ▴ 170


You were right. STAR's author replied that it does not work well with long reads and suggested using minimap2 or STARlong instead.

ADD REPLY • link 2.9 years ago by Mike ▴ 20
