Question

RNA-seq: Read Length Different From Expected

2

11.9 years ago

thiesgehrmann ▴ 40

Hello all,

I have received paired-end reads for 40 samples. The reads are supposed to be 100bp per end. Instead, 20 of my samples are 101bp per end, while the other 20 are 100bp as expected.

Because of this, we assume that these 20 samples were all in the same lane, and that an extra cycle was accidentally run during the Illumina sequencing. However, we also see a strong negative correlation between read length and quality: the samples with 101bp per end lose about 30% of their reads in the Trimmomatic quality-filtering step.

My question is: Has this happened to anyone else? Does it occur often, and if so, does it often affect quality? We are really quite puzzled by this.

Any ideas / clues appreciated.

Thank you! -Thies Gehrmann

rnaseq quality • 3.7k views

ADD COMMENT • link updated 11.9 years ago by venks ▴ 740 • written 11.9 years ago by thiesgehrmann ▴ 40

0

Did you check if the first or last letter is the same for every read?
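A minimal sketch of that check, assuming plain 4-line FASTQ records (no wrapped sequence lines). The function names and the tiny in-memory records are made up for illustration. It tallies read lengths together with the first and last base of every read, so a constant extra base at either end would stand out.

```python
import gzip
import io
from collections import Counter


def fastq_sequences(handle):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 1:
            yield line.strip()


def summarize_ends(handle):
    """Count read lengths, first bases and last bases across all reads."""
    lengths, first, last = Counter(), Counter(), Counter()
    for seq in fastq_sequences(handle):
        if not seq:
            continue  # skip empty reads
        lengths[len(seq)] += 1
        first[seq[0]] += 1
        last[seq[-1]] += 1
    return lengths, first, last


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


# Made-up records standing in for a real file.
demo = io.StringIO(
    "@r1\nACGTA\n+\nIIIII\n"
    "@r2\nTTGCA\n+\nIIIII\n"
    "@r3\nGGCA\n+\nIIII\n"
)
lengths, first, last = summarize_ends(demo)
print("lengths:", dict(lengths))
print("first bases:", dict(first))
print("last bases:", dict(last))
```

For a real sample, something like `summarize_ends(open_fastq("sample_R1.fastq.gz"))` would apply, where the file name is a placeholder. If one base accounts for nearly all first or last positions in the 101bp samples, the extra cycle probably carries no real sequence.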

ADD REPLY • link 11.9 years ago by Raony Guimarães ★ 1.4k

score 1 · Answer 1 · 2013-06-17

1

11.9 years ago

venks ▴ 740

I had Illumina 1.9 FASTQ data with read lengths of 101bp. If that length is correct and is what @thiesgehrmann expects, then I guess the low read quality in the last 30bp might be due to the universal adapters that are added if the fragment size is small.
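One way to test this idea is to compare the mean quality of the last 30 bases with the rest of each read, and to look for the adapter inside the reads. The sketch below makes two assumptions: Phred+33 quality encoding, and the 13-base prefix `AGATCGGAAGAGC` that FastQC lists as the "Illumina Universal Adapter". Check the adapter against your own library prep kit. The helper names and demo data are hypothetical.

```python
# Assumption: adapter prefix reported by FastQC as "Illumina Universal Adapter".
ADAPTER_PREFIX = "AGATCGGAAGAGC"


def phred_scores(qual, offset=33):
    """Convert a FASTQ quality string to Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


def head_tail_means(qual, tail=30, offset=33):
    """Return (mean quality before the tail, mean quality of the tail).

    Returns None for reads that are not longer than the tail.
    """
    scores = phred_scores(qual, offset)
    if len(scores) <= tail:
        return None
    head, end = scores[:-tail], scores[-tail:]
    return sum(head) / len(head), sum(end) / len(end)


def adapter_start(seq, adapter=ADAPTER_PREFIX):
    """Return the 0-based position of the adapter in the read, or -1."""
    return seq.find(adapter)


# Made-up 101 bp read: adapter read-through after a 60 bp insert,
# with quality collapsing over the last 30 bases.
demo_seq = "A" * 60 + ADAPTER_PREFIX + "C" * 28
demo_qual = "I" * 71 + "#" * 30

print(head_tail_means(demo_qual))
print(adapter_start(demo_seq))
```

If many reads in the 101bp samples show the adapter starting well before the read end, short fragments are a plausible cause of the reads lost at trimming.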

Good luck

ADD COMMENT • link 11.9 years ago by venks ▴ 740

score 0 · Answer 2 · 2013-06-17

0

11.9 years ago

Istvan Albert 102k

I believe that the sequencing instrument can be operated only for sequencing lengths that are increments of 50bp. So no 101bp sequencing seems possible.

One can however run the bcl->fastq conversion with arbitrary parameters.

Is it possible that someone overrode the information in the setup file and ran these with incorrect barcode lengths? I could see how someone using an automated script that generates other conversion scripts might have entered a barcode one base too short, which in turn affects the other lengths.

ADD COMMENT • link 11.9 years ago by Istvan Albert 102k

0

I guess I'll have to call them up! Thank you!

ADD REPLY • link 11.9 years ago by thiesgehrmann ▴ 40

0

The 50bp-increment rule holds for HiSeq, but not for e.g. the GAII, which does 36, 75 and 100 (http://www.biotech.wisc.edu/facilities/dnaseq/sequencing/Illumina). However, I agree with the basic point that 101 seems unlikely :)

ADD REPLY • link 11.9 years ago by Jelena Aleksic ▴ 920