RNA-Seq: Difference in Read Quality Pattern Between Illumina GA and HiSeq 2000?
13.2 years ago
Bio_X2Y ★ 4.4k

In the past, we've used an Illumina GA for our RNA-seq experiments. In general, we noticed that the reported quality of the read bases was highest at the 5' end of each read, and the quality dropped gradually towards the 3' end (as per the FASTQ files). This is what we expected.

Recently, however, we've received an RNA-seq dataset generated from a HiSeq 2000, and notice a different pattern. The 5' bases have a high quality, but the quality actually improves in the 3' direction until about base 20 (out of 90), and then drops gradually.

Can someone perhaps comment on whether this alternative pattern is just a harmless artifact of the HiSeq 2000, or if it should be a cause for concern?

Thanks.

rna hiseq illumina quality • 4.0k views

Just wanted to add that we've seen the same pattern: something like an upside-down smile (a.k.a. a frown), where the scores for roughly bases 1-4, 5-9, and 10-14 increase in a step-like fashion, followed by a "normal" Phred-like distribution with a gradual, slight decrease in scores towards the 3' end. We're doing 50 bp runs, and the median score out at base 50 is still ~36 (out of 40), so all in all it's still quite good for us.
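For a sense of scale: a Phred score Q corresponds to an estimated error probability of 10^(-Q/10), so a median of ~36 is roughly one expected error in 4,000 bases. A minimal sketch of that conversion (the function name is just illustrative):

```python
def phred_to_error_probability(q):
    """Convert a Phred quality score to its estimated per-base error probability."""
    return 10 ** (-q / 10)

# Print the error probability for a few typical scores.
for q in (20, 30, 36, 40):
    print(q, phred_to_error_probability(q))
```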


@steve: We also see a similar pattern: bases 1-3, 4-8, and 9-10 increase stepwise, then there is a gradual increase up to 50-60 bp, followed by a slow decrease towards the 3' end. We are running 104 bp reads, but overall the read qualities are good (median scores > 32).

13.2 years ago

Illumina changed the quality prediction in HCS 1.4 (RTA 1.12) to better model error rates at the 5' ends of the sequence. This tech note describes the change (I couldn't find it on the Illumina website, so the link is to my Dropbox):

http://dl.dropbox.com/u/6634542/RTA_Quality_Predictors_TechNote.pdf

Page 11 of the RTA Theory of Operations tech note has additional useful details:

http://www.illumina.com/Documents/products/technotes/technote_rta_theory_operations.pdf

So the new software is attempting to better model the underlying error rates; the pattern does not reflect a fundamental change in 5' sequence quality on the HiSeq.
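If you want to check your own data for this pattern, here is a minimal sketch that computes the median quality score at each read position. It assumes plain four-line FASTQ records (no wrapped sequence or quality lines) and a Phred+33 encoding by default; older Illumina pipelines wrote Phred+64, so pass `offset=64` for those files. The function name and the toy reads are made up for illustration.

```python
import io
from statistics import median

def per_position_median_quality(handle, offset=33):
    """Return the median Phred score at each read position of a FASTQ stream.

    Assumes 4-line records; `offset` is 33 (Sanger / newer Illumina)
    or 64 (older Illumina pipelines).
    """
    columns = []  # columns[i] collects the scores seen at read position i
    for line_number, line in enumerate(handle):
        if line_number % 4 != 3:  # the quality string is the 4th line of each record
            continue
        for i, char in enumerate(line.rstrip("\r\n")):
            if i == len(columns):
                columns.append([])
            columns[i].append(ord(char) - offset)
    return [median(scores) for scores in columns]

# Three made-up 4-base reads, Phred+33 encoded.
example = io.StringIO(
    "@r1\nACGT\n+\n5?II\n"
    "@r2\nACGT\n+\n5?IG\n"
    "@r3\nACGT\n+\n7AIE\n"
)
profile = per_position_median_quality(example)
print(profile)
```

For a real file, pass an open file handle instead of the `StringIO` object. Plotting the returned list against position should show the step-wise rise over the first bases followed by the gradual decline described above.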
