Calculating Average Read Quality
12.4 years ago
hadasa ★ 1.0k

I would like to calculate the accuracy of 454 reads. My current approach is to sum the quality scores for each read and divide by the read length. This obscures regions within a read that have particularly low or high quality scores (at the moment that might not matter). Is this a good approach for estimating the accuracy of a particular read? What would you suggest as a better alternative?
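For concreteness, the calculation described above might look like the following minimal sketch. It assumes the per-base qualities are already available as a list of integer Phred scores; the function names and the example read are hypothetical. The second function is an assumption beyond the original approach: it uses the standard Phred relation p = 10^(-Q/10) to average error probabilities rather than scores, which is one way to turn quality values into an accuracy-like number.

```python
def mean_quality(quals):
    """Arithmetic mean of per-base Phred scores: sum divided by read length."""
    return sum(quals) / len(quals)


def mean_error_probability(quals):
    """Mean per-base error probability, using the Phred relation p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in quals) / len(quals)


# Hypothetical read: eight bases at Q30 followed by a short Q10 tail.
quals = [30] * 8 + [10, 10]

print(mean_quality(quals))            # average Phred score of the read
print(mean_error_probability(quals))  # average chance that a base is wrong
```

The two numbers can disagree noticeably when a read mixes very high and very low scores, because the Phred scale is logarithmic.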

sequencing read quality 454 • 4.5k views
12.4 years ago
Arun 2.4k

One method I can think of is winsorisation, which reduces the mean's sensitivity to extreme values. You can find an implementation of a winsorisation function in R here.

You could also have a look at the RNA-Seq scaling normalization paper by Robinson et al., which implements TMM (trimmed mean of M-values, i.e. a mean computed after discarding x% of the extreme values). The basic idea is the same, but the treatment is rigorous enough to give some perspective. Other than that, I believe a simple winsorisation should be sufficient. The median is another alternative.
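As a rough illustration of the alternatives mentioned above, here is a minimal Python sketch of a winsorised mean, a trimmed mean, and the median applied to one read's quality scores. This is not the R implementation linked above and not the TMM procedure from the paper; the function names, the `fraction` parameter, and the example read are assumptions made for illustration.

```python
import statistics


def _tail_count(n, fraction):
    """Number of values to treat as extreme at each end of the sorted data."""
    if not 0 <= fraction < 0.5:
        raise ValueError("fraction must be in [0, 0.5)")
    return int(n * fraction)


def winsorized_mean(values, fraction=0.1):
    """Replace the lowest and highest `fraction` of values with the nearest
    remaining value, then take the ordinary mean."""
    s = sorted(values)
    k = _tail_count(len(s), fraction)
    low, high = s[k], s[-k - 1]
    return statistics.mean(min(max(v, low), high) for v in s)


def trimmed_mean(values, fraction=0.1):
    """Drop the lowest and highest `fraction` of values, then take the mean."""
    s = sorted(values)
    k = _tail_count(len(s), fraction)
    return statistics.mean(s[k:len(s) - k])


# Hypothetical read with one very low and one very high score.
quals = [2, 30, 30, 30, 30, 30, 30, 30, 30, 40]

print(statistics.mean(quals))    # plain mean, pulled down by the Q2 base
print(winsorized_mean(quals))    # extremes clamped to the neighbouring values
print(trimmed_mean(quals))       # extremes discarded
print(statistics.median(quals))  # middle value
```

Winsorising keeps the read length in the denominator, while trimming shrinks it, so the two can differ slightly on short reads.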
