How Does TopHat Deal With Low-Quality Bases?
12.4 years ago
pinkiii1984v ▴ 20

Hi,

I am working with paired-end (PE) RNA-seq data. Some of my samples have poor-quality bases at the end of the reverse reads. How does TopHat deal with such reads? Are these bases clipped while mapping? And how does that affect the mapping quality?

Thank you

tophat rna-seq • 4.4k views

AFAIK TopHat does not perform soft or hard clipping, so you'll get better read mapping if you clip the low-quality bases yourself. The way I do it is to clip from the end of the read and keep the read if the clipped length is >= 50 bases; otherwise I remove it.
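A minimal sketch of that clip-then-filter step, not any particular tool's implementation. The function names, the quality cutoff of 20, and the Phred+33 encoding are assumptions for illustration; only the 50-base minimum length comes from the description above. Records are plain `(name, sequence, quality_string)` tuples.

```python
def trim_3prime(seq, qual, min_q=20, offset=33):
    """Strip trailing bases whose Phred score is below min_q.

    Assumes Phred+33 encoded quality strings (an assumption, check your data).
    """
    end = len(seq)
    # Walk back from the 3' end until a base meets the quality cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def trim_and_filter(records, min_q=20, min_len=50, offset=33):
    """Yield trimmed (name, seq, qual) records that keep at least min_len bases."""
    for name, seq, qual in records:
        seq, qual = trim_3prime(seq, qual, min_q, offset)
        if len(seq) >= min_len:
            yield name, seq, qual


if __name__ == "__main__":
    # 'I' is Phred 40 and '#' is Phred 2 under Phred+33.
    reads = [
        ("r1", "A" * 60 + "C" * 10, "I" * 60 + "#" * 10),  # 60 bases survive
        ("r2", "A" * 40 + "C" * 30, "I" * 40 + "#" * 30),  # 40 bases, dropped
    ]
    for name, seq, qual in trim_and_filter(reads):
        print(name, len(seq))
```

With paired-end data, dropping a read from one file without handling its mate leaves the two files out of sync, so in practice a pair-aware trimming tool is the safer choice.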

12.4 years ago

As Arum puts it (and that should be an answer rather than a comment ;-) ), very few tools use qualities directly. Rightly so, I might add, since the way base quality measures are generated lacks a proper foundation, at least with respect to the numerical probabilities they stand for.

Note that a good base quality of 40 implies a one in 10,000 chance of the base being wrong, yet just about all sequencing platforms introduce about 1 miscall per 100 bases. Trimming reads back from their ends prior to processing is the most common approach.
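To make the mismatch concrete, here is a small sketch of the standard Phred relation Q = -10 * log10(p). The function names are made up for illustration. A quality of 40 corresponds to an error probability of 1 in 10,000, while an observed rate of about 1 miscall per 100 bases corresponds to a quality of only about 20.

```python
import math


def phred_to_error_prob(q):
    """Convert a Phred quality score to the error probability it claims."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Convert an error probability to the equivalent Phred quality score."""
    return -10 * math.log10(p)


if __name__ == "__main__":
    print(phred_to_error_prob(40))    # claimed error rate for a Q40 base
    print(error_prob_to_phred(0.01))  # quality matching 1 miscall per 100 bases
```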

12.4 years ago
pinkiii1984v ▴ 20

If I don't trim the poor-quality bases, how does TopHat deal with them?


Please add your contributions as a follow-up comment rather than as a new answer. As for your question: the quality string is simply ignored.


If TopHat ignores the quality string, will that affect the mapping quality?


I think (but you should check this with the developers) that it ignores the quality during the alignment procedure but then makes use of it when the mapping quality is computed. At least this is what maq/bwa does: http://maq.sourceforge.net/qual.shtml. And now just a personal opinion: in general, don't read too much into the qualities. As values they are rough approximations, with an accuracy far lower than what is implied.
