How can I get the maximum number of uniquely aligned reads from TopHat by allowing mismatches only conditionally?
10.3 years ago
trakhtenberg ▴ 160

When identifying uniquely aligned reads, I assume that if TopHat is set to allow mismatches, a read with a single perfect alignment may be tagged as having multiple alignments, because additional locations become acceptable once mismatches are tolerated. On the other hand, if TopHat is set to disallow mismatches entirely, reads whose only alignment contains one mismatch will be excluded. Is it possible to set TopHat parameters so that one mismatch is allowed only if a read has 0 alignments, two mismatches are allowed only if that still yields 0 alignments, and so on (up to some maximum number of accepted mismatches, x)? Or is this best accomplished after alignment, by filtering the output files (e.g., by alignment quality scores) before passing them to Cufflinks? Either way, how can this be done? Thanks.

RNA-Seq • 3.7k views
10.3 years ago

By default, TopHat reports the best (primary) alignments, ranked by alignment score (AS). So even if more than one alignment fulfills the mapping criteria given by the user, it will report only the best one (or the best ones, if several tie for the top score). In other words, use the default settings during alignment, and TopHat should be smart enough to report the alignments you are looking for. If you set very stringent parameters, such as allowing no mismatches, you will lose a lot of valid alignments. Stick with the defaults unless you know how TopHat works.
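To make the point concrete, here is a minimal sketch of why best-score reporting already behaves like the tiered scheme in the question. It is a toy model, not TopHat's code: the mismatch count stands in for the alignment score, and the function names and loci are made up for illustration.

```python
def best_tier(hits):
    """Return (fewest_mismatches, loci) for the best-scoring tier.

    hits is a list of (locus, mismatches) candidates for one read.
    Toy model: fewer mismatches is treated as a better alignment score.
    """
    if not hits:
        return None, []
    fewest = min(mm for _, mm in hits)
    return fewest, [locus for locus, mm in hits if mm == fewest]


def is_unique(hits):
    """A read counts as unique if exactly one locus sits in its best tier."""
    _, loci = best_tier(hits)
    return len(loci) == 1


# Hypothetical candidates for three reads.
perfect_plus_worse = [("chr1:100", 0), ("chr2:500", 1)]  # perfect hit wins
only_one_mismatch = [("chr3:10", 1)]                     # kept despite the mismatch
tied = [("chr4:7", 1), ("chr5:9", 1)]                    # a real multi-aligner

for hits in (perfect_plus_worse, only_one_mismatch, tied):
    print(best_tier(hits), is_unique(hits))
```

In this model, the read with a perfect hit is not demoted by its one-mismatch alternative, and the read whose only hit has a mismatch is still kept, which is the behaviour the question asks for.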

So accepted_hits.bam will contain the single best alignment for each read, and multiple alignments only when they share the same AS? Thank you.

Yes. If more than one alignment has the best AS (alignment score), TopHat2 will report all of them. However, the default for --max-multihits is 20, which means that if 30 alignments all share the best AS, TopHat2 will report 20 of them, chosen at random. If only 5 alignments share the best AS, all of them will be reported. Alignments with the second-best AS won't be reported unless you use the --report-secondary-alignments option.
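The reporting rule described here can be sketched in a few lines. This only models the behaviour as stated in this reply (keep the alignments tied for the best AS, and sample at random when there are more than the cap); it is not taken from TopHat2's source, and `reported_alignments` and its `seed` argument are hypothetical names.

```python
import random


def reported_alignments(alignments, max_multihits=20, seed=None):
    """Model of best-AS reporting with a cap on the number of hits.

    alignments is a list of (locus, alignment_score); higher scores are better.
    Only alignments tied for the best score are kept. If more than
    max_multihits are tied, a random subset of that size is returned.
    """
    if not alignments:
        return []
    best = max(score for _, score in alignments)
    top = [aln for aln in alignments if aln[1] == best]
    if len(top) <= max_multihits:
        return top
    return random.Random(seed).sample(top, max_multihits)


# 30 alignments tied for the best score: only 20 are reported.
many = [(f"locus{i}", 0) for i in range(30)]
print(len(reported_alignments(many, seed=1)))

# 5 tied for the best score plus one worse alignment: the 5 are reported.
few = [(f"locus{i}", 0) for i in range(5)] + [("worse", -6)]
print(len(reported_alignments(few)))
```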

It's all clear now, thank you, but I have a follow-up question. When a MAPQ score of 4 (single best alignment) is assigned, does it take both reads of the pair into account? That is, if each read is a multi-aligner on its own but the pair aligns uniquely, would each of the two reads get MAPQ 4?
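One way to check this on your own data is to tabulate, for each read name, the MAPQ and hit count recorded for each mate. The sketch below works on plain SAM text lines and assumes the aligner writes the standard NH:i tag (number of reported alignments); `mate_summary` is a hypothetical helper, and the two example records use made-up values purely to show the layout.

```python
def mate_summary(sam_lines):
    """Map read name -> {mate number: [(MAPQ, NH), ...]} from SAM text lines.

    Mate number is 1 or 2 from FLAG bits 0x40 / 0x80 (0 if neither is set).
    NH is None when the record carries no NH:i tag.
    Header lines and unmapped records (FLAG 0x4) are skipped.
    """
    summary = {}
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, mapq = fields[0], int(fields[1]), int(fields[4])
        if flag & 0x4:
            continue
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        nh = None
        for tag in fields[11:]:
            if tag.startswith("NH:i:"):
                nh = int(tag[5:])
        summary.setdefault(name, {}).setdefault(mate, []).append((mapq, nh))
    return summary


# Made-up records for one read pair (values are illustrative only).
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t99\tchr1\t100\t50\t50M\t=\t300\t250\t*\t*\tNH:i:1",
    "read1\t147\tchr1\t300\t50\t50M\t=\t100\t-250\t*\t*\tNH:i:1",
]
print(mate_summary(example))
```

To run it on real output, the lines can come from `samtools view accepted_hits.bam`. If both mates of a pair show a single hit while the same reads aligned as single-end show several, that would suggest the pair is being taken into account.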
