Tophat Rna-Seq Mapping Quality
3
1
Entering edit mode
11.7 years ago
siyu ▴ 150

if a read can be mapped to more than one location in reference genome,so-called mulit-read , what's the read's mapping quality reported by TopHat ??

how about BWA?

thanks so much?

tophat rna-seq mapping quality • 12k views
ADD COMMENT
1
Entering edit mode
11.3 years ago
predeus ★ 2.1k

This actually does not seem to hold upon checking (or at least not anymore). In Tophat 1.4.1, there is no mapping quality of 2. I've looked at a large number of BAM files and it seems that the qualities are as follows:

  • 255 = unique mapping
  • 3 = maps to 2 locations in the target
  • 1 = maps to 3-4 locations
  • 0 = maps to 5 or more locations (up to the number defined in "--prefilter-multihits").

cheers

ADD COMMENT
0
Entering edit mode

Interesting, thanks for doing the legwork!

ADD REPLY
0
Entering edit mode
11.7 years ago

For Tophat:

From http://seqanswers.com/forums/showthread.php?t=10624:

255 = unique mapping
3 = maps to 2 locations in the target
2 = maps to 3 locations
1 = maps to 4-9 locations
0 = maps to 10 or more locations.

I think this is correct, it fits with what I have observed.

ADD COMMENT
0
Entering edit mode

Thank you so much !

ADD REPLY
0
Entering edit mode
11.0 years ago
zju.whw ▴ 70

My tophat output bam file is using 50 to indicate the unique mapping, is there any official materials about the mapping quality of unique mapped reads?

In tophat website, it says that "most of the optional SAM fields (AS, MD, NM, and etc.) generated by Bowtie 2 are now reported by TopHat as well (reconstructed as necessary)".

And in bowtie2 website, it says that "Mapping quality: higher = more unique".

However, I cannot find the exact mapping quality number for unique mapped reads

ADD COMMENT
0
Entering edit mode

As mentioned above the unique mapping should be scored 255. I don't know why your BAM file has score of 50 for unique alignments. Did you generate this BAM file or you got it from someone. Sometimes people cap the maximum mapping quality to 50 because some downstream analysis softwares throw error if they find a mapping score of 255.

ADD REPLY
0
Entering edit mode

I don't know why. The BAM file is got by myself, using the default tophat paramters. However, I have not found any side-effect so far. Thank you for your reply.

ADD REPLY
0
Entering edit mode

I've never fully trusted the mapping quality scores, so when I am interested in uniquely mapped reads I filter based on the SAM tags. I know that for TopHat v1.4.1 onwards (I haven't looked at earlier versions), the number of mappings for each read is reported in the "NH" tag, so you can write a simple script to only keep reads with this tag set to 1.

ADD REPLY
0
Entering edit mode

Yes, you're right. There are at least two ways to get the uniquely mapped, NH:i:1 tag and mapping quality scores. I have made a mistake that I thought the NH:i:1 is in a constant column. But it is actually not. So it takes more time to identify the column number of the NH:i tag. And the bam file should be conversed to sam file, and converse back after filtering. On the other hand, you can use samtools view -bq to get the unique mapped reads. It is faster and easier. What is your scripts? What is your opinion

ADD REPLY
0
Entering edit mode

I agree that you can get uniquely reads quicker using samtools, assuming that you are confident in the mapping qualities. I was looking at metrics beyond counting/extracting the uniquely aligned reads, and I was using output from several different aligners, so I wrote a perl script for this task. This script is available as part of a software package at http://sourceforge.net/projects/rnaseqvariantbl/ (see extract_unique.pl).

ADD REPLY
0
Entering edit mode

thank you very much

ADD REPLY

Login before adding your answer.

Traffic: 2009 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6