Bwa Mapping Qualities Of 0
11.1 years ago
huskerjeff492 ▴ 160

I have some reads from a targeted capture kit to which bwa assigns a mapping quality of 0. I am fairly sure the reads are mapped correctly, because they do map to the captured gene. Also, when I BLAST the 101bp read, it only maps to the gene it should map to. The problem comes when I call variants, because GATK throws these reads out when the mapping quality is 0. I could try using -rf ReassignOneMappingQuality -RMQF 255 -RMQT 60 with the UnifiedGenotyper, but I don't think this is ideal.

What I would really like is a more informative mapping quality score, since I'm pretty sure the read is mapped correctly. However, I can't find documentation on the web about mapping qualities of 0.

Since my BLAST search returned only one position for my 101bp read, I am assuming the mapping quality doesn't really come from the entire read. Perhaps the seed is the main driver of the mapping quality, and the seed maps to multiple locations in the genome while the entire read does not? Does anyone know whether increasing the seed size would help? Can I even increase the seed size with bwa sampe?

Thanks for the help

bwa • 17k views

I should have stated that I realize a mapping quality of 0 means the read maps to multiple locations, but I don't think that is true here: when I BLAST the read via NCBI, it only maps to my captured gene, not to multiple genomic locations. I realize I am not BLASTing against my actual reference sequence, hg19, but that shouldn't really matter, because a very close match elsewhere should have come up in the NCBI BLAST. That is why I was asking about the seed. It doesn't make sense to me that my sequence would not hit multiple locations in BLAST.


See the XA tag in your SAM alignment (last column). Does it list the alternative locations that the read was matched against?


Also, feel free to add the read sequence to your post; it could be an interesting case.


Here is the sequence

GGCATGGATGGTCCCCAGGGCCCCAAAGGGAGCTTGGTGAGTGATGGATAGGAGATCCCACCCCCATTCTTATCCCCCGAGGTCCCTGCCAGCATCCTGTTGGCCGCCATTTTGACCCCTGCCTGCTTCCTGCTCTTGCCTTCTTGGCTAT

Here is the SAM record. As you can see, the sequence is the reverse complement.

M00386:31:000000000-A4L4V:1:1109:13695:12995    83      chr6    33145324        0       151M    =       33145323        -152    ATAGCCAAGAAGGCAAGAGCAGGAAGCAGGCAGGGGTCAAAATGGCGGCCAACAGGATGCTGGCAGGGACCTCGGGGGATAAGAATGGGGGTGGGATCTCCTATCCATCACTCACCAAGCTCCCTTTGGGGCCCTGGGGACCATCCATGCC    :9;<<<:<==?ADFDEEBGGFEEEFGHEEGGDDEDE1HEEECEDF@EGEBEDGFDDCDFFFEGHED?C3DECACCEECEBDFEDDCCCCCGEEDEACECFEBBCEFDAE>ECEACFCDDDBCEDCACBBCEDDEDCCCBCBFCCCFCC?AB    X0:i:3  X1:i:4  BD:Z:MMNJNMJKKJJLNMKLKOOOLKLKOONLMONKHHLNKMDDLMMLLJLKNMMLNKKKMLONMLONKHKOLLKNJHHHKKNJKLKLMMHHHLKMHKKKJKKLMNKLNMKKKPKKKKNMKOOKKHLKDKMHHLKHLNMHHKPLONLMPPPLJJJ       MD:Z:151   PG:Z:MarkDuplicates     RG:Z:Default    XG:i:0  BI:Z:NKMKPNKIGEDFIJKNOQQQOOPPQRPOPQPNMMPQQPLLQQOPMMPMPPPQPOOOQOQPOPPPMLNQONMNLMLLNOQOONNPPOLLLONNLNNNLLMNNOMMOPMOPQLOONNNNOPLLKMNKNNKKNLKNONKKNPOOPMMOQQMKLL       AM:i:0  NM:i:0  SM:i:0  XM:i:0  XO:i:0  MQ:i:0  XT:A:R

I don't see the XA tag, but if I am following the SAM spec correctly, the read does have a mapping quality of 0.
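For anyone who wants to check the reverse-complement observation, here is a minimal Python sketch. It uses only the first 11 bases of the read as posted and the last 11 bases of SEQ in the SAM record, and it also decodes the FLAG value 83, whose 0x10 bit marks a read aligned to the reverse strand under the SAM spec. The helper names are made up for illustration and are not part of bwa or samtools.

```python
# Illustrative helpers only; these names are assumptions, not a bwa/samtools API.

# Translation table mapping each base to its complement (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def is_reverse_strand(flag: int) -> bool:
    """True if SAM FLAG bit 0x10 (read aligned to the reverse strand) is set."""
    return bool(flag & 0x10)


# First 11 bases of the read as posted, and last 11 bases of SEQ in the SAM record.
read_start = "GGCATGGATGG"
sam_seq_end = "CCATCCATGCC"

print(reverse_complement(read_start) == sam_seq_end)  # True
print(is_reverse_strand(83))                          # True: 83 = 64 + 16 + 2 + 1
```

So the SEQ column being the reverse complement of the original read is expected for this record and is not a sign of a mapping problem in itself.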


It does have the XT:A:R tag, which indicates a read with multiple alignments (see "ignore reads in bwa").

It should also have an XA tag that actually lists these positions, but it seems that you have passed this data through other steps and it is not direct output from bwa.
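To make the tags easier to inspect, here is a rough Python sketch that pulls MAPQ and the optional tags out of one SAM line. The example line is an abbreviated version of the posted record (read name shortened, SEQ and QUAL replaced by `*`, only a few tags kept), and the function name is hypothetical. It assumes a tab-delimited line, as in a real SAM file; the record pasted above has lost its tabs. In the bwa manual, X0 is the number of best hits, X1 the number of suboptimal hits, and XT the alignment type (U for unique, R for repeat).

```python
def parse_sam_record(line: str):
    """Return (mapq, tags) for one tab-delimited SAM alignment line.

    tags maps each optional tag name to its value; integer-typed ("i")
    values are converted to int, everything else is kept as a string.
    """
    fields = line.rstrip("\n").split("\t")
    mapq = int(fields[4])               # column 5 of the SAM spec is MAPQ
    tags = {}
    for field in fields[11:]:           # optional fields start at column 12
        tag, typ, value = field.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return mapq, tags


# Abbreviated version of the record posted above (an assumption for brevity).
record = "\t".join([
    "read1", "83", "chr6", "33145324", "0", "151M", "=", "33145323", "-152",
    "*", "*", "X0:i:3", "X1:i:4", "XT:A:R", "NM:i:0",
])

mapq, tags = parse_sam_record(record)
print(mapq)            # 0
print(tags["X0"])      # 3
print(tags["XT"])      # R
print("XA" in tags)    # False
```

On this record X0 is 3, which suggests bwa found three equally good hits, and that is consistent with both the XT:A:R tag and the mapping quality of 0.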


Hello huskerjeff492, a couple of notes:

(1) By default, bwa mem outputs the XA tag only if there are fewer than 5 hits (this can be changed with the -h parameter). It is impossible to list all the alignments if there are thousands of hits.

(2) This sequence really does seem to have only one hit in the human genome. However, I'm not sure whether this is the raw bwa output. You should go back to the raw data, take a batch of reads with MapQ=0 (for example, 100 or 1000), and BLAST them. I tried ~160 reads with MapQ=0 from my own BAM file, and all of them had more than 5 hits.
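Before BLASTing a batch, it may help to see how the MapQ=0 reads are distributed by number of best hits. Below is a small Python sketch that tallies the X0 tag over mapped reads with MAPQ 0. It assumes plain-text, tab-delimited SAM lines (for instance, a BAM file converted to SAM first); the function name, the `make_line` helper, and the example reads are all made up for illustration.

```python
from collections import Counter


def tally_mapq0(sam_lines):
    """Count mapped reads with MAPQ 0, grouped by their X0 (best hits) value.

    Reads without an X0 tag are counted under the key None.
    """
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):         # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue                     # not a complete alignment line
        flag, mapq = int(fields[1]), int(fields[4])
        if flag & 0x4 or mapq != 0:      # skip unmapped reads and MAPQ > 0
            continue
        x0 = None
        for field in fields[11:]:
            if field.startswith("X0:i:"):
                x0 = int(field[5:])
        counts[x0] += 1
    return counts


def make_line(name, flag, mapq, *tags):
    """Build a synthetic SAM line for the example (not real data)."""
    core = [name, str(flag), "chr6", "100", str(mapq), "10M", "=", "100", "0", "*", "*"]
    return "\t".join(core + list(tags))


lines = [
    "@HD\tVN:1.0",
    make_line("r1", 83, 0, "X0:i:3", "XT:A:R"),
    make_line("r2", 83, 0, "X0:i:7"),
    make_line("r3", 99, 37, "X0:i:1"),   # MAPQ > 0, ignored
    make_line("r4", 4, 0),               # unmapped, ignored
]

print(tally_mapq0(lines))  # Counter({3: 1, 7: 1})
```

If most MapQ=0 reads turn out to have X0 greater than 1, that would support the multiple-hit explanation; reads with X0 equal to 1 and MapQ=0 would be the ones worth looking at more closely.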

11.1 years ago

A mapping quality of zero in bwa means that the read maps to multiple locations with the same quality and that the mapper has picked one of these positions at random.
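For context, the SAM spec defines MAPQ as a Phred-scaled value, -10 log10 of the probability that the mapping position is wrong, with 255 meaning the quality is not available. The sketch below simply inverts that formula; the function name is hypothetical. Taken literally, MAPQ 0 corresponds to an error probability of 1, but as described above it is better read as a sign that the placement is ambiguous than as a precise probability.

```python
def mapq_to_error_prob(mapq: int):
    """Convert a Phred-scaled MAPQ to the implied probability of a wrong mapping.

    Returns None for 255, which the SAM spec reserves for "not available".
    """
    if mapq == 255:
        return None
    return 10 ** (-mapq / 10)


for q in (0, 20, 60, 255):
    print(q, mapq_to_error_prob(q))
```

This is also why reassigning such reads to MAPQ 60 is a strong statement: it tells the variant caller that the placement is wrong only about once in a million reads.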

11.1 years ago
Jordan ★ 1.3k

This is a duplicate of the question "What does the zero mapping quality mean for the BWA mapper?"
