MAPQ (Mapping quality) of 0 for most reads from BWA-MEM2 (with no secondary alignment or other apparent reason)
0
0
Entering edit mode
3.2 years ago
artemd ▴ 20

Hello,

I got a very weird output from BWA-mem2 - most of the reads have mapping quality of 0, even though there is no secondary alignment or anything else suspicious.

I got sequencing data that was aligned with Novoalign to hg18, the data was bam files. I needed to realign the data to hg19, so I extracted two fastq files using "bedtools bamtofastq" and run a standard alignment using bwa mem2. The commands I used were:

samtools sort -n -12 -o in_file.bam.sortedByName in_file.bam
bedtools bamtofastq -i in_file.bam.sortedByName -fq in_reads.r1.fq -fq2 in_reads.r2.fq
~/tools/bwa2/bwa-mem2-2.0pre2_x64-linux/bwa-mem2 mem  -t 12 hg19.fa in_reads.r1.fq in_reads.r2.fq   |\
    samtools sort -12 -o out_file.bam -

When inspecting the original bam file there was hardly any reads with 0 mapping quality, but in the new bam file from bwa mem2 (using the same reads as in the original bam file) most reads had 0 mapping quality for an unknown reason.

This greatly impacts my next analysis steps which is variant calling since it skips all reads with 0 mapping quality.

An output from the new bam file is attached below.

Any help/suggestions would be appreciated.

HWI-ST218:512:C7GP4ANXX:1:2115:10195:61746#GCGGAC       145     chr1    17030   0       5S95M   =       16819   -306    NNNNNGGCACATAGAAGTAGTTCTCTGGGACCTGCAAGATTAGGCAGGGACATGTGAGAGGTGACAGGGACCTGCAGGGGCAGCCAACAAGACCTTGTGT    F</F/FFFFBFB<FFFFFFFFFFBBFFB<BFFFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFF<FFFFFFFFFFFFFFFFFFFBBBBBB    NM:i:0  MD:Z:95 AS:i:95 XS:i:95
HWI-ST218:512:C7GP4ANXX:1:2313:14727:77282#GGAACT       147     chr1    17031   0       8S92M   =       16971   -152    NNNNNNNCGCACATAGAAGTAGTTCTCTGGGACCTGCAAGATTAGGCAGGGACATATGAGAGGTGACAGGGACCTGCAGGGGCAGCCAACAAGACGTTGT    F<7B/7<//BFFF</B/FF<BF<<FFFFFFFFFFFFFFFFFFFFFF<FFFB<<FB/FFFFFFFFFFFFFFFFFFFFF/FFFFFBB</FFFFFFBFBBBB/    NM:i:2  MD:Z:47G39C4    AS:i:82 XS:i:82
HWI-ST218:512:C7GP4ANXX:1:2303:3877:27572#GCGGAC        163     chr1    17038   0       100M    =       17094   156     GAAGTAGTTCTCTGGGACCTGCAAGATTAGGCAGGGACATGTGAGAGGTGACAGGGACCTGCAGGGGCAGCCAACAAGACCTTGTGTGCACCTCCCATGG    BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:1307:10764:27549#TGACAT       163     chr1    17067   0       100M    =       17146   179     GGCAGGGACATGTGAGAGGTGACAGGGACCTGCAGGGGCAGCCAACAAGACCTTGTGTGCACCTCCCATGGGTGGAATAAGGGGCCCAACAGCCTTGACT    BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:1213:19344:15127#CGTACG       147     chr1    17074   0       100M    =       17010   -164    ACATGTGAGAGGTGACAGGGACCTGCAGGGGCAGCCAACAAGACCTTGTGTGCACCTCCCATGGGTGGAATAAGGGGCCCAACAGCCTTGACTGGAGAGG    FFFFBFFFFFFBFFFFFFFBFFFFFFFBFFFFFFFFFFFF<FFBFF<FFFFFFFFBFFFFFFFFFFBBFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:2308:10967:94378#TTGACT       83      chr1    17074   0       100M    =       16996   -178    ACATGTGAGAGGTGACAGGGACCTGCAGGGGCAGCCAACAAGACCTTGTGTGCACCTCCCATGGGTGGAATAAGGGGCCCAACAGCCTTGACTGGAGAGG    FFFFF<F<///FFF<<FBBB<FFBFFFFFFFFFFFFFBFFFF<FFFBFFFBF<FFB<BFFFFFFFFFFBF<FBFFFFFFFFFFFFFFFBFFFFFFBBBBB    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:2303:3877:27572#GCGGAC        83      chr1    17094   0       100M    =       17038   -156    NNNNGCAGGGGCAGCCAACAAGACCTTGTGTGCACCTCCCATGGGTGGAATAAGGGGCCCAACAGCCTTGACTGGAGAGGAGCTCTGGCAAGGCCCTGGG    BFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB    NM:i:4  MD:Z:0A0C0C0T96 AS:i:96 XS:i:96
HWI-ST218:512:C7GP4ANXX:1:1307:10764:27549#TGACAT       83      chr1    17146   0       100M    =       17067   -179    AGGGGCCCAACAGCCTTGACTGGAGAGGAGCTCTGGCAAGGCCCTGGGCCACTGCACCTGTCTCCACCTCTGTCCCACCCCTCCCACCTGCTGTTCCAGC    FFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:2307:19743:14016#CTCTAC       163     chr1    17150   0       100M    =       17267   217     GCCCAACAGCCTTGACTGGAGAGGAGCTCTGGCAAGGCCCTGGGCCACTGCACCTGTCTCCACCTCTGTCCCACCCCTCCCACCTGCTGTTCCAGCTGCT    BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:1101:13442:62440#GGCCAC       99      chr1    17159   0       100M    =       17217   158     CCTTGACTGGAGACGAGCTCTGGCAAGGCCCTGGGCCACTGCACCTGTCTCCACCTCTGTCCCACCCCTCCCACCTGCTGTTCCAGCTGCTCTCTCTTGC    BBBBBFFFFFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFBFFFFFFF<FFFBFFFFFFF    NM:i:1  MD:Z:13G86      AS:i:95 XS:i:95
HWI-ST218:512:C7GP4ANXX:1:2301:14412:20516#GCGGAC       99      chr1    17209   0       100M    =       17283   174     CCACCTCTGTCCCACCCCTCCCACCTGCTGTTCCAGCTGCTCTCTCTTGCTGATGGACAAGGGGGCATCAAACAGCTTCTCCTCTGTCTCTGCCCCCAGC    BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFB/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:2303:18559:70135#TGACAT       99      chr1    17209   0       100M    =       17275   166     CCACCTCTGTCCCACCCCTCCCACCTGCTGTTCCAGCTGCTCTCTCTTGCTGATGGACAAGGGGGCATCAAACAGCTTCTCCTCTGTCTCTGCCCCCAGC    BBBBBFFFBFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFF<FFFFF/FFBFFFF/BBFB//BB<BFFFFFFFFFFFFFFFFFF/FFBFFFFFFBFBFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:1304:5501:35478#TTGACT        81      chr1    17311   0       100M    chr9    17178   0       CACATGGGTCTTTGTTACAGCACCAGCCAGGGGGTCCAGGAAGACATACTTCTTCTACCTACAGAGGCGACATGGGGGTCAGGCAAGCTGACACCCGCTG    #BFFF/BFFBF7<7BBFFFFFBF<FB<FFB<//BFFFFFB</FFFFFFFFFFFBBFF<FFFF<FFFFFFFFFFFFFFFFFFFFFFFFFBFBFFFB/<<B<    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:1214:20018:83329#CCACTC       163     chr1    17321   40      100M    =       17412   184     TTTGTTACAGCACCAGCCAGGGGGTCCAGGAAGACATACTTCTTCTACCTACAGAGGCGACATGGGGGTCAGGCAAGCTGACACCCGCTGTCCTGAGCCC    BBBBBFFFFFFFFFFFFFFFFFFFFFF/FF<FFFFFFFFFFFFFFFFFFFFFFFFF<FF<FFF<FBFFFFFFFFBFFB/FBFBBF###############    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:2108:2814:58060#CCACTC        163     chr1    17321   40      100M    =       17412   184     TTTGTTACAGCACCAGCCAGGGGGTCCAGGAAGACATACTTCTTCTACCTACAGAGGCGACATGGGGGTCAGGCAAGCTGACACCCGCTGTCCTGAGCCC    BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:1107:6789:55675#CGTACG        163     chr1    17346   40      100M    =       17402   156     CCAGGAAGACATACTTCTTCTACCTACAGAGGCGACATGGGGGTCAGGCAAGCTGACACCCGCTGTCCTGAGCCCATGTTCCTCTCCCACATCATCAGGG    BBBBBFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF    NM:i:0  MD:Z:100        AS:i:100        XS:i:100
HWI-ST218:512:C7GP4ANXX:1:2102:6821:92451#CTCTAC        129     chr1    17370   0       100M    chr15   102513436       0       TACAGAGGCAACATGGGGGTCAGGCAAGCTGACACCCGCTGTCCTGAGCCCATGTTCCTCTCCCACATCATCAGGGGCACAGCGTGCACTGTGGGGTCCC    BBBBBFFFFBBFBF<FBFFBFFFFFFFFFFFFFFFFFFFBFFFFFFFFFFFB<BFFFFFFBFFFFFFFFFFFFFFB<B<FFFFFBFFFFFBFBFBBFBFF    NM:i:1  MD:Z:9G90       AS:i:95 XS:i:95
HWI-ST218:512:C7GP4ANXX:1:1107:6789:55675#CGTACG        83      chr1    17402   45      100M    =       17346   -156    CACCCGCTGTCCTGAGCCCATGTTCCTCTCCCACATCATCAGGGGCACAGCGTGCACTGTGGGGTCCCAGGCCTCCCGAGCCGAGCCACCCGTCACCCCC    <7<B7/BFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBFFFFBFBBBBB    NM:i:0  MD:Z:100        AS:i:100        XS:i:92
HWI-ST218:512:C7GP4ANXX:1:1202:2230:41463#CGAAAC        81      chr1    17412   21      7S93M   chr9    17270   0       NNNNNNNCCTGAGCCCATGTTCCTCTCCCACATCATCAGGGGCACAGCGTGCACTGTGGGGTCCCAGGCCTCCCGAGCCGAGCCACCCGTCACCCCCTGG    FFFFFF<FFFFBFFFFFBFFBFF<F/FFFBB<//7<<BB/FBFFFFFFBFFFFFFFBFFFFFFFBBFBFFBFFFFBFFBFFFFFFFFFFFFFFFFBBBBB    NM:i:0  MD:Z:93 AS:i:93 XS:i:85 XA:Z:chr15,+102513657,12M2D81M7S,2;chrX,+155252226,12M2D81M7S,2;chrY,+59355232,12M2D81M7S,2;
HWI-ST218:512:C7GP4ANXX:1:1214:20018:83329#CCACTC       83      chr1    17412   45      7S93M   =       17321   -184    NNNNNNNCCTGAGCCCATGTTCCTCTCCCACATCATCCGGGGCACAGCGTGCACTGTGGGGTCCCAGGCCTCCCGAGCCGAGCCACCCGTCACCCCCTGG    FFFFFBBFFBFFFFFBFFFFFFFFFFBFFFFFFBB/</FFFFFFFFFFFFFFFBFFFFFF<FFFFFFFFFF<FBFFFFBF/BFFFFFFFFFFFFFBBBB<    NM:i:1  MD:Z:30A62      AS:i:88 XS:i:80
HWI-ST218:512:C7GP4ANXX:1:2108:2814:58060#CCACTC        83      chr1    17412   45      7S93M   =       17321   -184    NNNNNNNCCTGAGCCCATGTTCCTCTCCCACATCATCAGGGGCACAGCGTGCACTGTGGGGTCCCAGGCCTCCCGAGCCGAGCCACCCGTCACCCCCTGG    FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFBBBBB    NM:i:0  MD:Z:93 AS:i:93 XS:i:85
HWI-ST218:512:C7GP4ANXX:1:1214:4075:22725#GGAACT        163     chr1    17431   41      100M    =       17507   176     CCCACATCATCAGGGGCACAGCGTGCACTGTGGGGTCCCAGGCCTCCCGAGCCGAGCCACCCGTCACCCCCTGGCTCCTGGCCTATGTGCTGTACCTGTG    BBBBBBFFFFFFFFFFFBFFFFFFFF<FFFFFFFFFFFF<FF<BFFFFFFFFFFFFFFFFF<BFFFB/FFFFFFFF<FFFFFFFFFFFFFFFFFF#####    NM:i:0  MD:Z:100        AS:i:100        XS:i:92 XA:Z:chrX,-155252200,38M2D62M,2;chrY,-59355206,38M2D62M,2;chr15,-102513631,38M2D62M,2;chr9,+17542,62M2D38M,3;chr12,-88093,38M2D62M,4;

Flagstat from the old bam file:

81003600 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
0 + 0 supplementary
0 + 0 duplicates
79673553 + 0 mapped (98.36% : N/A)
81003600 + 0 paired in sequencing
40501800 + 0 read1
40501800 + 0 read2
78475118 + 0 properly paired (96.88% : N/A)
79450142 + 0 with itself and mate mapped
223411 + 0 singletons (0.28% : N/A)
513576 + 0 with mate mapped to a different chr
504939 + 0 with mate mapped to a different chr (mapQ>=5)

Flagstat from the new bam file:

81128792 + 0 in total (QC-passed reads + QC-failed reads)
0 + 0 secondary
125192 + 0 supplementary
0 + 0 duplicates
81023339 + 0 mapped (99.87% : N/A)
81003600 + 0 paired in sequencing
40501800 + 0 read1
40501800 + 0 read2
74405544 + 0 properly paired (91.85% : N/A)
80839590 + 0 with itself and mate mapped
58557 + 0 singletons (0.07% : N/A)
691704 + 0 with mate mapped to a different chr
531479 + 0 with mate mapped to a different chr (mapQ>=5)
mapping bwa alignment genome sequencing • 2.0k views
ADD COMMENT
1
Entering edit mode

hum reads are always badly aligned on 5' of chr1. What is the output of `samtools flagstats your.bam'

ADD REPLY
0
Entering edit mode

Thanks for your suggestions, the strange thin is that I inspected the old file (which was produced from the same reads) on the same region and I saw only rare reads with 0 MAPQ. I also compared the same reads between the two bam files, and in the old file they had MAPQ>0 and the new file had MAPQ of 0. Anyway, I attach the flagstat from the old and the new files, I hope it will help.

ADD REPLY
0
Entering edit mode

Okay, now I checked a different region (chr3:100000) and I see that all/most reads are properly aligned with high MAPQ. Still strange that the variant caller I used produced an empty file gatk's mutec2.

ADD REPLY
0
Entering edit mode

in the new bam file from bwa mem2 (using the same reads as in the original bam file) most reads had 0 mapping quality for an unknown reason.

so your assertion was wrong isn't it ?

ADD REPLY

Login before adding your answer.

Traffic: 1675 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6