How can I drop the MAPQ info of my sam/bam file?
3.5 years ago
francois ▴ 80

For example, the first alignment in my SAM file currently looks like

08b20551-b31c-4170-a04c-40a34ee8b71f    0   chr20   4699371 60  272H123M637H    *   0   0   CCTCAGGGCGGTGGTGGCTGGGGGCAGCCTCATGGTGGTGGCTGGGGGCAGCCTCATGGTGGTGGCTGGGGGCAGCCCCATGGTGGTGGCTGGGGACAGCCTCATGGTGGTGGCTGGGGTCAA &'$'()+2220/,+.203:=?65<?BDDCF:B@>69;AF?FLA=?CB>=83/0-@@?=>@A52.;AA@CGA45=DFHG@?A?;<9A:A>=:=8<=@=B56B:@@BCHFHG?4@<::=98=:79 ms:i:1920   AS:i:1920   nn:i:0  tp:A:P  cm:i:158    s1:i:929    s2:i:0  de:f:0.0148 rl:i:0  HP:i:2  PC:i:30 PS:i:4688888

I would like to delete the MAPQ (aka qual) component. Expected:

08b20551-b31c-4170-a04c-40a34ee8b71f    0   chr20   4699371 60  272H123M637H    *   0   0   CCTCAGGGCGGTGGTGGCTGGGGGCAGCCTCATGGTGGTGGCTGGGGGCAGCCTCATGGTGGTGGCTGGGGGCAGCCCCATGGTGGTGGCTGGGGACAGCCTCATGGTGGTGGCTGGGGTCAA ms:i:1920   AS:i:1920   nn:i:0  tp:A:P  cm:i:158    s1:i:929    s2:i:0  de:f:0.0148 rl:i:0  HP:i:2  PC:i:30 PS:i:4688888

How may I do this?


drop the MAPQ info

You will break SAM format if you do that. Why not ignore it if you are not interested in it? Or did you mean to replace the current value?


Ah, I see. A batch of my samples has * for the MAPQ field, but this file has the full MAPQ info. I need the formats to match for downstream analysis.

So yes, replacing with * would work. Even better, actually.

How can I do that?


MAPQ: MAPping Quality. It equals −10 log10 Pr{mapping position is wrong}, rounded to the nearest integer. A value 255 indicates that the mapping quality is not available.

In your example above, the MAPQ field (column 5) is 60. The * is in the RNEXT field (column 7).

See SAM format spec.
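
To make the quoted formula concrete: inverting it gives Pr{mapping position is wrong} = 10^(-MAPQ/10), so a MAPQ of 60 corresponds to an error probability of roughly one in a million. A minimal sketch of that conversion, where `mapq_to_prob` is a made-up helper name and not part of any tool:

```shell
# Convert a MAPQ value to the implied probability that the mapping position is wrong.
# mapq_to_prob is a hypothetical helper name used only for illustration.
mapq_to_prob() {
    awk -v q="$1" 'BEGIN { printf "%g\n", 10 ^ (-q / 10) }'
}

# Example: the MAPQ value from the alignment in the question.
mapq_to_prob 60
```

The special value 255 means "not available", so it should not be run through this conversion.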

ADD REPLY
0
Entering edit mode
3.5 years ago

Looks like you don't want to replace the mapping quality (MAPQ), but the base call quality (QUAL) instead. Mapping quality is 60 in your example.
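
A quick way to check which column is which is to print the 11 mandatory SAM fields of the first alignment next to their names. This is only a sketch: `label_sam_fields` is a made-up function name, and it assumes tab-delimited SAM text as input.

```shell
# Print each of the 11 mandatory SAM columns of the first alignment line
# next to its field name. Header lines (starting with "@") are skipped.
# label_sam_fields is a hypothetical helper name used only for illustration.
label_sam_fields() {
    awk 'BEGIN {
             FS = "\t"
             n = split("QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL", name, " ")
         }
         /^@/ { next }
         { for (i = 1; i <= n; i++) printf "%2d %-5s %s\n", i, name[i], $i; exit }' "$@"
}
```

Run it as `label_sam_fields in.sam`, or for a BAM file as `samtools view in.bam | label_sam_fields`. Column 5 is MAPQ and column 11 is QUAL.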


Ah you're right, my bad!

I found a suitable solution:

cat in.sam | awk -v OFS='\t' '$11="*"' > out.sam

Not particularly elegant because it affects the header lines as well, but good enough for what I am currently doing...
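
If the header lines need to stay intact, a variant along these lines should work. It is a sketch under two assumptions: the input is tab-delimited SAM text, and header lines are recognised by their leading "@". The function name `strip_qual` is made up for illustration. Setting the field separator to a tab explicitly also avoids awk's default whitespace splitting, which would turn any spaces inside a field into tabs when the line is rebuilt.

```shell
# Replace QUAL (column 11) with "*" on alignment lines only.
# Header lines (starting with "@") are printed unchanged.
# strip_qual is a hypothetical helper name used only for illustration.
strip_qual() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^@/  { print; next }
         { $11 = "*"; print }' "$@"
}
```

Usage would be `strip_qual in.sam > out.sam`. For a BAM file, something like `samtools view -h in.bam | strip_qual | samtools view -b -o out.bam -` should do the same, assuming samtools is installed.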
