Dear friends, how can I find the mismatches in Bowtie 2 output?
8.0 years ago
biologo ▴ 40

Dear friends,

I would like to know how to find the mismatches in Bowtie 2 output. When I run Bowtie, the output looks like this (it reports the base conversions in each read, along with their positions):

J00142:71:HGL3TBBXX:8:1101:12540:1666 2:N:0:1   +       chr17   8773052 GCACACGGCTGAACGCCAG     <FJJJ7FAJF7<<FFAA-7     0       12:C>A
J00142:71:HGL3TBBXX:8:1101:2717:1666 2:N:0:1    -       chr2    33141441        GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGAA      JJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJFFAAAJAA<77-JJJJJJJJJJJJFJJJJFF7JJJJAA7A77<<-<     51:G>A,32:A>G,72:A>G

But Bowtie 2 generates output like this. How can I get the base conversions from it, or could you explain the meaning of the tag MD:Z:21A41A6T12T0T0T0?

J00142:71:HGL3TBBXX:8:1101:1164:1683    16      chrY    10037829        3       49M1I37M        *       0       0       TGTGAATTGCAGGACACATTGCTCATCGACACTTCGAACGCACTTGCGGCCCCGGGTTCCTCCCGGGGCTACGCCTGTCTGAGCGCC JAJ7A7AA7-7AFA7J-AF-F-JFFJF<-F-JJFJFFJFAJJF7FJ<FA7-JA<AFFA<FFJAJJJJFFJJ<FJJJJJFFJAF<AAA AS:i:-36        XN:i:0  XM:i:6  XO:i:1  XG:i:1  NM:i:7  MD:Z:21A41A6T12T0T0T0   YT:Z:UU

Thank you so much!

RNA-Seq • 1.8k views

If I understand the question correctly, this value

49M1I37M

means:
  1. 49 aligned bases (M covers both matches and mismatches)
  2. followed by 1 insertion
  3. then 37 more aligned bases
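As a rough illustration of reading the CIGAR string above, here is a minimal Python sketch. The helper name `parse_cigar` is made up for this example. It assumes the standard SAM operation letters, where M is an alignment match that can be either a sequence match or a mismatch.

```python
import re

# One CIGAR operation: a length followed by a SAM operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs (hypothetical helper)."""
    return [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]

ops = parse_cigar("49M1I37M")
print(ops)  # [(49, 'M'), (1, 'I'), (37, 'M')]

# M, I, S, = and X consume read bases; M, D, N, = and X consume reference bases.
read_len = sum(length for length, op in ops if op in "MIS=X")
ref_len = sum(length for length, op in ops if op in "MDN=X")
print(read_len, ref_len)  # 87 86
```

Because M does not distinguish matches from mismatches, the CIGAR alone cannot say where the mismatches are; that information is in the MD tag.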

More can be found here, in point 1.4, section 6.

Regarding the explanation of the MD tag:

The MD tag allows SNP/indel calling without looking at the reference.

more info


I am very grateful for your kind help! Thank you.


I find MD tags a pain to deal with. Fortunately, the current SAM spec allows SAM files whose CIGAR strings use X to indicate a mismatch and = to indicate a match. If you map your reads with BBMap, it will do this by default, so you don't have to look at the MD tag. For your existing mapped reads, you can also convert them with BBMap's Reformat utility, like this:

reformat.sh in=mapped.sam out=better.sam sam=1.4

That will change reads with a CIGAR string like "100M" to something like "20=1X79=", which clearly indicates that there is a mismatch at position 21.
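Once the CIGAR string uses = and X, the mismatch positions can be read off directly. The following is only a sketch, with a made-up helper name `mismatch_positions`. It reports 1-based positions within the read sequence as stored in the SAM record.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def mismatch_positions(cigar):
    """Return the 1-based read positions covered by X operations (hypothetical helper)."""
    positions, read_pos = [], 0
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "X":
            positions.extend(range(read_pos + 1, read_pos + length + 1))
        if op in "MIS=X":  # operations that consume read bases
            read_pos += length
    return positions

print(mismatch_positions("20=1X79="))  # [21]
```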

8.0 years ago
michael.ante ★ 3.9k

Hi biologo,

You can find the specification of the MD tag here: SAMtags.pdf.

MD:Z:21A41A means you have 21 matches followed by a mismatch where the reference has an A. The next 41 bases are matches again, followed by another mismatch where the reference has an A.
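To make this concrete, here is a minimal Python sketch that walks an MD value and, combined with the CIGAR string and the read sequence, reports each mismatch as `offset:reference>read`. The function names are invented for this example. Offsets are 1-based and counted along the reference from the alignment start, which is an assumption of this sketch and not the Bowtie 1 convention. It also assumes there are no N (skipped region) operations in the CIGAR.

```python
import re

MD_TOKEN = re.compile(r"(\d+)|\^([A-Z]+)|([A-Z])")
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def md_mismatches(md):
    """Return [(ref_offset, ref_base)]; offsets are 1-based from the alignment start."""
    pos, out = 0, []
    for run, deleted, base in MD_TOKEN.findall(md):
        if run:
            pos += int(run)      # run of matching bases
        elif deleted:
            pos += len(deleted)  # reference bases missing from the read
        else:
            pos += 1             # a single mismatched position
            out.append((pos, base))
    return out

def read_index_by_ref_offset(cigar):
    """Map each 0-based reference offset to an index in SEQ (None inside deletions)."""
    mapping, read_i = [], 0
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "M=X":
            mapping.extend(range(read_i, read_i + length))
            read_i += length
        elif op in "IS":
            read_i += length     # bases present only in the read
        elif op == "D":
            mapping.extend([None] * length)
        # H and P consume neither sequence; N is assumed absent here.
    return mapping

def conversions(cigar, md, seq):
    """Format each mismatch as 'offset:ref>read' (hypothetical output format)."""
    mapping = read_index_by_ref_offset(cigar)
    return [f"{pos}:{ref}>{seq[mapping[pos - 1]]}" for pos, ref in md_mismatches(md)]

print(md_mismatches("21A41A6T12T0T0T0"))
# [(22, 'A'), (64, 'A'), (71, 'T'), (84, 'T'), (85, 'T'), (86, 'T')]
print(conversions("3M1I2M", "2A2", "ACGTTA"))  # ['3:A>G']
```

The second call uses a small made-up alignment. For a real record, pass its CIGAR, MD value and SEQ fields. The six mismatches found in the tag from the question agree with XM:i:6 in that record.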

I hope this helps as a start.

Cheers,

Michael


You really helped me a lot; there is little information about the MD:Z tag. Thank you again!
