Identifying the mismatched bases and their locations on the chromosome in a .sam file
11.6 years ago
soosus ▴ 90

How do I find the mismatched bases in these two segments:

962_1394_1642 163    chr1   156    1      25M10H =      329    197       CTAACCCTAACCCTAACCCTAACCT  IIIIIIIIIIIIII,,IICEI((5/  NH:i:3       RG:Z:20110729204302219     CS:Z:T22301002301002331002303022302003000       CQ:Z:0B=@B.=7BB1=;@B,5?.60;(++%3)'%%)-;%       SM:i:0 CM:i:2 NM:i:0 MD:Z:25
1517_881_880 147    chr1   159    22     31M4H  =      349    215       ACCCTAACCCTAACCCTAACCTAACAATAAC  IIIIIIIIIDIIIH=CIIG@:')G)''&13,  NH:i:1       RG:Z:20110729204302219     CS:Z:T31002301002301002301022011033011.20       CQ:Z:?9=6>;3>=-8?;54*:<;-4')355)'&,(%!%(       SM:i:4 CM:i:3 NM:i:2       MD:Z:25C0C4
sam • 4.3k views
11.6 years ago

If you want to find the mismatches against a reference genome, you can pull out the corresponding region of the reference, 'chr1:156-180' (the read aligns 25 bases starting at position 156, so the last aligned base is 156 + 25 - 1), and then perform a sequence alignment of the read against it. Do the same thing with the second segment (chr1:159-189 for its 31 aligned bases). If you need to compare the two segments with each other, you can simply take the two read sequences and align them directly.
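If you would rather not re-align, the MD tag that is already in both records encodes the same information, so here is a rough Python sketch that decodes it instead of doing a sequence alignment. This is a different technique from the alignment described above, and it trusts whatever the aligner wrote into MD. The function name `md_mismatches` is made up for illustration, and the sketch assumes reads without insertions, soft clips, or skipped regions (true for both reads in the question, which only have M and H operations).

```python
import re

def md_mismatches(pos, cigar, seq, md):
    """Return (ref_pos, ref_base, read_base) for each mismatch in an MD string.

    pos   : 1-based leftmost mapping position (SAM column 4)
    cigar : CIGAR string (SAM column 6)
    seq   : read sequence as stored in the SAM record (SAM column 10)
    md    : value of the MD:Z: tag, without the 'MD:Z:' prefix
    """
    # Assumption: no insertions, soft clips, skips, or padding, so the read
    # index can be tracked from the MD string alone.
    if re.search(r"[ISNP]", cigar):
        raise ValueError("sketch only handles M/=/X/D/H CIGAR operations")

    ref_pos = pos   # reference coordinate of the next base to consume
    read_idx = 0    # 0-based index of the next base in seq
    hits = []
    # MD tokens: a run of matches, a deletion (^ plus reference bases),
    # or a single reference base at a mismatched position.
    for run, deletion, base in re.findall(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])", md):
        if run:
            ref_pos += int(run)
            read_idx += int(run)
        elif deletion:
            ref_pos += len(deletion)   # deleted bases are absent from the read
        else:
            hits.append((ref_pos, base, seq[read_idx]))
            ref_pos += 1
            read_idx += 1
    return hits

# The two records from the question
print(md_mismatches(156, "25M10H", "CTAACCCTAACCCTAACCCTAACCT", "25"))
# → []
print(md_mismatches(159, "31M4H", "ACCCTAACCCTAACCCTAACCTAACAATAAC", "25C0C4"))
# → [(184, 'C', 'A'), (185, 'C', 'A')]
```

So, going by the MD tags, the first read has no mismatches (consistent with NM:i:0) and the second has two, at chr1:184 and chr1:185, where the reference has C and the read has A (consistent with NM:i:2).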

Here is the tool for sequence alignment:

http://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE_TYPE=BlastSearch&PROG_DEF=blastn&BLAST_PROG_DEF=blastn&SHOW_DEFAULTS=on&BLAST_SPEC=GlobalAln&LINK_LOC=BlastHomeLink

BTW, the first read has NH:i:3 (it maps to multiple locations), so it may or may not actually belong to the region reported in the SAM file.
