How to convert a CIGAR string to an extended CIGAR string?
I have a CIGAR string and the MD tag of the corresponding read record, and I want to get the extended CIGAR string. Is there any Java/C/C++ code or library that allows me to do that?
cigar
cpp
bam
extended-cigar
sam
Use reformat.sh from the BBMap suite:
$ reformat.sh in=your.bam out=new.bam sam=1.4
I wrote samfixcigar, which rewrites the M operators as =/X using the reference FASTA: http://lindenb.github.io/jvarkit/SamFixCigar.html
$ cat toy.sam
@SQ SN:ref LN:45
@SQ SN:ref2 LN:40
r001 163 ref 7 30 8M4I4M1D3M = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
r002 0 ref 9 30 1S2I6M1P1I1P1I4M2I * 0 0 AAAAGATAAGGGATAAA *
r003 0 ref 9 30 5H6M * 0 0 AGCTAA *
r004 0 ref 16 30 6M14N1I5M * 0 0 ATAGCTCTCAGC *
r003 16 ref 29 30 6H5M * 0 0 TAGGC *
r001 83 ref 37 30 9M = 7 -39 CAGCGCCAT *
x1 0 ref2 1 30 20M * 0 0 aggttttataaaacaaataa ????????????????????
x2 0 ref2 2 30 21M * 0 0 ggttttataaaacaaataatt ?????????????????????
x3 0 ref2 6 30 9M4I13M * 0 0 ttataaaacAAATaattaagtctaca ??????????????????????????
x4 0 ref2 10 30 25M * 0 0 CaaaTaattaagtctacagagcaac ?????????????????????????
x5 0 ref2 12 30 24M * 0 0 aaTaattaagtctacagagcaact ????????????????????????
x6 0 ref2 14 30 23M * 0 0 Taattaagtctacagagcaacta ???????????????????????
$ java -jar dist/samfixcigar.jar \
-r samtools-0.1.19/examples/toy.fa \
samtools-0.1.19/examples/toy.sam
output:
@HD VN:1.4 SO:unsorted
@SQ SN:ref LN:45
@SQ SN:ref2 LN:40
r001 163 ref 7 30 8=4I4=1D3= = 37 39 TTAGATAAAGAGGATACTG * XX:B:S,12561,2,20,112
r002 0 ref 9 30 1S2I6=1P1I1P1I1X1=2X2I * 0 0 AAAAGATAAGGGATAAA *
r003 0 ref 9 30 2=1X3= * 0 0 AGCTAA *
r004 0 ref 16 30 6=14N1I5= * 0 0 ATAGCTCTCAGC *
r003 16 ref 29 30 5= * 0 0 TAGGC *
r001 83 ref 37 30 9= = 7 -39 CAGCGCCAT *
x1 0 ref2 1 30 16=1X3= * 0 0 AGGTTTTATAAAACAAATAA ????????????????????
x2 0 ref2 2 30 15=1X3=1X1= * 0 0 GGTTTTATAAAACAAATAATT ?????????????????????
x3 0 ref2 6 30 9=4I13= * 0 0 TTATAAAACAAATAATTAAGTCTACA ??????????????????????????
x4 0 ref2 10 30 1X3=1X20= * 0 0 CAAATAATTAAGTCTACAGAGCAAC ?????????????????????????
x5 0 ref2 12 30 2=1X21= * 0 0 AATAATTAAGTCTACAGAGCAACT ????????????????????????
x6 0 ref2 14 30 1X22= * 0 0 TAATTAAGTCTACAGAGCAACTA ???????????????????????
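For anyone who wants to see the idea rather than run a tool, a reference-based rewrite like the one above can be sketched in a few lines of Python. This is only an illustration and not samfixcigar's actual code: the function name `extend_cigar_with_reference` is made up, and it assumes the whole reference sequence is already in memory as a string. Each M run is compared base by base against the reference and split into = and X runs; every other operator is copied through unchanged.

```python
import re
from itertools import groupby

# One CIGAR operation: a length followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def extend_cigar_with_reference(cigar, read_seq, ref_seq, pos):
    """Split each M run into '=' / 'X' runs by comparing read and reference.

    pos is the 1-based leftmost reference position (the SAM POS field).
    The comparison is case-insensitive; IUPAC ambiguity codes are not
    treated specially (hypothetical helper, simplified on purpose).
    """
    out = []
    read_i = 0       # 0-based offset into read_seq
    ref_i = pos - 1  # 0-based offset into ref_seq
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "M":
            # True where the read base equals the reference base.
            flags = [
                read_seq[read_i + k].upper() == ref_seq[ref_i + k].upper()
                for k in range(n)
            ]
            # Collapse consecutive equal flags into one operation each.
            for is_match, run in groupby(flags):
                out.append(f"{len(list(run))}{'=' if is_match else 'X'}")
            read_i += n
            ref_i += n
        else:
            out.append(f"{n}{op}")
            if op in "IS=X":  # these operators consume read bases
                read_i += n
            if op in "DN=X":  # these operators consume reference bases
                ref_i += n
    return "".join(out)


# Made-up sequences, purely for illustration.
print(extend_cigar_with_reference("4M2I3M1D2M", "GTTCGGGTAGA", "ACGTACGTACGT", 3))
```

With these made-up inputs the call prints 2=1X1=2I3=1D1=1X. Hard clips and padding consume neither read nor reference, so they pass through untouched.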
What is an "extended CIGAR string"? Please give an example of input and output.
AFAIK, it uses nucleotide match/mismatch operators (=, X) instead of an alignment match (M).
For example: CIGAR string "100M", MD tag "43C5C43T6", output "43=C5=C43=T6=".
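If only the CIGAR string and the MD tag are available (no reference), the same result can be derived from the MD tag alone, since it records which aligned bases mismatch. Below is a rough Python sketch of that approach; the function name `extend_cigar_with_md` is hypothetical and the code assumes a well-formed MD tag. Note that with the standard = and X operators, the "100M" / "43C5C43T6" example would be written 43=1X5=1X43=1X6= rather than with the reference letters embedded.

```python
import re
from itertools import groupby

# One CIGAR operation: a length followed by an operator letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# MD tokens: a match count, a deletion (^ plus bases), or one mismatched base.
MD_TOKEN = re.compile(r"(\d+)|\^([A-Za-z]+)|([A-Za-z])")


def extend_cigar_with_md(cigar, md):
    """Rewrite M operations as '=' / 'X' using only the MD tag.

    Hypothetical helper: assumes a well-formed MD tag that describes the
    same alignment as the CIGAR string.
    """
    # One flag per aligned base: True = match, False = mismatch.
    aligned = []
    for count, _deleted, mismatch in MD_TOKEN.findall(md):
        if count:
            aligned.extend([True] * int(count))
        elif mismatch:
            aligned.append(False)
        # Deleted bases (^...) belong to D operations, which are copied as-is.

    out = []
    i = 0  # index of the next unused flag in `aligned`
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op == "M":
            # Collapse consecutive equal flags into one operation each.
            for is_match, run in groupby(aligned[i:i + n]):
                out.append(f"{len(list(run))}{'=' if is_match else 'X'}")
            i += n
        else:
            out.append(f"{n}{op}")
            if op in "=X":  # already extended, but still covered by MD
                i += n
    if i != len(aligned):
        raise ValueError("CIGAR and MD tag describe different numbers of aligned bases")
    return "".join(out)


print(extend_cigar_with_md("100M", "43C5C43T6"))
```

Insertions, soft clips and skipped regions never appear in the MD tag, so they pass straight through; only M (and any existing = or X) operations consume MD flags.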