How to do INDEL alignment
1
0
Entering edit mode
7.9 years ago
es.oh • 0

Hello,

I need your helps.

I have NGS data obtained from a lung cancer patient, and the driver mutation is a complex mutation where insertion and deletion are combined.

The inserted bases are CCCCT and the deleted ones are TCTTTCTTTCTCTCTGTTTTAAGATC.

Below is the one of the reads that are harbouring the complex mutation.

TATGTAGTCCATAAAACCCATGAGTTCTGGGCACTGGGTCAAAGTCTCCTGGGGCCCATGATAGCCGTCTTTAACAAGCCCCCTTGGGCAGTGAATTAGTTCGCTACGATGCAAGAGTACACACTCCTCATTTGGATAGGCTTGT

The CIGAR string of the read alined by BWA is 84M21D61M.

BWA did not recongize the inserted bases, aligning them as mismatched bases.

I need the alignment results with both insertion and deletion.

How can I do this?

Below is the ideal alignment results that I hope.

ref   : TATGTAGTCCATAAAACCCATGAGTTCTGGGCACTGGGTCAAAGTCTCCTGGGGCCCATGATAGCCGTCTTTAACAAGC-----TCTTTCTTTCTCTCTGTTTTAAGATCTGGGCAGTGAATTAGTTCGCTACGATGCAAGAGTACACACTCCTCATTTGGATAGGCTTGT

query : TATGTAGTCCATAAAACCCATGAGTTCTGGGCACTGGGTCAAAGTCTCCTGGGGCCCATGATAGCCGTCTTTAACAAGCCCCCT--------------------------TGGGCAGTGAATTAGTTCGCTACGATGCAAGAGTACACACTCCTCATTTGGATAGGCTTGT
alignment • 3.7k views
ADD COMMENT
1
Entering edit mode

bbmap (from BBMap suite) is good at recognizing indels? Can you give it a try?

ADD REPLY
0
Entering edit mode
7.9 years ago
Tommy Au ▴ 70

If your goal to have such "ideal alignment" is accurately calling the variant as a single variant in VCF format (complex indel), you may want to take a look at INDELseek, which calls such complex indel directly from the BWA alignments.

I tried to reproduce your situation using your example sequence read. My BWA-MEM v0.7.7 returned the same CIGAR string 84M21D61M. INDELseek calls the MET exon 14 complex indel as expected:

>cat biostars227375.fastq
@seq1
TATGTAGTCCATAAAACCCATGAGTTCTGGGCACTGGGTCAAAGTCTCCTGGGGCCCATGATAGCCGTCTTTAACAAGCCCCCTTGGGCAGTGAATTAGTTCGCTACGATGCAAGAGTACACACTCCTCATTTGGATAGGCTTGT
+
GGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG


>~/tools/bwa-0.7.7/bwa mem -R '@RG\tID:biostars227375\tSM:biostars227375' -M ~/bundle_2.8_hg19/ucsc.hg19.fasta biostars227375.fastq | ~/tools/samtools-1.3.1/samtools view -bS - | ~/tools/samtools-1.3.1/samtools sort > biostars227375.bam && ~/tools/samtools-1.3.1/samtools index biostars227375.bam
>~/tools/samtools-1.3.1/samtools view biostars227375.bam | cut -f1-6
seq1    0       chr7    116411801       60      84M21D61M


>~/tools/samtools-1.3.1/samtools view biostars227375.bam | ~/tools/indelseek/indelseek.pl --samtools ~/tools/samtools-1.3.1/samtools --refseq ~/bundle_2.8_hg19/ucsc.hg19.fasta | grep -v "^#"
chr7    116411880       .       TCTTTCTTTCTCTCTGTTTTAAGATC      CCCCT   38.00   LowDepth        DP2=1,0;QS2=38.00,-1.00
ADD COMMENT
2
Entering edit mode

edited to fix broken URL

ADD REPLY

Login before adding your answer.

Traffic: 1530 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6