Hi All,
I have probably a simple problem but which I cannot solve as of yet and puzzles me quite a bit. Using samtools mpileup and bcftools I am unable to call a small 6 base INSERT compared to my reference. Single SNPs are called ok. I have looked at many unanswered questions and potential answers here at BioStar and this question/answer gets closest but none actually has a working answer!
My play aorund data:
- Illumina PE 250 dataset of bacterial genome (2Mb at coverage 5k+)
- Assembled and reconstructed a genome from it
- If I check for SNPs by mapping the raw reads to the fresh assembly none are found (underlining the correct assembly)
Next:
- I created an artificial 2 position SNP and somewhere else I deleted 6 bases in the assembly
- I mapped the original reads which should detect the SNPs and the INSERT. Hoewever I only see the SNPs in the vcf and not the INSERT
My mapping is done with bowtie2 --local-sensitive settings to the ArtificialREFERENCE-6del sequence
Sam file is converted to bam and sorted using samtools
Then I call variants:
samtools mpileup -uf ArtificialREFERENCE-6del.fasta SORTED_BAMfile.bam -o SAMPLE_file.bcf
bcftools call -vm -o SAMPLE_file.vcf SAMPLE_file.bcf
I tried different bcftools settings like -c and -p 0.5 ... no luck...
Any help would be appreciated! It is probably something very obvious but I fail to see it from the documentation.
NB: I tried samtools/bcftools version 1.3.1 but also an older 1.1
NB2: I tried the additional options and FLAGS as proposed in the above linked post without luck.
NB3: The below is an IGV snippet of the involved BAM file clearly showing there IS a 6 base insert.
No one sees this behaviour?
I can confirm that it not only occurs with my data but also with at least another case when mapping/consensus calling viral sequences!