Hi,
I am trying to call SNPs with at least 1X depth (My estimation is that most of the SNPs I want to see have indeed 1X / 2X depth). I have mapped consensus sequences to the Tomato reference genome (generating in theory 1X/2X depth on specific target positions) with Bowtie2 and perform SNP/INDEL calling with samtools mpileup and bcftools. However, I can't retrieve SNPs with 1X depth although I do mention it on the command line. Can anyone help me?
Commands:
for directory in *; do for file in $directory/*sorted*; do prefix=$(echo $file | rev | cut -f1 -d '/' | rev | cut -f1 -d '.'); samtools mpileup -uf ../../Data/SolanumLycopersicum_gDNA_2.40.fa $file | bcftools view -bvcg - > $directory/$prefix.bcf & done; done
for directory in *; do for file in $directory/*bcf*; do prefix=$(echo $file | rev | cut -f1 -d '/' | rev | cut -f1 -d '.'); bcftools view $file | vcfutils.pl varFilter -d1 -D100 > $directory/$prefix.vcf & done; done
The variant calling is done by
samtools mpileup
, so you'll need to make changes there. I don't actually know that it's possible to call variants from 1x coverage with samtools (most NGS data isn't reliable enough to call variants from 1x coverage).I seem to recall someone posting a tool on seqanswers a while back (years?) that was designed for variant calling from very low coverage (maybe not 1X). If I can find that I'll post the link, but maybe someone else knows what I am talking about and can comment.
Someone on SEQanswers today posted this link that includes a program that might do what you want. You should completely ignore his discussion about samtools and varscan, the author doesn't actually have a clue what he's doing. However, I suspect that the C++ program he wrote will do what you need and function reasonably to do what you want.
I think I actually found a much more easier way. Bcftools has a -p option, which shows all filtered variation (including those with 1 depth). I assume that this solves my issue? ;)
Most likely, good catch