I reference-mapped paired-end Illumina reads with BWA and called variants with SAMtools. The resulting VCF was filtered to remove high-coverage SNPs with vcfutils.pl varFilter -D30, and then filtered for low-quality SNPs using awk '($3=="*"&&$6>=50)||($3!="*"&&$6>=20)'. I graphed the distribution of SNP quality and observed a huge peak at 222. I repeated this with other samples and observed the same peak. Any clues as to why I may be seeing this?
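In case anyone wants to check their own calls for the same peak, here is a minimal sketch of tallying the QUAL values without plotting. It assumes a standard tab-delimited VCF where QUAL is the 6th column; the function name qual_histogram and the example records are made up for illustration and are not from my data.

```python
from collections import Counter


def qual_histogram(vcf_lines):
    """Count how often each QUAL value (6th VCF column) occurs."""
    counts = Counter()
    for line in vcf_lines:
        # Skip header/meta lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        # Skip malformed records and records with a missing QUAL (".").
        if len(fields) < 6 or fields[5] == ".":
            continue
        counts[float(fields[5])] += 1
    return counts


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t222\t.\tDP=20",
    "chr1\t250\t.\tC\tT\t222\t.\tDP=18",
    "chr1\t400\t.\tG\tA\t35.5\t.\tDP=7",
]

counts = qual_histogram(example)
# Print the most frequent QUAL values first.
for qual, n in counts.most_common(3):
    print(qual, n)
```

For a real file, an open file handle can be passed instead of the list, e.g. qual_histogram(open("calls.vcf")), where calls.vcf is a placeholder name. If 222 dominates the top of the list, the peak is in the calls themselves and not an artifact of the plotting.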
I should also mention that I filtered with vcfutils varFilter -d 5 -D 25. Mapping and SNP calling were done with the same software as Juliofdiaz.