I followed the samtools/bcftools/vcfutils pathway described here to convert a set of human hg19-aligned BAM files into a set of raw VCF files. I then used vcftools to filter down to just autosomal SNPs. These are really, really, really low-coverage genomes (they were enriched for NRY and/or mtDNA, and I am just trying to make use of the "leftovers").
Now I have the data I want, but I am trying to find out how much of it is actually usable. What are good filtering parameters for tossing/keeping human SNPs (or where can I find said parameters)? Thanks!
-Deven
@Ashutosh,
could you explain the reasoning behind the thresholds you typically use? It would help researchers understand the parameters better. Thanks!
You are right to question this. Indeed, there are absolutely no standards for these filtering criteria. Take a look at my take on DP alone in "A: DP in VCF files?"
I know vcftools can filter based on DP/QUAL. Do you have any recommendations on what to use for the other filtering? Thanks!
This one does almost everything that's mentioned above.
I have my own Python script. If you know Python, you can modify it for your use. Or you can use the vcftools "annotate" feature (vcf-annotate). I think the second option will be much better.
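For anyone who wants a starting point for the Python route, here is a minimal sketch of a QUAL/DP filter. It is not the script mentioned above. It assumes a plain-text, tab-delimited VCF with the depth stored as `DP=` in the INFO column, and the function names (`parse_info`, `keep_record`, `filter_vcf`) are hypothetical. The cutoffs in the usage note are placeholders, not recommendations, since (as noted earlier in the thread) there is no standard for them.

```python
def parse_info(info_field):
    """Turn 'DP=3;INDEL' into {'DP': '3', 'INDEL': True}."""
    info = {}
    for item in info_field.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        # Keys without '=' are flags (e.g. INDEL); store them as True.
        info[key] = value if sep else True
    return info


def keep_record(line, min_qual, min_dp, max_dp):
    """Return True if a VCF data line passes the QUAL and DP cutoffs."""
    fields = line.rstrip("\n").split("\t")
    qual = fields[5]  # column 6 of a VCF record is QUAL
    if qual == "." or float(qual) < min_qual:
        return False
    dp = parse_info(fields[7]).get("DP")  # column 8 is INFO
    if dp is None or dp is True:
        # No usable depth value: drop the site (an assumption of this sketch).
        return False
    return min_dp <= int(dp) <= max_dp


def filter_vcf(lines, min_qual, min_dp, max_dp):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep_record(line, min_qual, min_dp, max_dp):
            yield line
```

To use it, open the VCF as text and write out whatever `filter_vcf(handle, min_qual=..., min_dp=..., max_dp=...)` yields. The upper DP bound is there because unusually deep sites can come from repeats or mis-mapped reads. For very low-coverage data like this, the lower bound will probably have to be small or most sites will be dropped.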