Filtering multisample vcf file by
7.7 years ago

Hello,

I have a VCF file that contains 200 DO mouse samples. I want to keep only the SNPs that have at least 5 samples of each genotype: each SNP needs at least 5 AA, 5 AB, and 5 BB calls. For example, a SNP with 190 AA, 6 AB, and 4 BB would be discarded, and so would one with 100 AA, 0 AB, and 100 BB. How would I go about doing this? I have been trying with vcftools, but can't quite get it to work. The rule doesn't have to be exact; I am just trying to keep the SNPs that give me the most information for telling cell lines apart.

Any help would be greatly appreciated.

Thank you

vcftools genotype • 2.2k views
7.7 years ago

Using vcffilterjs:

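java -jar dist/vcffilterjs.jar -e '
function accept(v) {
    // count hom-ref (AA), het (AB) and hom-var (BB) calls across all samples
    var nAA = 0, nBB = 0, nAB = 0;
    for (var i = 0; i < v.getNSamples(); ++i) {
        var g = v.getGenotype(i);
        if (g.isHomRef()) { nAA++; }
        else if (g.isHomVar()) { nBB++; }
        else if (g.isHet()) { nAB++; }
    }
    // keep the SNP only if each genotype is seen at least 5 times
    return nAA >= 5 && nBB >= 5 && nAB >= 5;
}
accept(variant);' input.vcf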
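If you would rather not install jvarkit, the same counting rule can be sketched in plain Python. This is a minimal sketch, not a replacement for a real VCF parser: it assumes an uncompressed, tab-separated VCF with diploid GT calls, treats any two identical non-reference alleles as BB and any two different alleles as AB, and skips missing calls. The function names (genotype_counts, keep, filter_vcf) and the min_count parameter are made up for this example.

```python
def genotype_counts(line):
    """Return (n_AA, n_AB, n_BB) for one VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return 0, 0, 0  # no FORMAT/sample columns on this line
    fmt = fields[8].split(":")
    if "GT" not in fmt:
        return 0, 0, 0
    gt_idx = fmt.index("GT")
    n_aa = n_ab = n_bb = 0
    for sample in fields[9:]:
        parts = sample.split(":")
        if gt_idx >= len(parts):
            continue
        # treat phased (|) and unphased (/) calls the same way
        alleles = parts[gt_idx].replace("|", "/").split("/")
        if len(alleles) != 2 or "." in alleles:
            continue  # skip missing or non-diploid calls
        if alleles[0] != alleles[1]:
            n_ab += 1
        elif alleles[0] == "0":
            n_aa += 1
        else:
            n_bb += 1
    return n_aa, n_ab, n_bb


def keep(line, min_count=5):
    """True if every genotype class has at least min_count samples."""
    return all(n >= min_count for n in genotype_counts(line))


def filter_vcf(lines, min_count=5):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or keep(line, min_count):
            yield line
```

To use it, pass any iterable of lines, for example an open file handle for input.vcf, to filter_vcf and write the yielded lines to a new file. Raising or lowering min_count is the easy way to loosen or tighten the rule.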