I have a VCF file generated from ~100 whole-genome samples. My aim is to identify variants segregating under a recessive model where I have three affected samples and one healthy carrier and the remaining 96 samples are clear controls i.e homRef. I have used the below javascript file to show variants that are homozygous for the alternate allele in the 3 affected samples and het in the carrier and wildtype in the 96 samples:
function accept(ctx) {
for(var i=0;i< ctx.getNSamples();++i)
{
var g = ctx.getGenotype(i);
var sample = g.getSampleName();
if ( (sample=="BA1" && sample=="BA2" && sample=="BA3") && (g.isHomVar())) continue;
if ( (sample=="BC4") && (g.isHet())) continue;
if ( g.isHomRef() || g.isNoCall()) continue;
return false;
}
return true;
}
accept(variant);
BA1,BA2 and BA3 are affected samples and BC4 is carrier.
Command used:
picard FilterVcf I=WGS_multianno.vcf MIN_DP=0 MIN_GQ=0 JS=JavaScript O=BC.vcf
The output has the below tags in the FILTER column:
FILTER
AllGtsFiltered;JavaScript
JavaScript
PASS
However, inspecting the variants tagged under “AllGtsFiltered;JavaScript”, “JavaScript” or “PASS” does not seem to segregate with the inheritance model coded in the Javscript. Could you please help if the above script is going wrong somewhere?
Also, the above picard command took 2.2hrs to generate the output. Is there a better way to speed-up the processing?
The output has the below two tags in the FILTER field:
And variants with each of these tags do not seem to segregate as per the javascript file where the 3 samples should be homVar.
Could you let me know if the code is going wrong or if i am interpreting it wrong.
I think there is a misunderstanding: if the FILTER==RecessiveJS then the variant has FAILED to match the filter.
Yes, but still the variants under FILTER=AllGtsFiltered;RecessiveJS doesn't match the filter where i have asked the variant to be homozygous for all the 3 samples but the result has mixed genotypes i.e 0/0, ./. and 1/1
I sill don't understand:
so the 3 samples fails your filter, so there is something in the FILTER column. What about the variant having PASS in the filter column ?
I will try to explain in a different way. The genotype for the samples BA1,BA2 and BA3 should be homozygous for the alternate allele i.e. BA1=BA2=BA3=1/1 and BA4 should be heterozygous i.e 0/1 and the genotypes for the remaining samples could be either 0/0 or ./. .
In the above commands
cut -f30,31,34
30,31 and 34 represents BA1,BA2 and BA3 but the genotypes are not 1/1 for the 3 samples with either FILTER=RecessiveJS or FILTER=AllGtsFiltered;RecessiveJS . And variants with FILTER=PASS do not exist in the outputfile.can you please show me
grep "#CHROM" -m 1 BC.vcf |cut -f30,31,34
Does the above output help?
It works on a different test case where i have simulated the input to match the criteria specified in javascript. In the above case there seems to be no variant meeting the criteria in the javascript and so there is no variant in the output with flag "PASS".