Is there a tool or method for filtering a VCF file in the following manner?
I want to filter a multisample VCF file to keep only the lines/sites that have the same genotype in all samples.
Thanks,
Using vcffilterjs http://lindenb.github.io/jvarkit/VCFFilterJS.html
java -jar dist/vcffilterjs.jar -e 'function accept(vc) {for(var i=1;i<vc.getNSamples();i++) if(!vc.getGenotype(0).sameGenotype(vc.getGenotype(i))) return false; return true;}accept(variant); ' input.vcf
Hi @Pierre: Thanks for the answer. Btw, you suggested this tool yesterday in another question. I couldn't find the proposed jar file, only the java file. I tried to find it but couldn't and had to let it go. Can you please provide a link to the jar file?
Thanks much,
http://lindenb.github.io/jvarkit/VCFFilterJS.html#download-and-compile
Download and Compile
$ git clone "https://github.com/lindenb/jvarkit.git"
$ cd jvarkit
$ make vcffilterjs
The *.jar libraries are not included in the main jar file, so you shouldn’t move them (https://github.com/lindenb/jvarkit/issues/15#issuecomment-140099011 ). The required libraries will be downloaded and installed in the dist directory.
@Pierre. Thanks, it worked. I tried to read your script to see how I could modify it, but couldn't. If I want to select the lines that have the same GT, but relax the filter a little when the GT is not called (./.), how would I do it? Say, I can accept 1/1, 1/1, 1/1, 1/1 for 4 samples when 1 other sample is ./. (no call).
Thanks,
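One way to think about the relaxed rule is sketched below in plain JavaScript, working on GT strings rather than on the objects that vcffilterjs passes to the script. The helper names (normalizeGenotype, sameGenotypeAllowingNoCalls) and the maxNoCalls parameter are hypothetical and are not part of jvarkit. The sketch assumes a no-call means every allele is ".", and that phase should be ignored (0/1 equals 1|0). Inside vcffilterjs the same idea would presumably skip genotypes for which htsjdk's Genotype.isNoCall() is true before comparing, but that variant is untested here.

```javascript
// Hypothetical helper, not part of jvarkit.
// Turns a GT string such as "0/1" or "1|0" into a phase-insensitive key.
// Returns null for a no-call, i.e. when every allele is "." (assumption).
function normalizeGenotype(gt) {
  var alleles = gt.split(/[\/|]/);
  var allMissing = alleles.every(function (a) { return a === "."; });
  if (allMissing) return null;
  return alleles.slice().sort().join("/");
}

// Hypothetical helper, not part of jvarkit.
// True when all called genotypes agree and at most maxNoCalls samples
// are no-calls. A site where no sample is called is rejected.
function sameGenotypeAllowingNoCalls(gts, maxNoCalls) {
  var reference = null; // key of the first called genotype seen
  var noCalls = 0;
  for (var i = 0; i < gts.length; i++) {
    var key = normalizeGenotype(gts[i]);
    if (key === null) {
      noCalls++;
      if (noCalls > maxNoCalls) return false;
    } else if (reference === null) {
      reference = key;
    } else if (key !== reference) {
      return false;
    }
  }
  return reference !== null;
}

// The example from the question: four 1/1 samples and one no-call.
console.log(sameGenotypeAllowingNoCalls(["1/1", "1/1", "1/1", "1/1", "./."], 1)); // true
// One sample disagrees, so the site is rejected.
console.log(sameGenotypeAllowingNoCalls(["1/1", "0/1", "1/1", "1/1", "./."], 1)); // false
```

Setting maxNoCalls to 0 gives back the strict behaviour (every sample called and identical), so the same function covers both cases.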
Hello kirannbishwa01!
It appears that your post has been cross-posted to another site: http://gatkforums.broadinstitute.org/gatk/discussion/comment/39095
This is typically not recommended as it runs the risk of annoying people in both communities.
Thanks for the update. I understand the problem. I reposted the question in this forum because I had not been getting a solution (sometimes no answer, sometimes not the right one), and it is hard to wait a couple of days for an answer when you need to move on with your data analyses.
I would have deleted the question on the GATK forum, but unlike on Biostars that's not possible there: once it's posted, it stays unless an admin deletes it. I hope you understand that the cross-posting was not intended to annoy anyone.