Filtering a multi-sample VCF file
2.3 years ago
Peerzada • 0

Hello all,

I have a multi-sample VCF file of 1000 individuals for one gene. Some of the samples do not have any variants, i.e. their genotype is "0/0". I want to filter them out and keep only the samples that carry variants, with genotypes like "0/1" and "1/1". How can I do this? Kindly suggest a tool and a command for this.

vcf • 1.7k views

You could use bcftools view for that; check the documentation here: http://samtools.github.io/bcftools/bcftools.html#view


Thank you. Which command precisely should I use from that?

2.3 years ago

Using http://lindenb.github.io/jvarkit/VcfFilterJdk.html:

 java -jar dist/vcffilterjdk.jar -e 'final Set<String> samples=new HashSet<>(Arrays.asList("S2","S3")); return samples.stream().map(S->variant.getGenotype(S)).allMatch(G->G.isHet() || G.isHomVar()) && variant.getGenotypes().stream().filter(G->!samples.contains(G.getSampleName())).noneMatch(G->G.isHet() || G.isHomVar());'  in.vcf.gz

I installed the tool and ran the command above, but it shows errors like "java.lang.RuntimeException: Cannot compile" and does not produce any output.


Show me the command and the complete stack trace.

git clone "https://github.com/lindenb/jvarkit.git"
cd jvarkit
./gradlew vcffilterjdk

java -jar dist/vcffilterjdk.jar -e 'final Set<String> samples=new HashSet<>(Arrays.asList("S2","S3")); return samples.stream().map(S->variant.getGenotype(S)).allMatch(G->G.isHet() || G.isHomVar()) && variant.getGenotypes().stream().filter(G->!samples.contains(G.getSampleName())).noneMatch(G->G.isHet() || G.isHomVar());'  aqp1.1000g.vcf.gz

"S2" and "S3" are placeholders; they should be replaced with your own sample names.
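
To see which sample names a file actually contains, one can read them from the `#CHROM` header line, where they follow the nine fixed columns. The sketch below is only an illustration in plain Python; the function name `sample_names` is hypothetical and it assumes a standard tab-delimited VCF header.

```python
def sample_names(lines):
    """Return the sample names found on the #CHROM header line of a VCF.

    `lines` is any iterable of text lines (an open file handle works).
    """
    for line in lines:
        if line.startswith("#CHROM"):
            # Columns 1-9 are CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT;
            # everything after them is one column per sample.
            return line.rstrip("\n").split("\t")[9:]
    return []
```

For a compressed file such as `aqp1.1000g.vcf.gz`, the lines could come from `gzip.open(path, "rt")`.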


I have a VCF file with 1000 samples, and I need only those samples that contain variants (genotypes like 0/1, 1/0 and 1/1); samples with 0/0 should be removed. So what should I write in place of S2 and S3?


Ah, so your question was not clear to me. You cannot remove a sample from a VCF file for one variant and keep it for the other variants in the file.
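
Since a sample is either in the file or out of it for every record, one workable reading of the request is: keep a sample if it carries a non-reference genotype at any variant in the file. The following is a minimal sketch of that idea in plain Python, not the method proposed in this thread; the names `is_non_ref` and `carriers` are hypothetical, and it assumes GT is the first FORMAT subfield, as the VCF specification requires when GT is present.

```python
def is_non_ref(sample_field):
    """True if the genotype has at least one allele that is neither 0 nor missing."""
    gt = sample_field.split(":", 1)[0]          # GT comes first in the sample column
    alleles = gt.replace("|", "/").split("/")   # treat phased and unphased alike
    return any(a not in ("0", ".") for a in alleles)


def carriers(lines):
    """Return, in header order, the samples that are non-reference at any record."""
    samples, keep = [], set()
    for line in lines:
        if line.startswith("##"):
            continue
        fields = line.rstrip("\n").split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]
            continue
        if len(fields) < 10:                    # skip blank or sample-less lines
            continue
        for name, field in zip(samples, fields[9:]):
            if is_non_ref(field):
                keep.add(name)
    return [s for s in samples if s in keep]
```

The returned names could be written one per line to a text file and passed to `bcftools view` through its `-S` (samples file) option to subset the VCF; please check the bcftools documentation linked above for the exact behaviour.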


Thank you. Now I have annotated the multi-sample VCF file using snpEff. How can I write the exonic variants to one file and the intronic and other variants (such as UTRs) to a separate file? Which command can I use to separate out the exonic variants?


This is unrelated to your original question.
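
For readers who land here with the same follow-up, here is a rough sketch of one way to split snpEff-annotated records by effect, using plain Python. It assumes the annotations are in the `ANN` INFO key, with the effect term in the second `|`-separated subfield and multiple terms joined by `&`. The `EXONIC_TERMS` set is an illustrative assumption, not a complete or authoritative definition of "exonic", and the function names are hypothetical.

```python
# Assumed, illustrative subset of effect terms treated as "exonic" here.
EXONIC_TERMS = {
    "missense_variant", "synonymous_variant",
    "stop_gained", "frameshift_variant",
}


def annotation_terms(info):
    """Collect the effect terms from the ANN key of a VCF INFO string."""
    terms = set()
    for entry in info.split(";"):
        if entry.startswith("ANN="):
            for ann in entry[4:].split(","):    # one entry per annotation
                parts = ann.split("|")
                if len(parts) > 1:
                    terms.update(parts[1].split("&"))
    return terms


def split_by_effect(lines):
    """Return (exonic, other) lists of lines; header lines are copied to both."""
    exonic, other = [], []
    for line in lines:
        if line.startswith("#"):
            exonic.append(line)
            other.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:                     # skip blank or malformed lines
            continue
        hit = annotation_terms(fields[7]) & EXONIC_TERMS
        (exonic if hit else other).append(line)
    return exonic, other
```

A record with any annotation in the assumed set goes to the first list, so a variant that is exonic in one transcript and intronic in another is counted as exonic; a different rule may suit other analyses.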
