How to filter a VCF for only variants present in a particular sample?
2.3 years ago

I would like to extract a subset of a VCF containing only those variants that are present in one particular sample. E.g. I have a three-sample VCF with ten variants. Only one variant has been detected in the first sample, so I would like to subset the VCF so that only this variant is kept. Pseudocode would be something like:

bcftools query -f'%CHROM:%POS %INFO [%GT]\n' -i'GT="alt in sample1"' file.vcf

The desired output is a VCF that still has all three samples but contains only the variants present in one of the samples, in this case sample1.

2.3 years ago

An alternative solution is to use the bcftools +split plugin, e.g.:

bcftools +split file.vcf -Ob -o testDir -i'GT="alt"'
2.3 years ago

Using vcffilterjdk (http://lindenb.github.io/jvarkit/VcfFilterJdk.html):

java -jar dist/vcffilterjdk.jar -e 'String sample="S1"; final Genotype g=variant.getGenotype(sample);if(g.isHomRef() || g.isNoCall()) return false; return variant.getGenotypes().stream().filter(G->G.getSampleName().equals(sample)==false).allMatch(G->G.isNoCall() || G.isHomRef());'  in.vcf
2.3 years ago

The solution is to add the -s option, e.g.:

bcftools query -f'%CHROM:%POS \n' -s sample1 -i'GT="alt"' file.vcf

However, a more desirable solution would avoid using the -s option.


How so? That is a valid option for the command and how it is meant to be used.

 -s, --samples LIST                List of samples to include
 -S, --samples-file FILE           File of samples to include

Because many plugins do not support the -s option; more specifically, it is not present in +split-vep.
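
For what it's worth, one way to sidestep -s entirely is to filter on the sample's own genotype column. The following is only a minimal sketch, not a bcftools feature: keep_alt_in_sample is a hypothetical helper name, and it assumes an uncompressed, tab-delimited VCF in which GT is the first FORMAT subfield. It keeps the header and all sample columns, and prints only records where the named sample carries at least one non-reference, non-missing allele (regardless of what the other samples carry).

```shell
#!/bin/sh
# Hypothetical helper (not part of bcftools): print the VCF header plus every
# record in which the named sample has at least one ALT allele in its GT.
# Assumptions: uncompressed VCF, tab-delimited, GT is the first FORMAT subfield.
keep_alt_in_sample() {
    # $1 = sample name as it appears on the #CHROM line, $2 = path to the VCF
    awk -F'\t' -v sample="$1" '
        /^##/ { print; next }            # meta-information lines pass through
        /^#CHROM/ {                      # find the sample column by name
            for (i = 10; i <= NF; i++) if ($i == sample) col = i
            print; next
        }
        col {                            # only runs if the sample was found
            split($col, fmt, ":")        # fmt[1] is the GT string, e.g. 0/1
            n = split(fmt[1], alleles, "[/|]")
            for (j = 1; j <= n; j++)
                if (alleles[j] != "0" && alleles[j] != ".") { print; next }
        }
    ' "$2"
}

# Usage, with the file and sample names from the question:
#   keep_alt_in_sample sample1 file.vcf > sample1_variants.vcf
```

The bcftools filtering expressions also document per-sample subscripts (e.g. FMT/DP[0] for the first sample), so an expression along the lines of -i'GT[0]="alt"' with bcftools view may do the same job without -s; I have not verified that here, so check it against your bcftools version.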
