Extract samples with alternative genotype from a multi-sample VCF
1
0
Entering edit mode
4.0 years ago
tacrolimus ▴ 140

Dear all,

I have a multisample vcf and want to extract the IDs of those with rare impactful SNPs with at least one alternative allele at the a loci within the region of interest.

I am running the following on a normalised multi-sample vcf which has been annotated with VEP:

bcftools view -r REGION OF INTEREST | filter_vep -filter 'IMPACT is MODERATE or IMPACT is HIGH' |  filter_vep -filter MAX_AF <0.01 or not MAX_AF'' | bcftools view -i 'GT="alt"' | bcftools query -f "'[%CHROM:%PO %SAMPLE %GT\n]' >output.txt

I work in an airlocked environment with patient data so can't share the VCF I am using but it is a standard multi-sample vcf created by merging using bcftools.

I get an output with a variant and ID per line irrespective of the genotype e.g.:

chr1:123:456 sampleID123 ./.
chr1:123:456 sampleID234 ./.
chr1:123:456 sampleID156 0/1.
chr1:123:456 sampleID123 ./.

etc

I would like to just get the sample IDs with at least one alt genotype.

Many thanks

vcf bcftools vep • 2.0k views
ADD COMMENT
0
Entering edit mode

chr1:123:456 sampleID156 0/1.

I don't understand your problem. In your example above , there is a "sample IDs with at least one alt genotype."

ADD REPLY
0
Entering edit mode

yes the 0/1 variant is the one I want but the output also includes those with two ./. variants. I only want those with at least 0/1.

ADD REPLY
0
Entering edit mode
4.0 years ago
abascalfederico ★ 1.2k

If you just want those lines with a "1" in the third column, you can add an awk command:

bcftools view -r REGION OF INTEREST | filter_vep -filter 'IMPACT is MODERATE or IMPACT is HIGH' |  filter_vep -filter MAX_AF <0.01 or not MAX_AF'' | bcftools view -i 'GT="alt"' | bcftools query -f "'[%CHROM:%PO %SAMPLE %GT\n]' | awk '$3 ~ /1/ { print }' >output.txt
ADD COMMENT
0
Entering edit mode

True, I could also just grep '0/1' '1/1' etc but I was wondering what the error with the code above was? Thanks for taking the time to answer.

ADD REPLY

Login before adding your answer.

Traffic: 1882 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6