Filter VCF to include only sites that have at least one homozygous alt call
5.1 years ago
grey ▴ 40

Hi all,

I am trying to extract a subset of sites from a multi-sample VCF file. I only want those sites with at least one homozygous alternative allele call. Some sites are poly-allelic, so the sites I want would have at least one 1/1 or 2/2 call.

I've looked through bcftools and vcftools but haven't found an elegant solution. For example, from the documentation it seems that bcftools view --genotype can filter for sites with at least one homozygous call, but these could be ref or alt, and I specifically want alt calls.

Any ideas would be greatly appreciated!


You mentioned that "Some sites are poly-allelic." I’d like to ask, how are such sites represented in a VCF file? Does "poly-allelic" correspond to "biallelic sites"?

5.1 years ago
 bcftools view -i 'N_PASS(GT="AA")>=1'  in.vcf

Not tested: I don't know whether it will also accept 1/2 genotypes.


bcftools view -i 'GT="AA"' in.vcf should be enough, as the manual distinguishes between AA (alt-alt hom) and Aa/aA (alt-alt het), so a 1/2 genotype should not match.
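If you want to sanity-check the output independently of bcftools, here is a minimal Python sketch of the criterion from the question (a site is kept if at least one sample is homozygous for the same ALT allele, e.g. 1/1 or 2/2). The function names are my own, and it assumes an uncompressed, tab-separated VCF with GT as the first FORMAT key; it is not how bcftools implements the filter.

```python
def is_hom_alt(gt: str) -> bool:
    """True if all alleles of a non-haploid call are the same ALT index (1/1, 2|2, ...)."""
    alleles = gt.replace("|", "/").split("/")
    if len(alleles) < 2:
        return False  # assumption: haploid calls are not counted as homozygous
    first = alleles[0]
    # "." (missing) and "0" (REF) are rejected; 1/2 fails the all-equal check
    return first.isdigit() and first != "0" and all(a == first for a in alleles)


def has_hom_alt(record: str) -> bool:
    """True if at least one sample in a VCF data line carries a hom-alt genotype."""
    fields = record.rstrip("\n").split("\t")
    if len(fields) < 10 or fields[8].split(":")[0] != "GT":
        return False  # no sample columns, or GT not present as first FORMAT key
    return any(is_hom_alt(sample.split(":")[0]) for sample in fields[9:])


def filter_vcf(lines):
    """Yield header lines unchanged and only those records with a hom-alt call."""
    for line in lines:
        if line.startswith("#") or has_hom_alt(line):
            yield line
```

Usage would be something like reading in.vcf line by line and writing whatever filter_vcf yields; comparing the record count against the bcftools output is a quick way to confirm that 2/2 sites are kept and 1/2 sites are dropped.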


Works perfectly, thank you.
