Counting the specific number of a type of variant per sample in a multi-sample VCF
1
1
Entering edit mode
23 months ago
tacrolimus ▴ 140

Dear Biostars,

I have 80,000 people in a multi-sample VCF which is chunked into roughly 1500 files that span the human genome annotated with VEP including the LOFTEE tool.

I would like to count per chunk the number of High Confidence (annotated as "HC") loss of function variants (LoF) there are per person (either her or hom) per chunk and then add these up so I can see how many HC LoF variants there are per person.

I was initially just using grep to pull out "HC", running bcftools view with the "alt" command and then converting these chunked files into plink files to use it's count function but I was wondering if there was a clean way of doing this directly from a VCF without intermediate steps?

Many thanks for your help

(Similar question to: Count Of Variants)

bcftools VCF • 941 views
ADD COMMENT
3
Entering edit mode
23 months ago

how about :

bcftools concat -a --file-list your.vcf.txt -O u | bcftools view -O u -i '<EXPRESSION>' | bcftools stats -
ADD COMMENT
0
Entering edit mode

would the Expression by something like "HC", how would I define that? An example line with a HC call is below:

1       1178848 rs115005664     G       A       1000.0   PASS   AC=1;AF=0.5;AN=2;CSQ=A|ENSG00000184163|ENST00000468365|Transcript|non_coding_exon_variant&nc_transcript_variant|445|||||||-1|||,A|ENSG00000184163|ENST00000462849|Transcript|upstream_gene_variant|||||||5|-1|||,A|ENSG00000184163|ENST00000486627|Transcript|downstream_gene_variant|||||||513|-1|||,A|ENSG00000184163|ENST00000330388|Transcript|stop_gained|648|616|206|Q/*|Cag/Tag|||-1|||HC,A|ENSG00000184163|ENST00000478606|Transcript|upstream_gene_variant|||||||310|-1|||`

With the corresponding part of the VCF being:

##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence type as predicted by VEP. Format: Allele|Gene|Feature|Feature_type|Consequence|cDNA_position|CDS_position|Protein_position|Amino_acids|Codons|Existing_variation|DISTANCE|STRAND|LoF_flags|LoF_filter|LoF">
ADD REPLY
1
Entering edit mode

something like: -i 'INFO/CSQ ~ "|HC"'

ADD REPLY

Login before adding your answer.

Traffic: 1540 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6