Getting gnomAD allele frequencies for checking against populations
Hi all, I have scoured the internet for an answer and come up short.
I have a multi-ancestry genomic dataset (TOPMed r3 imputed, hg38) that I have split by super population (EUR, AFR, SAS, EAS, AMR).
I want to check my data's allele frequency in each population against reference populations. How do I make a reference set of allele frequencies annotated for each super pop using gnomAD?
Thank you.
gnomAD
ancestry
allele-frequency
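The --af_gnomadg flag in VEP will annotate variants with the gnomAD genome AFs for each gnomAD population. Note that gnomAD reports Europeans as NFE and FIN rather than as a single EUR group, so the labels will not match your super populations one to one. So something like: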
docker run \
ensemblorg/ensembl-vep \
vep \
-i input.vcf \
-o output.vcf \
--cache \
--force_overwrite \
--fork 2 \
--format vcf \
--buffer_size 5000 \
--terms SO \
--symbol \
--ccds \
--variant_class \
--hgvs \
--hgvsg \
--force \
--dont_skip \
--no_stats \
--pick_allele \
--vcf \
--show_ref_allele \
--af_gnomadg
The INFO field of each record will then carry a long pipe-delimited CSQ string that includes those gnomAD AFs:
chr1 120994 . AAT CAT . . CSQ=CAT|intron_variant&non_coding_transcript_variant|MODIFIER||ENSG00000238009|Transcript|ENST00000477740|lncRNA||1/3|ENST00000477740.5:n.164-62delinsG|||||||rs866349668|AAT||-1||substitution|||||chr1:g.120994delinsC|0.000292|0.001101|0|0|0|0|0|0.003876|0|0.0006803|0|||| GT 0/1
where
0.000292|0.001101|0|0|0|0|0|0.003876|0|0.0006803|0
refers to the AF in these populations:
gnomADg_AF|gnomADg_AFR_AF|gnomADg_AMI_AF|gnomADg_AMR_AF|gnomADg_ASJ_AF|gnomADg_EAS_AF|gnomADg_FIN_AF|gnomADg_MID_AF|gnomADg_NFE_AF|gnomADg_OTH_AF|gnomADg_SAS_AF
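To turn that CSQ string into a per-population table, something along these lines might work. This is only a sketch: the function names csq_field_names and gnomad_afs are made up for illustration, and it assumes the VEP output header contains the usual ##INFO=<ID=CSQ,...> line whose description ends in "Format: field1|field2|...", which is what defines the order of the values in each record.

```python
import re


def csq_field_names(header_line):
    """Return the CSQ sub-field names listed in the ##INFO=<ID=CSQ,...> header line."""
    match = re.search(r'Format: ([^">]+)', header_line)
    if match is None:
        raise ValueError("no 'Format: ...' description found in CSQ header line")
    return match.group(1).strip().split("|")


def gnomad_afs(info, field_names, prefix="gnomADg_"):
    """Extract gnomAD genome AFs from the INFO column of one VCF record.

    Returns one dict per CSQ entry (entries are comma-separated).
    Empty values are returned as None.
    """
    csq = None
    for item in info.split(";"):
        if item.startswith("CSQ="):
            csq = item[len("CSQ="):]
            break
    if csq is None:
        return []  # record has no VEP annotation

    results = []
    for entry in csq.split(","):
        values = entry.split("|")
        # Pair each value with its field name as declared in the header.
        record = dict(zip(field_names, values))
        results.append({
            name: (float(value) if value else None)
            for name, value in record.items()
            if name.startswith(prefix)
        })
    return results
```

You would call csq_field_names once on the CSQ header line of output.vcf, then gnomad_afs on the INFO column of each record, and keep whichever gnomADg_*_AF keys correspond to the populations you want to compare against.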
What end result are you expecting?
What should your resultant file look like?
Can you not use the dbNSFP database?