Using VEP: Variant Effect Predictor, how do I get the 1000G global Allele Frequency of my alternate alleles specifically?
8.1 years ago
anp375 ▴ 190

Let's say I have this variant:

1 123141 . A C

My alternate allele at this site is C. The reported global MAF in 1000G is 0.8564. However, this is the allele frequency for A at that site, not C.

Here are the options for running VEP: http://useast.ensembl.org/info/docs/tools/vep/script/vep_options.html

These are the VEP options I am using:

--check_existing --check_alleles --gmaf --maf_1kg --maf_esp

======================================================================

With these options, the columns produced by --maf_1kg seem to give the per-population allele frequency of my alt allele. But I need the global allele frequency for my alt allele. I'm still trying to figure out how this works, and while writing this question I keep finding new problems that don't make sense.

For rs200645137 +C insertion, VEP gives me this:

       AFR_MAF|AMR_MAF |EAS_MAF|EUR_MAF |SAS_MAF
              |C:0.4024|C:0.353|C:0.4365|C:0.2843

AFR_MAF is missing.

From dbsnp: https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=200645137 I get:

       AFR_MAF |AMR_MAF|EAS_MAF |EUR_MAF |SAS_MAF
       C:0.4024|C:0.353|C:0.4365|C:0.2843|C:0.456

So, really, SAS_MAF was missing and all the numbers were misaligned. VEP also gives me a global minor allele frequency of:

GMAF: -:0.4837

which is also in dbsnp.

Annovar gives me 1 minus that value, or 0.516291, because I have the C allele. But even that does not make sense to me, because when I calculate the allele frequency myself, I get this:

1008*0.4365+1006*0.2843+1322*0.4024+694*0.353+978*0.456==1948.921
1008+1006+1322+694+978==5008
1948.921/5008==0.3891615

So the 1000Genomes AF for my alt allele is 0.3891615.
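
The hand calculation above is a chromosome-count-weighted average of the per-population frequencies. A minimal Python sketch of the same arithmetic, using only the numbers quoted in this post, might look like the following. The function and variable names are made up for illustration, and it assumes every sample contributes two chromosomes:

```python
# Chromosome counts per 1000G phase 3 superpopulation,
# as used in the hand calculation above.
chromosomes = {"EAS": 1008, "EUR": 1006, "AFR": 1322, "AMR": 694, "SAS": 978}

# Per-population frequencies of the C allele, as listed from dbSNP above.
freq_c = {"EAS": 0.4365, "EUR": 0.2843, "AFR": 0.4024, "AMR": 0.353, "SAS": 0.456}


def pooled_frequency(freqs, counts):
    """Average per-population frequencies, weighted by chromosome count."""
    total = sum(counts[pop] for pop in freqs)
    weighted = sum(freqs[pop] * counts[pop] for pop in freqs)
    return weighted / total


pooled = pooled_frequency(freq_c, chromosomes)
print(round(pooled, 4))  # 0.3892
```

The result only equals the true global frequency if the per-population denominators really are those chromosome counts.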

What is going on!? I'm definitely using 1000G phase 3 data.

Edit: Okay, the global allele frequencies don't match because it's on the X-chromosome and not everyone has two of those. But the VEP columns still don't match up. The SAS_MAF is being shoved into the next column, and AFR_MAF is either empty, or is filled with a number I can't find anywhere else. The ESP columns for AA_MAF and EA_MAF don't seem to have the correct numbers either.
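
To illustrate why the chromosome counts differ on X: outside the pseudoautosomal regions, males carry one X and females carry two, so the denominator is smaller than twice the sample size. A rough sketch, with a purely hypothetical sex split (not real 1000G counts) and made-up function names:

```python
def x_allele_number(n_female, n_male):
    """Number of X chromosomes sampled (non-pseudoautosomal region)."""
    return 2 * n_female + n_male


def diploid_allele_number(n_female, n_male):
    """Number of chromosomes if every sample is counted twice."""
    return 2 * (n_female + n_male)


# Hypothetical sex split for illustration only; not real 1000G counts.
n_female, n_male = 250, 250
print(x_allele_number(n_female, n_male))        # 750
print(diploid_allele_number(n_female, n_male))  # 1000
```

Weighting each population by twice its sample size therefore overcounts chromosomes on X, which is enough to shift a pooled frequency away from the reported global value.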

VEP Variant Effect Predictor Annovar • 3.6k views
7.9 years ago
EnsemblWill ▴ 570

So, really, SAS_MAF was missing and all the numbers were misaligned

This issue was noticed by a few users who had the version 85 GRCh37 VEP cache. As far as I know, the issue is not present in the version 86 or 87 cache files, so if you are able to update, please use a newer version. If you are tied to v85 of the software for any reason, you could also use the version 84 cache; the data should be identical.

VEP also gives me a global minor allele frequency of: GMAF: -:0.4837

This is a running issue with the way VEP reports allele frequencies: VEP reports only the non-reference allele(s) and their frequency, which is not necessarily the frequency of the allele given as input.

It has been resolved in the new rewrite of VEP available in beta from https://github.com/Ensembl/ensembl-vep
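
Until you can switch versions, one workaround is to post-process the `ALLELE:FREQ` strings yourself. The sketch below is not part of VEP; the helper names are hypothetical, and it assumes multiple entries (if any) are comma-separated. When only one other allele is reported, it takes the complement, which is only valid at a site with exactly two alleles:

```python
def parse_allele_freqs(field):
    """Parse an 'ALLELE:FREQ' string (comma-separated, by assumption) into a dict."""
    freqs = {}
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        allele, _, value = item.rpartition(":")
        freqs[allele] = float(value)
    return freqs


def frequency_of(allele, field, biallelic=True):
    """Return the frequency of `allele`, or None if it cannot be determined."""
    freqs = parse_allele_freqs(field)
    if allele in freqs:
        return freqs[allele]
    if biallelic and len(freqs) == 1:
        # Only the other allele was reported; at a two-allele site
        # the two frequencies sum to 1.
        (other,) = freqs.values()
        return 1.0 - other
    return None


print(frequency_of("C", "C:0.4024"))            # 0.4024
print(round(frequency_of("C", "-:0.4837"), 4))  # 0.5163
```

For the GMAF string in the question this gives 0.5163, close to the 0.516291 that Annovar reported for the C allele.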
