Hi all,
I am using Ensembl VEP (command line) to annotate a VCF. I am specifically looking for gnomAD allele frequencies, which is fairly straightforward to do, technically speaking. However, the data looks off in some cases.
For example, when I pass in:
10 69408929 COSM3751912 A T . . GENE=TACR2;STRAND=-;CDS=c.734T>A;AA=p.M245K
I get the VEP output:
COSM3751912 10:69408929 T ENSG00000075073 ENST00000373306 Transcript missense_variant 1278 734 245 M/K aTg/aAg rs55953810 MODERATE - -1 - TACR2 HGNC HGNC:11527 YES ENSP00000362403 P21452 - UPI0000061EE3 - 3/5 - Gene3D:1.20.1070.10,Pfam_domain:PF00001,PROSITE_profiles:PS50262,hmmpanther:PTHR43919,hmmpanther:PTHR43919:SF4,SMART_domains:SM01381,Superfamily_domains:SSF81321,Conserved_Domains:cd16004 1 1 1 1 1 0.9999 0.9999 0.9999 1 0.9999 1 0.9999 1 1 1 gnomAD_ASJ,gnomAD_FIN,gnomAD_OTH,gnomAD_SAS,AFR,AMR,EAS,EUR,SAS - - - - - - -
Wading through that, you can see that the gnomAD_AF allele frequency is 0.9999. This seems odd to me. How could this variant be a COSMIC (cancer database) missense variant with MODERATE consequence and have a 99.99% frequency? I'm lost on how to interpret this.
Maybe I am misunderstanding how gnomAD scores allele frequencies, hence posting this question here.
Does anyone know how gnomAD allele frequencies (as output by Ensembl's VEP) should be interpreted?
For some reason, it's giving you the 1-AF and not the AF. The variant at the chromosome level is an A>T with AF < 0.01, but at the transcript level becomes a T>A. Maybe that's the reason the AF is being inverted?
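To make the 1-AF idea concrete: at a biallelic site the two allele frequencies sum to 1, so a reported 0.9999 would correspond to 0.0001 for the other allele. Here is a minimal Python sketch of that arithmetic. The function name is made up for illustration, it is not part of VEP, and it only shows the relationship; it does not explain why VEP reports the value it does.

```python
def complement_frequency(reported_af: float) -> float:
    """Return 1 - reported_af: the frequency of the other allele,
    assuming a biallelic site (hypothetical helper, not a VEP function)."""
    if not 0.0 <= reported_af <= 1.0:
        raise ValueError(f"allele frequency out of range: {reported_af}")
    return 1.0 - reported_af


reported = 0.9999  # the gnomAD_AF value shown in the VEP output above
print(f"{complement_frequency(reported):.4f}")  # prints 0.0001
```

If the complement lands below 0.01, that would be consistent with the "AF < 0.01" figure mentioned above for the A>T change.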
Interesting. Could that be because the strand is -1?
Answered my own question by looking at more data. Nope, there are plenty of +1 strand entries with the same frequency issue as the first example.
There is no variant in gnomAD at this position: http://gnomad.broadinstitute.org/variant/10-69408929-A-T
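For checking other records by hand, the variant part of that gnomAD link is just chromosome, position, reference allele and alternate allele joined with hyphens. A small Python sketch for building it from a VCF data line, assuming whitespace-separated fields and a single ALT allele (the function name is hypothetical):

```python
def gnomad_variant_id(vcf_line: str) -> str:
    """Build a CHROM-POS-REF-ALT identifier from one VCF data line.

    Assumes whitespace-separated fields in standard VCF order
    (CHROM POS ID REF ALT ...) and exactly one ALT allele.
    """
    chrom, pos, _variant_name, ref, alt = vcf_line.split()[:5]
    return f"{chrom}-{pos}-{ref}-{alt}"


line = "10 69408929 COSM3751912 A T . . GENE=TACR2;STRAND=-;CDS=c.734T>A;AA=p.M245K"
print(gnomad_variant_id(line))  # prints 10-69408929-A-T
```

Appending the result to http://gnomad.broadinstitute.org/variant/ gives the same address as the link above.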
I did notice that as well. So why would VEP be returning anything then? Seems odd (again). There are other entries with a - in place of the AF (I'm assuming that means no data).
Can it be used to annotate a VCF file?
Why is this added as an answer? Please delete it and add a comment/reply to an appropriate post.
EDIT: After nearly 2 years of no activity from Ahmed, I've moved this post to a comment on the top level post.