Inconsistency between VCF and HGVS
4 days ago
Senanu ▴ 30

Hi, and thank you all in advance for helping me understand what I see as inconsistencies in VEP.

The variant in question has been published as (VCF format in GRCh38 coordinates):

1 1196415 var1 T C

If I submit this to VEP using curl (or the REST interface; it matters not) I get back the variant as expected:

curl 'https://rest.ensembl.org/vep/homo_sapiens/region?vcf_string=1&hgvs=1' -H 'Content-type:application/json' -H 'Accept:application/json' -X POST -d '{ "variants" : ["1 1196415 myvar T C . . ." ] }'

Among other things, the response confirms that this designation is legitimate and correct:

"vcf_string":"1-1196415-T-C"
"start":1196415
"end":1196415
"hgvsc":"ENST00000486379.1:c.105T>C"
"hgvsp":"ENSP00000464269.1:p.Ser35="

However, if I then take the output HGVSc and submit it back to VEP, I get a different (incorrect) variant that is shifted by 1bp:

curl 'https://rest.ensembl.org/vep/homo_sapiens/hgvs?vcf_string=1&hgvs=1' -H 'Content-type:application/json' -H 'Accept:application/json' -X POST -d '{ "hgvs_notations" : ["ENST00000486379.1:c.105T>C" ] }' 

"vcf_string":"1-1196416-T-C"
"start":1196416
"end":1196416
"hgvsc":"ENST00000486379.1:c.106T>C"
"hgvsp":"ENSP00000464269.1:p.Leu36="
  • The reference sequence for both of these locations is 'T'
  • The first (1196415) is at amino acid position 35, whereas the second is indeed at 36.
  • I also checked against the codon table that both are synonymous variants, and they are.
  • These are simple substitutions, so there isn't an issue of shifting.
  • I tried using the --ambiguous_hgvs flag, but it didn't pull in anything new, and there's nothing ambiguous about this notation given that it names the transcript.
  • Using the ensembl format ("1:1196415-1196415:1/C") gave me the correct (first) variant
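
Since I have many variants to look up, a round-trip check like the one above could be scripted. Below is a minimal Python sketch, not a tested pipeline: the helper names (`post_vep`, `parse_vcf_string`, `round_trip_shift`) are made up for illustration, the endpoints and query parameters are copied from the curl calls above, and it assumes `vcf_string` comes back as a single "chrom-pos-ref-alt" string as in the output shown. It only handles simple substitutions.

```python
import json
import urllib.request

# Base URL taken from the curl commands above.
VEP_BASE = "https://rest.ensembl.org/vep/homo_sapiens"


def post_vep(endpoint, payload):
    """POST a JSON payload to a VEP REST endpoint ("region" or "hgvs").

    Hypothetical helper; it mirrors the curl calls and is not invoked here.
    """
    req = urllib.request.Request(
        f"{VEP_BASE}/{endpoint}?vcf_string=1&hgvs=1",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def parse_vcf_string(vcf_string):
    """Split "chrom-pos-ref-alt" into (chrom, pos, ref, alt).

    Assumes a simple substitution, so no field contains a "-" itself.
    """
    chrom, pos, ref, alt = vcf_string.split("-")
    return chrom, int(pos), ref, alt


def round_trip_shift(input_vcf_string, returned_vcf_string):
    """Return returned_pos - input_pos, or None if chrom/ref/alt differ."""
    c1, p1, r1, a1 = parse_vcf_string(input_vcf_string)
    c2, p2, r2, a2 = parse_vcf_string(returned_vcf_string)
    if (c1, r1, a1) != (c2, r2, a2):
        return None
    return p2 - p1
```

With the two `vcf_string` values shown above, `round_trip_shift("1-1196415-T-C", "1-1196416-T-C")` reports a shift of 1, so any non-zero result would flag a variant whose HGVS lookup does not return to the submitted position.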

I need to look up a fair number of variants, some by HGVS and some by VCF, but it seems like at least sometimes, I'll be getting back the wrong variant.

Why aren't these the same? What am I not understanding here?

Thank you very much for your time!

VEP • 294 views

I think you've discovered a bug.


Thank you for adding support to what I suspected. I have filed a report here. While writing that bug report, I noticed that the cDNA and CDS coordinates for this transcript differ from one another by 1bp. I wonder if the HGVS query is looking up the position in cDNA coordinates but then returning the coordinates and nomenclature as if it had queried the CDS.
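
To make the suspicion concrete, here is a toy Python illustration of how mixing two numbering systems that differ by a constant offset yields a fixed shift. This is purely a sketch: the function names are hypothetical, and both the offset of 1 and its sign are assumptions chosen to reproduce the observed c.105 to c.106 change. Neither is taken from the transcript's annotation or from VEP's code.

```python
# Assumed constant difference between cDNA and CDS numbering (illustrative only).
OFFSET = 1


def cds_to_cdna(cds_pos, offset=OFFSET):
    """Convert a CDS position to the assumed cDNA numbering."""
    return cds_pos + offset


def cdna_to_cds(cdna_pos, offset=OFFSET):
    """Convert an assumed cDNA position back to CDS numbering."""
    return cdna_pos - offset


def consistent_round_trip(cds_pos):
    """Both conversions applied: the position comes back unchanged."""
    return cdna_to_cds(cds_to_cdna(cds_pos))


def mixed_round_trip(cds_pos):
    """One conversion skipped: the cDNA number is reported as if it were CDS."""
    return cds_to_cdna(cds_pos)
```

In this toy model `consistent_round_trip(105)` gives back 105, while `mixed_round_trip(105)` gives 106, which is the same size of shift as in the VEP output. It says nothing about where in VEP such a mix-up would occur.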
