Hi, and thank you all in advance for helping me understand what I see as inconsistencies in VEP.
The variant in question has been published as (VCF format in GRCh38 coordinates):
1 1196415 var1 T C
If I submit this to VEP using curl (or the REST interface; it matters not) I get back the variant as expected:
curl 'https://rest.ensembl.org/vep/homo_sapiens/region?vcf_string=1&hgvs=1' -H 'Content-type:application/json' -H 'Accept:application/json' -X POST -d '{ "variants" : ["1 1196415 myvar T C . . ." ] }'
Among other things, it shows me that this designation is legitimate and correct:
"vcf_string":"1-1196415-T-C"
"start":1196415
"end":1196415
"hgvsc":"ENST00000486379.1:c.105T>C"
"hgvsp":"ENSP00000464269.1:p.Ser35="
However, if I then take the output HGVSc and submit it to VEP, I get a different (incorrect) variant that is 1 bp removed:
curl 'https://rest.ensembl.org/vep/homo_sapiens/hgvs?vcf_string=1&hgvs=1' -H 'Content-type:application/json' -H 'Accept:application/json' -X POST -d '{ "hgvs_notations" : ["ENST00000486379.1:c.105T>C" ] }'
"vcf_string":"1-1196416-T-C"
"start":1196416
"end":1196416
"hgvsc":"ENST00000486379.1:c.106T>C"
"hgvsp":"ENSP00000464269.1:p.Leu36="
- The reference sequence for both of these locations is 'T'.
- The first (1196415) is at amino acid position 35, whereas the second is indeed at 36.
- I also checked against the codon table that both are synonymous variants; they are.
- These are simple substitutions, so shifting is not an issue.
- I tried using the --ambiguous_hgvs flag, but it didn't pull in anything new, and there's nothing ambiguous about this notation, since the transcript is specified.
- Using the Ensembl format ("1:1196415-1196415:1/C") gave me the correct (first) variant.
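The two requests above can be scripted so the round trip is checked in one step. This is a minimal sketch using only the Python standard library. It reuses the endpoint URLs and request bodies from the curl commands. The helper names (`build_request`, `vep_post`, `hgvsc_for`, `round_trip`) are hypothetical, and it assumes the response is a JSON list whose records carry a top-level `vcf_string` and an `hgvsc` field under `transcript_consequences`.

```python
import json
import urllib.request

SERVER = "https://rest.ensembl.org"
HEADERS = {"Content-Type": "application/json", "Accept": "application/json"}


def build_request(endpoint, key, values):
    """Build (url, body) for a VEP POST, mirroring the curl calls above."""
    url = f"{SERVER}/vep/homo_sapiens/{endpoint}?vcf_string=1&hgvs=1"
    body = json.dumps({key: list(values)}).encode("utf-8")
    return url, body


def vep_post(endpoint, key, values):
    """Send the POST and return the decoded JSON (needs network access)."""
    url, body = build_request(endpoint, key, values)
    req = urllib.request.Request(url, data=body, headers=HEADERS, method="POST")
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def hgvsc_for(record, transcript_prefix):
    """Return the first hgvsc string that starts with transcript_prefix.

    Assumption: hgvsc is reported per entry of 'transcript_consequences'.
    """
    for tc in record.get("transcript_consequences", []):
        hgvsc = tc.get("hgvsc")
        if hgvsc and hgvsc.startswith(transcript_prefix):
            return hgvsc
    return None


def round_trip(vcf_line, transcript_prefix):
    """VCF line -> HGVSc -> back to vcf_string; returns both vcf_strings."""
    first = vep_post("region", "variants", [vcf_line])[0]
    hgvsc = hgvsc_for(first, transcript_prefix)
    if hgvsc is None:
        raise ValueError(f"no hgvsc found for {transcript_prefix}")
    second = vep_post("hgvs", "hgvs_notations", [hgvsc])[0]
    return first["vcf_string"], second["vcf_string"]
```

Calling `round_trip("1 1196415 myvar T C . . .", "ENST00000486379")` should return two identical strings if the service round-trips cleanly. For the variant above, the outputs shown in this post were `1-1196415-T-C` and `1-1196416-T-C`.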
I need to look up a fair number of variants, some by HGVS and some by VCF, but it seems that, at least sometimes, I will get back the wrong variant.
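For a larger batch, one way to screen for this kind of discrepancy is to compare each published VCF record with the `vcf_string` that VEP returns for the matching HGVS lookup. The sketch below does only the comparison and assumes the returned strings have already been collected. The function names are hypothetical, and the `chrom-pos-ref-alt` layout is taken from the output shown above. It is intended for simple substitutions, since indels may be represented differently.

```python
def vcf_line_to_key(vcf_line):
    """Turn 'chrom pos id ref alt ...' into the 'chrom-pos-ref-alt' form
    that appears in the vcf_string field of the output above."""
    chrom, pos, _variant_id, ref, alt = vcf_line.split()[:5]
    return f"{chrom}-{pos}-{ref}-{alt}"


def flag_mismatches(pairs):
    """pairs: iterable of (input VCF line, vcf_string returned by VEP).

    Returns a list of (expected, returned) tuples for records that
    did not come back as the same variant.
    """
    mismatches = []
    for vcf_line, returned in pairs:
        expected = vcf_line_to_key(vcf_line)
        if expected != returned:
            mismatches.append((expected, returned))
    return mismatches


# The variant from this post, paired with what the HGVS lookup returned.
observed = [("1 1196415 var1 T C", "1-1196416-T-C")]
for expected, returned in flag_mismatches(observed):
    print(f"MISMATCH: expected {expected}, got {returned}")
```

Any record printed here would need manual review before its annotation is trusted.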
Why aren't these the same? What am I not understanding here?
Thank you very much for your time!
I think you've discovered a bug.
Thank you for adding support to what I suspected. I have filed a report here. In that bug report, I noted that the cDNA and CDS differ from one another by 1 bp. I wonder whether the HGVS query is querying the cDNA but then returning the coordinates and nomenclature as if it had queried the CDS.
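To make that suspicion concrete, here is a toy sketch of how mixing the two coordinate systems could produce a one-base shift. It is not VEP code. The 1 bp offset (CDS starting at cDNA position 2) is an assumption chosen to match the difference mentioned above, and the function names are hypothetical. The real offset and its direction for this transcript have not been verified here.

```python
# ASSUMPTION: the CDS starts at cDNA position 2, i.e. a 1 bp offset.
CDS_START_IN_CDNA = 2


def cds_to_cdna(cds_pos, cds_start_in_cdna):
    """Convert a 1-based CDS position to a 1-based cDNA position."""
    return cds_pos + cds_start_in_cdna - 1


def cdna_to_cds(cdna_pos, cds_start_in_cdna):
    """Convert a 1-based cDNA position back to a 1-based CDS position."""
    return cdna_pos - cds_start_in_cdna + 1


def correct_round_trip(cds_pos, cds_start_in_cdna):
    """CDS -> cDNA -> CDS with both conversions applied."""
    cdna_pos = cds_to_cdna(cds_pos, cds_start_in_cdna)
    return cdna_to_cds(cdna_pos, cds_start_in_cdna)


def suspected_round_trip(cds_pos, cds_start_in_cdna):
    """Hypothesised bug: the cDNA position is reported as if it were
    already a CDS position, so the back-conversion is skipped."""
    return cds_to_cdna(cds_pos, cds_start_in_cdna)


print(correct_round_trip(105, CDS_START_IN_CDNA))    # 105
print(suspected_round_trip(105, CDS_START_IN_CDNA))  # 106
```

Under these assumptions c.105 would come back as c.106, which matches the nomenclature shift seen above. It does not by itself explain why the genomic coordinate also moved.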