Hello everyone,
My name is Krishna, and I am currently working as a project student in clinical genetics, specifically on rare-variant disease studies using patient exome and whole-genome samples.
I am trying to convert variants given in different representations (gene symbols, RefSeq IDs, HGVSc, HGVSp, dbSNP IDs, and ENST IDs, together with the amino acid or nucleotide change) into their corresponding genomic coordinates. This conversion will help me interpret the variants more easily. An example is provided below for clarity:
I have a variation that can be represented in multiple ways, as shown below:
rs1639927683
OR4F5:p.M1T
OR4F5:p.Met1Thr
OR4F5:c.2T>C
NM_001005484:p.M1T
NM_001005484:p.Met1Thr
NM_001005484:c.2T>C
NP_001005484:p.M1T
NP_001005484:p.Met1Thr
ENST00000641515:c.2T>C
For all of these formats, the corresponding genomic coordinates are 1-65566-T-C (chromosome-position-reference allele-alternate allele).
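As a possible first step before any conversion, the mixed inputs could be sorted by notation type so that each group can be handed to a suitable converter. The following is only a minimal sketch, assuming the inputs look like the ten examples above; the regular expressions, the function names (`classify`, `reference_type`) and the returned dictionary keys are all hypothetical and not part of any existing tool, and the sketch does not perform the coordinate lookup itself.

```python
import re

# Hypothetical patterns covering only the notation styles listed above.
PATTERNS = [
    ("dbsnp", re.compile(r"^rs\d+$")),
    ("hgvs_c", re.compile(r"^(?P<ref>[^:]+):c\.(?P<change>.+)$")),
    ("hgvs_p", re.compile(r"^(?P<ref>[^:]+):p\.(?P<change>.+)$")),
]


def reference_type(ref):
    """Guess what kind of identifier precedes the colon (assumed prefixes)."""
    if ref.startswith("ENST"):
        return "ensembl_transcript"
    if ref.startswith("NM_"):
        return "refseq_transcript"
    if ref.startswith("NP_"):
        return "refseq_protein"
    # Anything else is assumed to be a gene symbol such as OR4F5.
    return "gene_symbol"


def classify(variant):
    """Return the notation type, reference and change for one input string."""
    variant = variant.strip()
    for kind, pattern in PATTERNS:
        match = pattern.match(variant)
        if match is None:
            continue
        if kind == "dbsnp":
            # rsIDs carry no reference sequence or change of their own.
            return {"kind": kind, "reference": None,
                    "reference_type": None, "change": variant}
        ref = match.group("ref")
        return {"kind": kind, "reference": ref,
                "reference_type": reference_type(ref),
                "change": match.group("change")}
    # Inputs that match none of the patterns are flagged, not guessed.
    return {"kind": "unknown", "reference": None,
            "reference_type": None, "change": variant}


if __name__ == "__main__":
    examples = [
        "rs1639927683", "OR4F5:p.M1T", "OR4F5:p.Met1Thr", "OR4F5:c.2T>C",
        "NM_001005484:p.M1T", "NM_001005484:p.Met1Thr", "NM_001005484:c.2T>C",
        "NP_001005484:p.M1T", "NP_001005484:p.Met1Thr", "ENST00000641515:c.2T>C",
    ]
    for example in examples:
        print(example, classify(example))
```

Grouping the inputs this way would at least show how many variants of each type there are and which ones fall outside the expected formats.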
Q. How can I convert any format of variation to its corresponding genomic coordinates, as shown in the example above?
I have used Ensembl's Variant Recoder, but it needs online access to run, which is slow, especially since I have many variants in different formats.
Can anyone suggest a more efficient approach, preferably one that runs offline with all the required databases stored locally?
Thank you very much in advance.
Just to confirm: are you referring to the web tool or to the command-line version you can install locally? https://www.ensembl.org/info/docs/tools/vep/recoder/index.html#vr_dl_install
The command-line option does require online access.
Thank you for the reply, GenoMax. Yes, I am using the command-line tool, and it runs online.