9.4 years ago
Rm
Looking for a tool to map between variants (say, from a VCF) and "named variants" (for example, UGT1A1*28 or DPYD*3), i.e. to link a named variant to its variant coordinates or rsID.
Biobase PGMD supports this, but I believe it requires a subscription.
Similar question: Which Databases Carry Named Gene Variants Like Apoe4
@Ying: My question is more specific: link an rsID (rs8175347, or 2:234668881-234668882) to the "named variant" UGT1A1*28.
Your "named variant" is not a standard notation that I am aware of.
VEP (mentioned above) will produce HGVS notations from either rsIDs or coordinates+alleles, but you have a little work to do.
Here's an example command using the VEP command line tool:
I've used --pick to choose one amongst the many alternate transcripts for UGT1A6. Note that the HGVS notation is given relative to a specific transcript so the variant can be resolved unambiguously; giving notations against a gene name alone can make it difficult to compare notations from different systems or transcript sets.
The [GeneName]*[Number] format that the OP refers to as a "named variant" is the way that different alleles of a gene are designated. An allele can be a combination of specific variants/rsIDs. For example, the entry for DPYD*3 can be found here, and if you click on "download translation table" or on the entry, you will find that allele *3 is defined by having 1897delG. After some googling, this application might be able to get you the info you are looking for: it has DPYD*3's haplotype ID and rs#s in a plain-text database, although it does not have info on UGT1A1*28. There also seems to be a limited number of genes with "named variants", so maybe it's possible to just download them all somehow.
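Since an allele name resolves to one or more defining variants, the mapping can be sketched as a simple lookup table. The snippet below is only an illustration: the table name, field names, and `lookup_named_variant` are hypothetical, and the two entries hold nothing beyond what is mentioned in this thread (the assembly for the coordinates is not stated here). A real table would have to be built from a downloaded translation table.

```python
from typing import Optional

# Hypothetical, hand-filled translation table: star allele -> defining variants.
# Only the two alleles discussed in this thread are included; the coordinates
# are copied as quoted above, without a known genome assembly.
STAR_ALLELE_TABLE = {
    "UGT1A1*28": {
        "rsids": ["rs8175347"],
        "regions": ["2:234668881-234668882"],
        "changes": [],
    },
    "DPYD*3": {
        "rsids": [],
        "regions": [],
        "changes": ["1897delG"],
    },
}


def lookup_named_variant(name: str) -> Optional[dict]:
    """Return the defining variants for a star allele, or None if unknown."""
    return STAR_ALLELE_TABLE.get(name.strip())


if __name__ == "__main__":
    for name in ("UGT1A1*28", "DPYD*3", "CYP2D6*4"):
        print(name, "->", lookup_named_variant(name))
```

Alleles missing from the table return `None` rather than raising, so a list of names can be screened in one pass and the gaps filled in manually.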
Thanks @Ying W: Yeah, I am manually linking the named variants from PharmGKB. Let me explore the tool you suggested.
Thanks @EnsemblWill: My inputs are a list of named variants like UGT1A1*28, which I then need to map to rsIDs or coordinates...
Ah OK, sorry, I misunderstood the direction of conversion.
VEP can also take HGVS as input, but this named variant does not look like an HGVS name to me. Anyone know what this convention is and how it might be parsed?
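On the parsing side, a name in the [GeneName]*[Number] form can at least be split into its gene and allele parts with a regular expression. This is a rough sketch, not an established parser: `parse_star_allele` and the pattern are assumptions, and the optional trailing letters are there only because some allele names carry a letter suffix (e.g. TPMT*3A). The result still has to be looked up in a translation table to get rsIDs or coordinates.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: gene symbol, a literal '*', then a number with an
# optional letter suffix. More elaborate allele names are not handled.
STAR_ALLELE_RE = re.compile(r"^(?P<gene>[A-Za-z0-9]+)\*(?P<allele>\d+[A-Za-z]*)$")


def parse_star_allele(name: str) -> Optional[Tuple[str, str]]:
    """Split 'GENE*ALLELE' into (gene, allele); return None if it does not match."""
    match = STAR_ALLELE_RE.match(name.strip())
    if match is None:
        return None
    return match.group("gene"), match.group("allele")


if __name__ == "__main__":
    for name in ("UGT1A1*28", "DPYD*3", "rs8175347"):
        print(name, "->", parse_star_allele(name))
```

Inputs that are not in this form, such as a plain rsID, come back as `None`, which makes it easy to separate named variants from other identifiers in a mixed list.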