different ID for 'gene name' vs 'gene synonym' in ENSEMBL.
3.2 years ago
wiscoyogi ▴ 40

I'm finding that a gene name corresponds to different Ensembl IDs depending on whether it appears as a 'gene name' or a 'gene synonym' in ENSEMBL. My understanding was that Ensembl IDs are unique. There's one location on the genome for each of these genes, so I'm very unclear how there can be multiple Ensembl IDs depending on whether a name is a 'gene name' or a 'gene synonym' in ENSEMBL.

Context for the problem: I got a list of genes from a collaborator and I wanted to look at their expression in a new dataset.

Since differences in gene names (e.g using gene synonyms) between datasets can lead to loss of information if they're not accounted for, to make sure that I’m not missing anything from my dataset, I got a list of all the gene synonyms for each gene by looking at the ‘Gene Synonyms’ under ENSEMBL.

For example, one gene on the list was ‘SLC6A2’. I found synonyms under the same ENSEMBL ID, called ‘NAT1’, ‘SLC6A5’, and ‘NET1’. All of these had the Ensembl ID 'ENSG00000103546'.

But if I look up NAT1, NET1, and SLC6A5 on Ensembl separately, I also find separate Gene stable IDs for each of these genes ('ENSG00000171428', 'ENSG00000173848', 'ENSG00000165970', respectively).

So I’m wondering how NAT1/ SLC6A5/NET1 are synonyms of SLC6A2 if they each have their own Gene stable IDs in addition to sharing one with SLC6A2.

Do I just use SLC6A2 in my analysis or also account for all the other gene synonyms?

I know there's been discussion on this platform, but is "gene name" synonymous with 'HUGO' nomenclature? Why am I getting different Ensembl gene IDs for a given gene symbol? I think my issue is different because I see only one Ensembl ID per 'gene name', but if the same name appears as a 'gene synonym' I get a different Ensembl ID.


The gene synonyms should point to the same genome location for a particular genome build. I will illustrate this using Entrez Direct (EDirect):

$ esearch -db gene -query "SLC6A2 [gene] AND human [ORGN]" | efetch

1. SLC6A2
Official Symbol: SLC6A2 and Name: solute carrier family 6 member 2 [Homo sapiens (human)]
Other Aliases: NAT1, NET, NET1, SLC6A5
Other Designations: sodium-dependent noradrenaline transporter; neurotransmitter transporter; norepinephrine transporter; solute carrier family 6 (neurotransmitter transporter), member 2; solute carrier family 6 (neurotransmitter transporter, noradrenalin), member 2; solute carrier family 6 member 5
Chromosome: 16; Location: 16q12.2
Annotation: Chromosome 16 NC_000016.10 (55655928..55706192)
MIM: 163970
ID: 6530

$ esearch -db gene -query "NET [gene] AND human [ORGN]" | efetch

1. SLC6A2
Official Symbol: SLC6A2 and Name: solute carrier family 6 member 2 [Homo sapiens (human)]
Other Aliases: NAT1, NET, NET1, SLC6A5
Other Designations: sodium-dependent noradrenaline transporter; neurotransmitter transporter; norepinephrine transporter; solute carrier family 6 (neurotransmitter transporter), member 2; solute carrier family 6 (neurotransmitter transporter, noradrenalin), member 2; solute carrier family 6 member 5
Chromosome: 16; Location: 16q12.2
Annotation: Chromosome 16 NC_000016.10 (55655928..55706192)
MIM: 163970
ID: 6530

What kind of analysis are you planning to do? Depending on that you should probably include the location information to account for synonyms.

Synonyms are previously approved HGNC names: https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/HGNC:11048
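If you want to use records like the ones printed above in a script, here is a minimal Python sketch that pulls the official symbol, the aliases, and the NCBI gene ID out of one record. It assumes the text layout is exactly what is shown above (abridged here, without the "Other Designations" line); the function name `parse_gene_record` and the dictionary keys are made up for illustration and are not part of EDirect.

```python
import re

# One record copied from the output above (abridged).
RECORD = """\
1. SLC6A2
Official Symbol: SLC6A2 and Name: solute carrier family 6 member 2 [Homo sapiens (human)]
Other Aliases: NAT1, NET, NET1, SLC6A5
Chromosome: 16; Location: 16q12.2
Annotation: Chromosome 16 NC_000016.10 (55655928..55706192)
MIM: 163970
ID: 6530
"""


def parse_gene_record(text):
    """Extract symbol, aliases and gene ID from one text record.

    Hypothetical helper: it relies on the field labels seen in the
    output above and returns None / [] for any field that is missing.
    """
    symbol = re.search(r"^Official Symbol:\s*(\S+)", text, re.MULTILINE)
    aliases = re.search(r"^Other Aliases:\s*(.+)$", text, re.MULTILINE)
    gene_id = re.search(r"^ID:\s*(\d+)", text, re.MULTILINE)
    return {
        "symbol": symbol.group(1) if symbol else None,
        "aliases": [a.strip() for a in aliases.group(1).split(",")] if aliases else [],
        "ncbi_gene_id": gene_id.group(1) if gene_id else None,
    }


record = parse_gene_record(RECORD)
print(record["symbol"], record["aliases"], record["ncbi_gene_id"])
```

Keying downstream tables on the ID field rather than on the symbol or any alias avoids the ambiguity discussed in this thread, since an alias of one gene can be the official symbol of another.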


I'm with you that the gene synonyms should point to the same genome location for a given genome build. That's why I'm puzzled: when I downloaded the ENSEMBL gene annotations, I got different Ensembl IDs depending on whether NET1, SLC6A5, and NAT1 appear in the 'Gene name' or the 'Gene Synonym' column.

3.2 years ago
Emily 24k

Synonyms are just other names that the gene is or has been known by, and that it has at some point been referred to as in the literature. They are there to allow you to search based on something in the literature and find the right gene. By definition, they are inconsistent, as they reflect the changes in our understanding of what's what. Sometimes genes get given the name that was previously assigned to another gene, so the other gene will maintain that name as a synonym.

For example, we have a human gene which we call ABC1 because we think it's an orthologue of mouse Abc1. Further analysis identifies another gene in both species, and it becomes apparent that the new human gene is actually the direct orthologue of the old mouse gene Abc1, and the new mouse gene is the direct orthologue of the old human gene ABC1.

So we need to rename one of the old genes. The new mouse gene becomes Abc2, the old human gene is renamed from ABC1 to ABC2, the old mouse gene stays as Abc1 and the new human gene is named ABC1. This means that human now has a gene called ABC1, and another gene ABC2, which has the synonym ABC1.

This is just one example of how this happens.
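In practice this means a symbol should be matched against the 'Gene name' column first, and against the 'Gene Synonym' column only if no gene carries it as its current name. Below is a rough Python sketch of that lookup order. The toy table is hand-built from the IDs quoted in the question, plus the alias NET taken from the NCBI output earlier in the thread; the synonym lists of the other three genes are left empty for brevity, and `build_indexes` / `resolve_symbol` are hypothetical helper names, not an Ensembl API.

```python
from collections import defaultdict

# Toy annotation table: (Ensembl gene stable ID, gene name, gene synonyms).
# IDs are the ones quoted in the question; "NET" comes from the NCBI alias
# list shown earlier and is assumed here not to be any gene's current name.
ANNOTATION = [
    ("ENSG00000103546", "SLC6A2", ["NAT1", "SLC6A5", "NET1", "NET"]),
    ("ENSG00000171428", "NAT1", []),
    ("ENSG00000173848", "NET1", []),
    ("ENSG00000165970", "SLC6A5", []),
]


def build_indexes(rows):
    """Build two separate lookups: current names and synonyms."""
    by_name = {}
    by_synonym = defaultdict(set)
    for gene_id, name, synonyms in rows:
        by_name[name] = gene_id
        for synonym in synonyms:
            by_synonym[synonym].add(gene_id)
    return by_name, by_synonym


def resolve_symbol(symbol, by_name, by_synonym):
    """Return (gene_ids, how), preferring the current gene name.

    A synonym is only consulted when no gene has `symbol` as its name,
    and it may map to more than one gene, hence the list.
    """
    if symbol in by_name:
        return [by_name[symbol]], "name"
    if symbol in by_synonym:
        return sorted(by_synonym[symbol]), "synonym"
    return [], "unresolved"


by_name, by_synonym = build_indexes(ANNOTATION)
for symbol in ["SLC6A2", "NAT1", "NET"]:
    print(symbol, resolve_symbol(symbol, by_name, by_synonym))
```

With this ordering, NAT1 resolves to its own gene (ENSG00000171428) rather than to SLC6A2, while a name that exists only as a synonym, such as NET in this toy table, still finds SLC6A2. Whether a synonym hit should be trusted at all depends on where the collaborator's gene list came from, so it is worth flagging those hits for manual review.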
