Finding Gene Name Synonyms
5.9 years ago
Ark ▴ 90

Hello again,

Does anyone have a good way of finding the gene synonyms for a list of genes? My genes are currently listed as gene symbols, but any tool or method that fetches synonyms from any type of ID would be a step in the right direction. Conversion in the reverse direction (from any known synonym to the gene symbol) would also be helpful.

As an example: if I came across IL8 (a synonym), I would convert it to CXCL8 (the gene symbol), or vice versa. I know that in Ensembl (and many other tools) you can search for genes using many synonymous names, so there must be a table somewhere that makes it easy to convert between the two, but I have not been able to locate it.

Currently, I am using a massive table from UniProt and using 'awk' to search for the genes I need. This is both cumbersome and inefficient, which is why I wanted to reach out to you all!

Thanks!

5.9 years ago

Using MySQL against the public UCSC server (the kgAlias table in hg38):

~$ mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306 -D hg38 -e 'select distinct K2.alias from kgAlias as K1, kgAlias as K2 where K1.alias="IL8" and K1.kgID=K2.kgID;'
+-------------------+
| alias             |
+-------------------+
| B2R4L8            |
| CXCL8             |
| ENST00000307407.7 |
| IL8               |
| IL8_HUMAN         |
| NM_000584         |
| P10145            |
| Q6FGF6            |
| Q6LAE6            |
| Q96RG6            |
| Q9C077            |
| Q9UCE1            |
| Q9UCR8            |
| Q9UCR9            |
| Q9UCS0            |
| uc003hhe.1        |
| uc003hhe.2        |
| uc003hhe.3        |
| C9J4T6            |
| C9J4T6_HUMAN      |
| ENST00000401931.1 |
| LP896310          |
| uc062xgp.1        |
+-------------------+
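
For a whole list of genes, the same self-join can be run once with an IN (...) list instead of once per name. The helper below is only a sketch: the function name build_alias_query and the output column name input_name are made up for illustration, and it assumes the names contain no quotes or backslashes, since nothing is escaped.

```shell
# Build one SQL statement that looks up kgAlias synonyms for several names.
# Assumption: names are plain identifiers (no quotes or backslashes);
# they are inserted into the query without any escaping.
build_alias_query() {
  in_list=""
  for name in "$@"; do
    # Append "name" to the comma-separated list of quoted values.
    in_list="${in_list:+$in_list,}\"$name\""
  done
  printf 'select distinct K1.alias as input_name, K2.alias as alias from kgAlias as K1, kgAlias as K2 where K1.alias in (%s) and K1.kgID=K2.kgID order by input_name, alias;\n' "$in_list"
}

# Possible usage with the same connection settings as above:
# mysql --user=genome --host=genome-mysql.soe.ucsc.edu -A -P 3306 -D hg38 -e "$(build_alias_query IL8 CXCL8)"
```

The extra input_name column keeps track of which of your names each synonym belongs to, which matters once more than one name is queried.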

Thank you very much Pierre! This was exactly what I was looking for!

5.9 years ago
cmdcolin ★ 4.0k

You might also consider mygene.info

curl "http://mygene.info/v3/query?q=CXCL8&species=human&fields=all"

If you have jq, you can slice and dice the results. The commands below take the top hit (check that the top hit really is the gene you meant in your own case) and pick fields from it.

Pick some aliases

curl "http://mygene.info/v3/query?q=CXCL8&species=human&fields=all"|jq ".hits[0].alias"
[
  "GCP-1",
  "GCP1",
  "IL8",
  "LECT",
  "LUCT",
  "LYNAP",
  "MDNCF",
  "MONAP",
  "NAF",
  "NAP-1",
  "NAP1"
]
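
For the reverse direction (synonym to gene symbol), and as one way to validate the hit rather than trusting hits[0], a jq filter can keep only the hits whose symbol or alias exactly equals the name you searched for. This is a rough sketch: the function name matching_hits is hypothetical, and it assumes the response has the .hits[].symbol and .hits[].alias fields seen above, where alias may be a list, a single string, or missing.

```shell
# Read a mygene.info query response on stdin and print one line per hit whose
# symbol or alias exactly equals the given name: symbol<TAB>alias1,alias2,...
# Assumption: .alias may be a list, a single string, or absent, so it is
# normalised to a list before matching.
matching_hits() {
  jq -r --arg q "$1" '
    .hits[]
    | ([.alias] | flatten | map(select(. != null))) as $aliases
    | select(.symbol == $q or ($aliases | index($q)) != null)
    | [.symbol, ($aliases | join(","))] | @tsv
  '
}

# Possible usage:
# curl -s "http://mygene.info/v3/query?q=IL8&species=human&fields=symbol,alias" | matching_hits IL8
```

If several lines come back, the name is ambiguous and needs a manual decision; if none come back, the hits only matched the name in some other field.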

Pick some ensembl info

curl "http://mygene.info/v3/query?q=CXCL8&species=human&fields=all"|jq ".hits[0].ensembl"
{
  "gene": "ENSG00000169429",
  "protein": [
    "ENSP00000306512",
    "ENSP00000385908"
  ],
  "transcript": [
    "ENST00000307407",
    "ENST00000401931",
    "ENST00000483500"
  ],
  "translation": [
    {
      "protein": "ENSP00000306512",
      "rna": "ENST00000307407"
    },
    {
      "protein": "ENSP00000385908",
      "rna": "ENST00000401931"
    }
  ],
  "type_of_gene": "protein_coding"
}

Pick some crazy accessions

curl "http://mygene.info/v3/query?q=CXCL8&species=human&fields=all"|jq ".hits[0].accession"
{
  "genomic": [
    "AC112518.1",
    "AF385628.2",
    "CH471057.1",
    "D14283.1",
    "HC462818.1",
    "HI519526.1",
    "M28130.1",
    "NC_000004.12",
    "NG_029889.1"
  ],
  "protein": [
    "AAA35611.1",
    "AAA36323.1",
    "AAA59158.1",
    "AAH13615.1",
    "AAK60276.1",
    "AAP35730.1",
    "BAA03245.1",
    "BAG34815.1",
    "CAA68742.1",
    "CAA77745.1",
    "CAG46948.1",
    "CBK51144.1",
    "CBX54344.1",
    "EAX05687.1",
    "EAX05688.1",
    "EAX05689.1",
    "NP_000575.1",
    "NP_001341769.1",
    "P10145.1"
  ],
  "rna": [
    "AJ227913.1",
    "AK131067.1",
    "AK311874.1",
    "BC013615.1",
    "BG497712.1",
    "BT007067.1",
    "CR542151.1",
    "DA671239.1",
    "M17017.1",
    "M26383.1",
    "NM_000584.4",
    "NM_001354840.1",
    "Y00787.1",
    "Z11686.1"
  ],
  "translation": [
    {
      "protein": "BAG34815.1",
      "rna": "AK311874.1"
    },
    {
      "protein": "AAA35611.1",
      "rna": "M17017.1"
    },
    {
      "protein": "NP_000575.1",
      "rna": "NM_000584.4"
    },
    {
      "protein": "CAA68742.1",
      "rna": "Y00787.1"
    },
    {
      "protein": "CAA77745.1",
      "rna": "Z11686.1"
    },
    {
      "protein": "NP_001341769.1",
      "rna": "NM_001354840.1"
    },
    {
      "protein": "AAA36323.1",
      "rna": "M26383.1"
    },
    {
      "protein": "AAH13615.1",
      "rna": "BC013615.1"
    },
    {
      "protein": "CAG46948.1",
      "rna": "CR542151.1"
    },
    {
      "protein": "AAP35730.1",
      "rna": "BT007067.1"
    }
  ]
}

Pick some uniprot info

curl "http://mygene.info/v3/query?q=symbol:CXCL8&species=human&fields=all"|jq ".hits[0].uniprot"

{
  "Swiss-Prot": "P10145",
  "TrEMBL": [
    "C9J4T6",
    "A0A024RDA5"
  ]
}