Ensembl ID mapping using Biomart
4.4 years ago

I mapped human Ensembl IDs using the biomaRt package in R. It gave me a one-to-one mapping from Ensembl gene ID to external gene ID. Is this Ensembl gene ID different from the original Ensembl IDs that I initially had in my files?

The original Ensembl IDs end in a decimal, like "ENSG00000280109.1", whereas the corresponding Ensembl gene ID ("ENSG00000280109"), which I passed as values in my biomaRt code, has no decimal. Please explain what is the difference between these two?

And secondly, UCSC Xena browser gives us a list of Ensembl IDs and their corresponding mapped gene symbols, but there are many cases in which the same gene corresponds to different Ensembl IDs.

Which one should I trust: the mapping via Biomart or the mapping file provided by Xena itself?

On what criteria is Biomart assigning a single gene to each Ensembl gene ID/original Ensembl IDs, whereas UCSC Xena provides different Ensembl IDs corresponding to the same gene?

RNA-Seq R software error gene • 1.5k views

Here you can find the answer to your first question about the decimal number in the Ensembl IDs.

4.4 years ago

Please explain what is the difference between these two?

The number after the dot is a version number and can usually be safely ignored. In fact, some software does not recognize Ensembl IDs that include the version suffix.
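If you need to drop the version suffix before passing IDs to a tool that expects unversioned IDs, a minimal sketch in Python could look like the following. The helper name `strip_version` is hypothetical, and the sketch assumes the version is always a trailing `.` followed by digits, as in the ID quoted in the question.

```python
import re

# Matches a trailing ".<digits>" version suffix on an Ensembl stable ID.
# Assumption: the version always appears as the last component of the ID.
_VERSION_SUFFIX = re.compile(r"\.\d+$")


def strip_version(ensembl_id: str) -> str:
    """Return the ID without its version suffix; unversioned IDs pass through unchanged."""
    return _VERSION_SUFFIX.sub("", ensembl_id)


ids = ["ENSG00000280109.1", "ENSG00000280109"]
print([strip_version(i) for i in ids])
# prints ['ENSG00000280109', 'ENSG00000280109']
```

Both forms collapse to the same unversioned ID, which is why the versioned IDs in the original files and the unversioned IDs passed to Biomart refer to the same gene records.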

there are many cases in which the same gene corresponds to different Ensembl IDs.

Most likely because of different versions. Check which version of Ensembl each of your tools is using. Avoid mixing versions; pick one and stick to it.
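To see how many gene symbols in a mapping file point to more than one Ensembl ID, something like the sketch below may help. It assumes the mapping has already been read into `(ensembl_id, symbol)` pairs; the function name `symbols_with_multiple_ids` and the IDs and symbols in the example are made up for illustration and are not taken from Xena or Ensembl.

```python
from collections import defaultdict


def symbols_with_multiple_ids(pairs):
    """Group (ensembl_id, symbol) pairs by symbol and keep symbols with more than one distinct ID.

    Version suffixes (".1", ".3", ...) are dropped first, so two versions
    of the same ID are not counted as different IDs.
    """
    by_symbol = defaultdict(set)
    for ensembl_id, symbol in pairs:
        by_symbol[symbol].add(ensembl_id.split(".")[0])
    return {s: sorted(ids) for s, ids in by_symbol.items() if len(ids) > 1}


# Hypothetical rows, for illustration only.
pairs = [
    ("ENSG00000000001.1", "GENE_A"),
    ("ENSG00000000002.3", "GENE_A"),
    ("ENSG00000000003.2", "GENE_B"),
]
print(symbols_with_multiple_ids(pairs))
# prints {'GENE_A': ['ENSG00000000001', 'ENSG00000000002']}
```

Symbols that still map to several IDs after the version suffix is removed are the ones worth checking against the Ensembl release each tool uses.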

On what criteria is Biomart assigning a single gene to each Ensembl gene ID/original Ensembl IDs

By definition, a gene ID in Ensembl corresponds to a single gene. A gene in Ensembl is defined by annotating a reference genome and is thus associated with a chromosomal region. Ensembl maintains its own mapping of external IDs to Ensembl IDs, which is what Biomart uses.

whereas UCSC Xena provides different Ensembl IDs corresponding to the same gene.

Probably because Xena is using a different version of the Ensembl genome annotations, a different mapping between Ensembl and non-Ensembl IDs than the one used by Ensembl/Biomart, or both.
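One way to check where two mappings actually disagree is to compare them on the IDs they share. The sketch below assumes each mapping has been loaded into a dict of unversioned Ensembl ID to gene symbol; `compare_mappings`, the variable names, and the example entries are all hypothetical.

```python
def compare_mappings(map_a, map_b):
    """Compare two {ensembl_id: symbol} dicts.

    Returns a tuple of:
      - IDs present in both mappings but assigned different symbols,
      - IDs found only in map_a,
      - IDs found only in map_b.
    """
    shared = map_a.keys() & map_b.keys()
    disagree = {i: (map_a[i], map_b[i]) for i in sorted(shared) if map_a[i] != map_b[i]}
    only_a = sorted(map_a.keys() - map_b.keys())
    only_b = sorted(map_b.keys() - map_a.keys())
    return disagree, only_a, only_b


# Hypothetical entries, for illustration only.
biomart_map = {"ENSG00000000001": "GENE_A", "ENSG00000000002": "GENE_B"}
xena_map = {"ENSG00000000001": "GENE_A", "ENSG00000000002": "GENE_C", "ENSG00000000003": "GENE_D"}

disagree, only_biomart, only_xena = compare_mappings(biomart_map, xena_map)
print(disagree)      # prints {'ENSG00000000002': ('GENE_B', 'GENE_C')}
print(only_biomart)  # prints []
print(only_xena)     # prints ['ENSG00000000003']
```

IDs that appear in only one mapping usually point to a difference in annotation release, while shared IDs with different symbols point to a difference in the external-ID mapping itself.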

What you need to understand is that there is no single definition of a gene and each resource has its own.
