Ensembl gene ID conversion to gene name
9.1 years ago

Hi,

In our RNA-seq pipeline we have a step after mapping that converts Ensembl gene IDs (e.g., ENSMUSG00000096126) to gene names. When the pipeline was built, we mapped read files to older versions of the mouse genome.

What happens when new samples mapped to the newest version of the murine genome are run through the pipeline? Will we just miss the gene names of the transcripts that were not included in the old build of the murine genome? Are there any additional consequences?

Thank you for your time,

Annika

RNA-Seq reference-genome • 4.1k views

I am not sure I understand the problem here. If you have transcript or gene IDs from a given version, use that same version to retrieve the gene names: for example, if you map your reads to Ensembl v82, use the v82 BioMart or API to retrieve the corresponding gene names. If you are instead trying to use transcript or gene IDs from an older version to look up gene names in a newer one, you may indeed run into problems, such as IDs no longer being present in the new version, but I don't think you should be doing this. If accurately identifying genes in a new version of Ensembl is critical, you should probably remap all your data to that version.
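One way to keep names tied to the release you mapped against is to build the ID-to-name table from the GTF of that same release, instead of from a table made for an older build. Below is a minimal sketch of that idea, assuming an Ensembl-style GTF whose `gene` records carry `gene_id` and (usually) `gene_name` attributes. The function name and the two example records are made up for illustration; the IDs and names in them are placeholders, not real genes.

```python
import re

# Matches key "value" pairs in the GTF attribute column (column 9).
_ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_names_from_gtf(lines):
    """Build a {gene_id: gene_name} dict from the 'gene' records of a GTF."""
    mapping = {}
    for line in lines:
        if line.startswith("#"):
            continue  # header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # skip malformed lines and non-gene features
        attrs = dict(_ATTR_RE.findall(fields[8]))
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        # Fall back to the ID itself when the record carries no gene_name.
        mapping[gene_id] = attrs.get("gene_name", gene_id)
    return mapping


# Two placeholder records in GTF layout (IDs and names are NOT real genes).
example_gtf = [
    "#!placeholder header line\n",
    '1\tensembl\tgene\t100\t900\t.\t+\t.\t'
    'gene_id "ENSMUSG_FAKE_0001"; gene_version "1"; gene_name "GeneA";\n',
    '1\tensembl\tgene\t2000\t2500\t.\t-\t.\t'
    'gene_id "ENSMUSG_FAKE_0002"; gene_version "1";\n',
]
id_to_name = gene_names_from_gtf(example_gtf)
```

For a real file you would pass an open file handle instead of the list, e.g. `gene_names_from_gtf(open(path))`, or `gzip.open(path, "rt")` for a compressed GTF.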

9.1 years ago
Michael 55k

You should always make sure that the versions of the genome build/assembly and the annotation are consistent. I don't fully understand which part of your pipeline has not been updated or why, but I suggest updating either everything or nothing. For a comparative analysis I would re-map everything against the latest assembly and gene models.
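A quick sanity check for this kind of mismatch is to compare the gene IDs in a count table against the annotation table the pipeline uses, and report the ones that are absent. This is only a sketch: the function name and the example IDs are hypothetical, and it assumes the annotation keys are unversioned IDs (some resources append a version suffix such as `.3`, which is stripped before comparing).

```python
def split_by_annotation(gene_ids, id_to_name):
    """Return (found, missing) lists of gene IDs, preserving input order."""
    found, missing = [], []
    for gid in gene_ids:
        # Drop an optional ".N" version suffix before looking the ID up.
        base = gid.split(".", 1)[0]
        if base in id_to_name:
            found.append(gid)
        else:
            missing.append(gid)
    return found, missing


# Placeholder annotation from an "old" release and IDs from a "new" mapping.
old_annotation = {"ENSMUSG_FAKE_0001": "GeneA", "ENSMUSG_FAKE_0002": "GeneB"}
new_ids = ["ENSMUSG_FAKE_0001.2", "ENSMUSG_FAKE_0002", "ENSMUSG_FAKE_0003"]

found, missing = split_by_annotation(new_ids, old_annotation)
print(f"{len(missing)} of {len(new_ids)} IDs are absent from the annotation")
```

A non-trivial number of missing IDs is a hint that the mapping and the annotation come from different releases. Note that this check only catches IDs that disappeared or were added; it cannot tell you whether a gene model behind an unchanged ID was revised.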

Also, I wouldn't use gene names (by which I guess you mean gene symbols) except as additional annotation at the end. Ensembl gene IDs are unique and mostly don't change (although IDs can be retired or added between releases), while gene names are ambiguous and may change.
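To illustrate the point about keeping the ID as the key, here is a small sketch that lists symbols shared by more than one gene ID and attaches the symbol to each count row as an extra column only. The function names, IDs, symbols, and counts are all invented for the example.

```python
from collections import defaultdict


def ambiguous_symbols(id_to_symbol):
    """Return {symbol: [gene IDs]} for symbols used by more than one ID."""
    by_symbol = defaultdict(list)
    for gene_id, symbol in id_to_symbol.items():
        by_symbol[symbol].append(gene_id)
    return {s: sorted(ids) for s, ids in by_symbol.items() if len(ids) > 1}


def annotate_counts(counts, id_to_symbol):
    """Keep the gene ID as the key; add the symbol as a descriptive field."""
    return [
        (gene_id, id_to_symbol.get(gene_id, ""), n)
        for gene_id, n in counts.items()
    ]


# Placeholder data: two different IDs share the symbol "GeneA".
id_to_symbol = {
    "ENSMUSG_FAKE_0001": "GeneA",
    "ENSMUSG_FAKE_0002": "GeneA",
    "ENSMUSG_FAKE_0003": "GeneB",
}
counts = {"ENSMUSG_FAKE_0001": 12, "ENSMUSG_FAKE_0003": 7, "ENSMUSG_FAKE_0004": 3}

clashes = ambiguous_symbols(id_to_symbol)
rows = annotate_counts(counts, id_to_symbol)
```

If the table were keyed by symbol instead, the two `GeneA` rows would collide, which is why the symbol is carried along only for display.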

9.1 years ago
Abdullah ▴ 100

Have a look at http://mygene.info/. It is an efficient way to convert between gene ID formats, and it can be used inside a pipeline as well.
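As a rough sketch of how this could be wired into a pipeline, the code below builds a batch query against the MyGene.info v3 `query` endpoint using only the standard library. The helper names are my own, and the response shape it expects (a JSON list of hits with `query`, `symbol`, and `notfound` keys) is my understanding of the service rather than something stated in this thread, so check it against the MyGene.info documentation before relying on it.

```python
import json
import urllib.parse
import urllib.request

MYGENE_QUERY_URL = "https://mygene.info/v3/query"


def build_query(gene_ids, species="mouse"):
    """Prepare a POST request asking for the symbol of each Ensembl gene ID."""
    data = urllib.parse.urlencode({
        "q": ",".join(gene_ids),
        "scopes": "ensembl.gene",
        "fields": "symbol",
        "species": species,
    }).encode("ascii")
    return urllib.request.Request(MYGENE_QUERY_URL, data=data, method="POST")


def symbols_from_hits(hits):
    """Turn the decoded JSON response into {queried ID: symbol or None}."""
    symbols = {}
    for hit in hits:
        symbol = None if hit.get("notfound") else hit.get("symbol")
        # A query can return several hits; keep the first one seen.
        symbols.setdefault(hit["query"], symbol)
    return symbols


def fetch_symbols(gene_ids, species="mouse"):
    """Send the query (needs network access) and return the parsed mapping."""
    request = build_query(gene_ids, species)
    with urllib.request.urlopen(request, timeout=30) as response:
        return symbols_from_hits(json.load(response))
```

Calling `fetch_symbols(["ENSMUSG00000096126"])` would then return a dict keyed by the queried ID, with `None` for IDs the service does not know. Keep in mind that a web service reflects its own current annotation, so the release-consistency caveats raised above still apply.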
