Hi,
In our RNA-seq pipeline we have a step after mapping that converts Ensembl gene IDs (e.g., ENSMUSG00000096126) to gene names. When the pipeline was built, we mapped read files to older versions of the mouse genome.
What happens when new samples mapped to the newest version of the mouse genome are run through the pipeline? Will we just miss the gene names of the transcripts that were not included in the old build of the mouse genome? Are there any additional consequences?
Thank you for your time,
Annika
I am not sure I understand the problem here. If you have transcript or gene IDs from a given Ensembl version, use that same version to retrieve the gene names. For example, if you map your reads to Ensembl v82, use the v82 BioMart or API to retrieve the corresponding gene names. If you are trying to use transcript or gene IDs from an older version to find gene names in a newer one, you may indeed have problems, such as IDs no longer being present in the new version, but I don't think you should be doing this. If accurately identifying genes in a new version of Ensembl is critical, you should probably remap all your data to that version.
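To make the version mismatch visible in a pipeline, one option is to build the ID-to-name table from the annotation of a specific release and report which IDs it cannot resolve. The following is only a minimal sketch, not your pipeline's actual conversion step: it assumes an Ensembl-style GTF whose `gene` records carry `gene_id` and `gene_name` attributes, and the function names, the toy GTF line, the gene name "PlaceholderA" and the second gene ID are all hypothetical placeholders.

```python
import re

# Matches GTF attribute pairs of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def gene_names_from_gtf(lines):
    """Build {gene_id: gene_name} from the 'gene' records of a GTF."""
    mapping = {}
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only well-formed gene records
        attrs = dict(ATTR_RE.findall(fields[8]))
        if "gene_id" in attrs:
            # Fall back to the ID itself when no gene_name is given.
            mapping[attrs["gene_id"]] = attrs.get("gene_name", attrs["gene_id"])
    return mapping


def annotate(gene_ids, mapping):
    """Split IDs into those the annotation can name and those it cannot."""
    named = {g: mapping[g] for g in gene_ids if g in mapping}
    missing = [g for g in gene_ids if g not in mapping]
    return named, missing


# Toy stand-in for the GTF of the OLD release (contents are made up).
old_release_gtf = [
    "#!hypothetical header line",
    '1\tensembl\tgene\t100\t900\t.\t+\t.\t'
    'gene_id "ENSMUSG00000096126"; gene_name "PlaceholderA";',
]

# IDs as they might come out of samples mapped against a NEWER release;
# the second ID is invented to represent a gene absent from the old one.
sample_ids = ["ENSMUSG00000096126", "ENSMUSG99999999999"]

mapping = gene_names_from_gtf(old_release_gtf)
named, missing = annotate(sample_ids, mapping)
print("named:", named)
print("not in this release:", missing)
```

Note that this only catches IDs that are absent from the table. An ID that exists in both releases but whose gene name changed in between would be converted without any warning, which is another reason to take the table from the same release you mapped to.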