Hi All :)
I have a large list of unique Ensembl IDs (48,432) representing mouse genes. My only goal is to get attributes such as gene name and NCBI ID so I can build a dataset with expression data. However, when I use the biomaRt package in R, I only get 44,980 results; the remaining 3,000+ Ensembl IDs do not return the information I need.
Here are some of the Ensembl IDs that, according to the Ensembl website, are deprecated and no longer belong to the Ensembl database:
ENSMUSG00000000325 \ ENSMUSG000000000004613 \ ENSMUSG00000011052 \ ENSMUSG00000021745 \ ENSMUSG00000021867
Is there an explanation for why these IDs are no longer present in the current version of Ensembl?
Thank you for your attention and comments :)
You should point biomaRt at the Ensembl release that was used to generate the original ID list, or rerun your analysis with a newer assembly/annotation version. By default biomaRt queries the current release, and gene IDs are retired, merged, or replaced as assemblies and annotation improve, so mismatches between versions are expected and normal.
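As a rough sketch of what pinning the release could look like: the release number below (102) is only a placeholder assumption, not the version your list actually came from, and the ID vector is a small sample from your post. The attribute names are the ones used in recent releases; older archives may name them differently (for example `entrezgene` instead of `entrezgene_id`), so check `listAttributes(mart)` if the query complains. This needs network access, and very old releases may no longer be available as archives.

```r
library(biomaRt)

# Show the archived Ensembl releases with their dates, so the release date
# of the original dataset can be matched to an Ensembl version.
archives <- listEnsemblArchives()
print(head(archives))

# Placeholder release number (assumption): replace it with the version
# that matches your dataset.
release <- 102

# Connect to the mouse gene dataset of that specific release.
mart <- useEnsembl(biomart = "genes",
                   dataset = "mmusculus_gene_ensembl",
                   version = release)

# A few of the IDs from the question; in practice use the full list.
ids <- c("ENSMUSG00000000325", "ENSMUSG00000011052",
         "ENSMUSG00000021745", "ENSMUSG00000021867")

# Fetch gene name and NCBI (Entrez) gene ID for each Ensembl gene ID.
annot <- getBM(attributes = c("ensembl_gene_id",
                              "external_gene_name",
                              "entrezgene_id"),
               filters = "ensembl_gene_id",
               values = ids,
               mart = mart)

# IDs that still return nothing in this release.
missing_ids <- setdiff(ids, annot$ensembl_gene_id)
print(missing_ids)
```

If `missing_ids` shrinks to (almost) nothing for some release, that is most likely the version your list was built against.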
Thanks for your answer,
I will look up the release date of the original dataset, match it to the Ensembl version that was current at that date, and try the annotation again.