Hi,
I have a large list of gene IDs such as 0610005C13Rik and 0610007P14Rik from mouse RNA-Seq data. I want to convert them into either RefSeq IDs or common gene symbols in R using a Bioconductor package. I'm new to this and open to trying other approaches.
You can use NCBI Datasets for this. Go to the NCBI Data Tables page and upload a file containing the list of gene names. Choose 'mouse' as the organism. The output table can be modified to include or exclude specific columns and then downloaded as a tab-delimited file.
Alternatively, you can use the command-line tool to obtain the data in JSON format and parse it with a tool like jq to extract the fields of interest.
This is the method I use. First download
https://ftp.ncbi.nlm.nih.gov/gene/DATA/GENE_INFO/Mammalia/Mus_musculus.gene_info.gz
Then
> library(limma)
> Aliases <- c("0610005C13Rik", "0610007P14Rik")
> GeneAnnotation <- alias2SymbolUsingNCBI(Aliases, "Mus_musculus.gene_info.gz")
> GeneAnnotation
      GeneID        Symbol                description
15939  71661 0610005C13Rik RIKEN cDNA 0610005C13 gene
9914   58520         Erg28 ergosterol biosynthesis 28
Note that the first Gene ID you give is already the common gene symbol. The second ID you give is an alias for Erg28.
The GeneID column in the above table is the NCBI Entrez Gene ID.
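To illustrate the idea behind this kind of alias lookup, here is a minimal base-R sketch. It is not limma's implementation. It assumes the gene_info layout, in which each gene has one official Symbol and a pipe-separated Synonyms field with "-" meaning none. The two-row table is a toy stand-in built from the output above, and lookup_symbol is a hypothetical helper name.

```r
# Toy stand-in for a few columns of Mus_musculus.gene_info (illustrative rows only).
# The real file could be read with something like:
#   read.delim(gzfile("Mus_musculus.gene_info.gz"), quote = "", stringsAsFactors = FALSE)
gene_info <- data.frame(
  GeneID      = c(71661L, 58520L),
  Symbol      = c("0610005C13Rik", "Erg28"),
  Synonyms    = c("-", "0610007P14Rik"),
  description = c("RIKEN cDNA 0610005C13 gene", "ergosterol biosynthesis 28"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: try each ID as an official symbol first, then as a synonym.
lookup_symbol <- function(ids, info) {
  # Split the pipe-separated Synonyms field into one character vector per gene.
  syn_list <- strsplit(info$Synonyms, "|", fixed = TRUE)

  # Long table mapping every alias to the row of the gene it belongs to.
  alias_tab <- data.frame(
    alias = unlist(syn_list),
    row   = rep(seq_len(nrow(info)), lengths(syn_list)),
    stringsAsFactors = FALSE
  )
  alias_tab <- alias_tab[alias_tab$alias != "-", ]  # "-" means no synonyms

  # Official symbols take priority; fall back to the first matching alias.
  row_idx <- match(ids, info$Symbol)
  need_alias <- is.na(row_idx)
  row_idx[need_alias] <- alias_tab$row[match(ids[need_alias], alias_tab$alias)]

  data.frame(
    Query  = ids,
    GeneID = info$GeneID[row_idx],
    Symbol = info$Symbol[row_idx],
    stringsAsFactors = FALSE
  )
}

result <- lookup_symbol(c("0610005C13Rik", "0610007P14Rik", "NotAGene"), gene_info)
print(result)
```

IDs that match neither a symbol nor a synonym come back as NA. An alias shared by several genes resolves to the first match only in this sketch, so ambiguous aliases would need separate handling on a real list.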
biomartr or biomaRt may be an option, or you can use the BioMart website (from Ensembl) to download the data and convert it yourself. There are also other offline and online conversion tools, such as DAVID.
Looks like Data Tables is not accepting the IDs that the user provided above.
But I just checked, and they seem to be working fine. I used the "Enter identifiers manually" option, copy/pasted the identifiers (one per line), and chose the identifier type as "gene symbol" and the organism as "mouse". I see a results table. Do you see any error messages?
I see it now. It is not intuitive that one can replace the default "human" with a different name/species. "human" has to be replaced manually with something else and the return key pressed before possible options show up. It would be great if a down-arrow could be shown in the second box to indicate that additional options are available.