5.1 years ago
R
Hello everyone,
I have a file containing different accession numbers and I want to convert them into gene symbols. Could anyone suggest some R code or a database that can do this?
Some of my accession numbers are shown below:
Q84X92_HORVD;
A2WZ82_ORYSA;
A2YGW3_ORYSA;
NP_001049423.1;
NP_001049301.1;
NP_001058077.1;
NP_001046629.1;
A2YBX9_ORYSA;
NP_001053139.1;
A2ZEJ1_ORYSA;
A2Z234_ORYSA;
NP_001063728.1;
RS12_HORVU;
NP_001051718.1;
NP_001047741.1;
NP_001067361.1;
NP_001065747.1;
You seem to have a mixture of ID types (e.g. Q84X92_HORVD is from UniProt, NP_001049423.1 is from RefSeq) from different organisms. The way I've dealt with this situation was to write a Perl script using the Ensembl API (assuming the species are represented in Ensembl).
You could also use BioMart, either from the website or via the biomaRt R package, but then you may have to go one species at a time.
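Since the IDs have to be handled per ID type and per species, a first step could be to sort the list into groups before querying anything. Below is a minimal Python sketch of that idea, not a complete converter. The function names `classify` and `group_ids` are made up for illustration, and the two regular expressions are simplified assumptions about what UniProt entry names (`PREFIX_SPECIES`) and RefSeq accessions (`NP_` plus digits and an optional version) look like. They cover the examples in the question but not every valid ID.

```python
import re
from collections import defaultdict

# Simplified pattern (assumption): UniProt entry names such as "Q84X92_HORVD"
# are a prefix, an underscore, and a species mnemonic of up to 5 characters.
UNIPROT_ENTRY_NAME = re.compile(r"^[A-Z0-9]{1,10}_([A-Z0-9]{1,5})$")

# Simplified pattern (assumption): RefSeq accessions such as "NP_001049423.1"
# are a two-letter prefix, an underscore, digits, and an optional ".version".
REFSEQ_ACCESSION = re.compile(r"^[A-Z]{2}_\d+(\.\d+)?$")


def classify(raw_id):
    """Return (source, species_mnemonic, cleaned_id) for one raw ID."""
    ident = raw_id.strip().rstrip(";")  # drop whitespace and the trailing ';'
    if REFSEQ_ACCESSION.match(ident):
        # RefSeq accessions do not encode the species in the ID itself.
        return ("refseq", None, ident)
    match = UNIPROT_ENTRY_NAME.match(ident)
    if match:
        return ("uniprot", match.group(1), ident)
    return ("unknown", None, ident)


def group_ids(raw_ids):
    """Group cleaned IDs by (source, species_mnemonic)."""
    groups = defaultdict(list)
    for raw in raw_ids:
        source, species, ident = classify(raw)
        groups[(source, species)].append(ident)
    return dict(groups)


if __name__ == "__main__":
    example = ["Q84X92_HORVD;", "A2WZ82_ORYSA;", "NP_001049423.1;", "RS12_HORVU;"]
    for key, ids in group_ids(example).items():
        print(key, ids)
```

Each group can then be submitted separately to whichever tool you use, for example one BioMart query per species.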
EDIT: You may want to remove the version number (i.e. the dot and the digits that follow it, e.g. .1), as some tools may not recognize it.
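Removing the version suffix is easy to script. Here is a small Python sketch, where `strip_version` is just an illustrative helper name. It assumes the version is always a dot followed by digits at the very end of the ID, and it also drops the trailing semicolon seen in the list above.

```python
import re


def strip_version(accession):
    """Remove a trailing ';' and a trailing '.<digits>' version, if present."""
    cleaned = accession.strip().rstrip(";")
    # Only a dot followed by digits at the end of the string is removed,
    # so IDs without a version pass through unchanged.
    return re.sub(r"\.\d+$", "", cleaned)


if __name__ == "__main__":
    for raw in ["NP_001049423.1;", "Q84X92_HORVD;"]:
        print(raw, "->", strip_version(raw))
```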