Zebrafish Ensembl gene ID to RefSeq protein accession?
4.5 years ago
gpreising ▴ 10

Hi,

I am trying to convert zebrafish Ensembl gene IDs to RefSeq protein accessions. I did this partially with BioMart, but only got ~12k RefSeq accessions for the ~22k Ensembl IDs I queried. Are there any other tools I could use to convert the remaining ~10k Ensembl gene IDs, ideally ones that people have used for zebrafish? From my Google searches, most ID conversion tools seem to be optimized for rat/mouse/human.

Thanks,
Gabe

BioMart Ensembl RefSeq • 2.4k views

Please provide examples of IDs that don't match. Entrez Direct may be able to retrieve the information you want in this case.


Sure, here are a few IDs that I could get gene names for but not RefSeq accessions:

ENSDARG00000061451 ENSDARG00000061749 ENSDARG00000061764


Would the following link help? https://david.ncifcrf.gov/conversion.jsp

4.5 years ago

You can try the org.Dr.eg.db package - it may match more IDs for you:

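# Bioconductor annotation package for zebrafish (Danio rerio)
library(org.Dr.eg.db)

# example Ensembl gene IDs from the question
ens <- c('ENSDARG00000061451', 'ENSDARG00000061749',
  'ENSDARG00000061764')

# list the ID types that can be used as keys
keytypes(org.Dr.eg.db)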

 [1] "ACCNUM"       "ALIAS"        "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS"
 [6] "ENTREZID"     "ENZYME"       "EVIDENCE"     "EVIDENCEALL"  "GENENAME"    
[11] "GO"           "GOALL"        "IPI"          "ONTOLOGY"     "ONTOLOGYALL" 
[16] "PATH"         "PFAM"         "PMID"         "PROSITE"      "REFSEQ"      
[21] "SYMBOL"       "UNIGENE"      "UNIPROT"      "ZFIN"


# mapIds() returns only the first match per key by default (multiVals = "first")
mapIds(org.Dr.eg.db, keys = ens,
  column = 'REFSEQ', keytype = 'ENSEMBL')

'select()' returned 1:many mapping between keys and columns
ENSDARG00000061451 ENSDARG00000061749 ENSDARG00000061764 
    "NM_001079967"                 NA     "XM_005173182"


# select() returns every match, one row per Ensembl ID / RefSeq accession pair
select(org.Dr.eg.db, keys = ens,
  columns = c('REFSEQ',  'ENTREZID', 'SYMBOL', 'ENSEMBL'),
  keytype = 'ENSEMBL')

'select()' returned 1:many mapping between keys and columns
              ENSEMBL       REFSEQ ENTREZID SYMBOL
1  ENSDARG00000061451 NM_001079967   558048  n4bp2
2  ENSDARG00000061451 NP_001073436   558048  n4bp2
3  ENSDARG00000061451 XM_021475000   558048  n4bp2
4  ENSDARG00000061451 XP_021330675   558048  n4bp2
5  ENSDARG00000061749         <NA>     <NA>   <NA>
6  ENSDARG00000061764 XM_005173182   559276  ahnak
7  ENSDARG00000061764 XM_005173183   559276  ahnak
8  ENSDARG00000061764 XM_005173184   559276  ahnak
9  ENSDARG00000061764 XM_005173188   559276  ahnak
10 ENSDARG00000061764 XM_009291066   559276  ahnak
11 ENSDARG00000061764 XM_017359122   559276  ahnak
12 ENSDARG00000061764 XM_021481163   559276  ahnak
13 ENSDARG00000061764 XM_021481164   559276  ahnak
14 ENSDARG00000061764 XP_005173239   559276  ahnak
15 ENSDARG00000061764 XP_005173240   559276  ahnak
16 ENSDARG00000061764 XP_005173241   559276  ahnak
17 ENSDARG00000061764 XP_005173245   559276  ahnak
18 ENSDARG00000061764 XP_009289341   559276  ahnak
19 ENSDARG00000061764 XP_017214611   559276  ahnak
20 ENSDARG00000061764 XP_021336838   559276  ahnak
21 ENSDARG00000061764 XP_021336839   559276  ahnak
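
Note that the REFSEQ column mixes transcript accessions (NM_/XM_) with protein accessions (NP_/XP_). Since the question asks for protein accessions specifically, one possible follow-up is to filter on the accession prefix. The snippet below is only a sketch: `res` is a hand-copied subset of the table above rather than a fresh query, and the names `res`, `prot`, `prot_by_gene` and `unmapped` are made up for illustration. In practice `res` would be the data frame returned by select().

```r
# Toy copy of a few rows from the select() output above (not a fresh query).
res <- data.frame(
  ENSEMBL = c("ENSDARG00000061451", "ENSDARG00000061451",
              "ENSDARG00000061749",
              "ENSDARG00000061764", "ENSDARG00000061764"),
  REFSEQ  = c("NM_001079967", "NP_001073436",
              NA,
              "XM_005173182", "XP_005173239"),
  stringsAsFactors = FALSE
)

# Keep protein accessions only: NP_ (curated) and XP_ (predicted) prefixes.
is_protein <- !is.na(res$REFSEQ) & grepl("^[NX]P_", res$REFSEQ)
prot <- res[is_protein, ]

# One list entry per gene, holding all of its protein accessions.
prot_by_gene <- split(prot$REFSEQ, prot$ENSEMBL)

# Queried genes that ended up with no protein accession at all.
unmapped <- setdiff(unique(res$ENSEMBL), names(prot_by_gene))

print(prot_by_gene)
print(unmapped)
```

The `unmapped` vector is a convenient way to see which Ensembl IDs still need another route (for example BioMart or Entrez Direct, as suggested earlier in the thread).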

Kevin


Hi Kevin,

That worked! I only lost 1.2K IDs with that package. Thanks so much for the help.
