Budgerigar: converting gene names to Ensembl IDs
19 months ago
voukalie • 0

Hello! I have a list of gene names from the budgerigar (Melopsittacus undulatus), and my ultimate aim is to find the human orthologs as gene names and Entrez IDs. This can only be done in the archived version of Ensembl 101 BioMart. My problem is that I first have to convert the gene names to Ensembl IDs, and so far I can only do that one gene at a time. I have thousands of gene names, so I am looking for a quicker solution, ideally in R. The script I used is:

bmchrpltest=c("LOC101867983","LOC101867991","LOC101868021")
getBM(attributes=c('ensembl_gene_id', 'external_gene_name'), 
      filters = 'external_gene_name', 
      values = bmchrpltest, 
      mart = ensembl101) 

and I get this result:

[1] ensembl_gene_id    external_gene_name
<0 rows> (or 0-length row.names)

Do you have any advice on how to proceed?

Eleni

conversion id • 1.1k views

My full code:

library("biomaRt")
ensembl101=useMart(host='https://aug2020.archive.ensembl.org', biomart='ENSEMBL_MART_ENSEMBL', dataset='mundulatus_gene_ensembl')

bmchrpltest=c("LOC101867983","LOC101867991","LOC101868021")
getBM(attributes=c('ensembl_gene_id', 'external_gene_name'), 
      filters = 'external_gene_name', 
      values = bmchrpltest, 
      mart = ensembl101)

"This can only be done in the archived version of Ensembl 101 BioMart"

Can you explain why?

Thank you for your answer! I am not that familiar with EntrezDirect, although it seems a useful skill to have. I can only do it in Ensembl 101 because the dataset for this species was removed from more recent Ensembl releases.

In case you do not get an answer for BioMart, one way to do this would be to use EntrezDirect:

$ esearch -db gene -query LOC101867983 | esummary | xtract -pattern DocumentSummary -element Name,Description,OtherAliases
LOC101867983    RAN binding protein 1   RANBP1

This basically gives you an idea of what the gene is. You can then find the human version.
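
Within BioMart, finding the human version could look like the sketch below once budgerigar Ensembl gene IDs are available. It is an illustration under assumptions, not a tested recipe: the helper names are hypothetical, and the hsapiens_homolog_* attribute names are assumed to be present in the Ensembl 101 budgerigar dataset, which can be checked with listAttributes().

```r
# Sketch only: these attribute names are ASSUMED to exist in the Ensembl 101
# budgerigar dataset; confirm with biomaRt::listAttributes(mart) before use.
ortholog_attributes <- c("ensembl_gene_id",
                         "hsapiens_homolog_ensembl_gene",
                         "hsapiens_homolog_associated_gene_name",
                         "hsapiens_homolog_orthology_type")

# Hypothetical helper: budgerigar Ensembl gene IDs -> human orthologs.
get_human_orthologs <- function(ensembl_ids, mart) {
  biomaRt::getBM(attributes = ortholog_attributes,
                 filters = "ensembl_gene_id",
                 values = ensembl_ids,
                 mart = mart)
}

# Hypothetical helper: human Ensembl gene IDs -> human Entrez IDs and names.
# `human_mart` is assumed to use dataset = "hsapiens_gene_ensembl".
get_human_entrez <- function(human_ensembl_ids, human_mart) {
  biomaRt::getBM(attributes = c("ensembl_gene_id", "entrezgene_id",
                                "external_gene_name"),
                 filters = "ensembl_gene_id",
                 values = human_ensembl_ids,
                 mart = human_mart)
}
```

Here mart would be the archived mundulatus_gene_ensembl mart, and human_mart a mart for the hsapiens_gene_ensembl dataset on the same archive host. Neither function is called in the sketch, since both need a live connection to the archive.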

19 months ago
Corentin ▴ 610

Hi,

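You can use entrezgene_id as the filter instead of external_gene_name, but you also need to remove the "LOC" prefix from the IDs. LOC symbols are NCBI placeholders built from "LOC" plus the Entrez Gene ID, which is presumably why they do not match anything in external_gene_name: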

library("biomaRt")

ensembl101 = useMart(host = "https://aug2020.archive.ensembl.org", 
                     biomart = "ENSEMBL_MART_ENSEMBL", 
                     dataset = "mundulatus_gene_ensembl")

bmchrpltest = c("101867983","101867991","101868021")

getBM(attributes = c("entrezgene_id", "ensembl_gene_id", "external_gene_name"), 
      filters = "entrezgene_id",
      values = bmchrpltest, 
      mart = ensembl101)

This code returned this output:

  entrezgene_id    ensembl_gene_id external_gene_name
      101867983 ENSMUNG00000013288             RANBP1
      101867991 ENSMUNG00000012298              NEGR1
      101868021 ENSMUNG00000011145             EEF1B2
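
With thousands of names, the "LOC" prefix can be stripped programmatically rather than by hand. The following is a minimal sketch, not part of the original answer: loc_to_entrez is a hypothetical helper, and it assumes that only names of the form "LOC" followed by digits carry an Entrez Gene ID. Any other name is set aside for the external_gene_name filter.

```r
# Hypothetical helper (not from the thread): turn NCBI "LOC" symbols into
# Entrez Gene IDs; anything that is not "LOC" + digits becomes NA.
loc_to_entrez <- function(symbols) {
  is_loc <- grepl("^LOC[0-9]+$", symbols)
  ids <- rep(NA_character_, length(symbols))
  ids[is_loc] <- sub("^LOC", "", symbols[is_loc])
  ids
}

# Example input: the three names from the question plus one real gene symbol
gene_names <- c("LOC101867983", "LOC101867991", "LOC101868021", "RANBP1")
entrez_ids <- loc_to_entrez(gene_names)

# IDs to pass as `values` together with filters = "entrezgene_id"
loc_ids <- unique(entrez_ids[!is.na(entrez_ids)])

# Names without a LOC prefix can still use filters = "external_gene_name"
named_genes <- gene_names[is.na(entrez_ids)]
```

loc_ids can then be passed as values in the getBM call above, and named_genes in a second call that filters on external_gene_name.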

This worked! Thanks:)

Please accept this answer (green checkmark) to provide closure for this thread.
