a problem with getBM
23 months ago
sanaz • 0

I want to add gene name information to my expression file. I am using biomaRt for this, but getBM returns no results.

My code in R:

upregul<-read.csv(file.choose())
View(upregul)

library(biomaRt)

listEnsembl()
ensembl <- useEnsembl(biomart = "genes")
ensembl_ids<-(upregul)

View(ensembl_ids)

datasets <- listDatasets(ensembl)

head(datasets)

ensembl_con<-useEnsembl("ensembl",dataset="hsapiens_gene_ensembl")
attr<-listAttributes(ensembl_con)
filters<-listFilters(ensembl_con)
theBm<-getBM(attributes = c("ensembl_gene_id","external_gene_name"),
      filters = "ensembl_gene_id",
      values = ensembl_ids$X ,
      mart =ensembl_con)

The result after running the getBM:

[1] ensembl_gene_id    external_gene_name
<0 rows> (or 0-length row.names)

Can you update your post to show some examples of what's in your ensembl_ids object? Perhaps the contents of head(ensembl_ids).

It's hard to say why this might be happening without seeing at least a small example of the data you are using.

head(ensembl_ids)
                   X  baseMean log2FoldChange      lfcSE      stat       pvalue         padj
1 ENSG00000000938.13 1344.2587      2.1075271 0.13713567 15.368191 2.675371e-53 1.886489e-51
2 ENSG00000001461.17 2326.3086      0.5394702 0.08932168  6.039634 1.544646e-09 6.948079e-09
3  ENSG00000001561.7 2340.7964      0.6140277 0.10196834  6.021749 1.725425e-09 7.724584e-09

Exclude commands you use to examine the data (such as View(), head() or just typing in the object name which prints a representation of the object to the Console) when you share your code here.


Many thanks, it worked. However, I started with 5000 observations, and after running getBM I am left with only 4000.
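
One possible way to track down the missing rows: getBM() only returns rows for IDs it finds a match for, so comparing the IDs you submitted against the IDs that came back shows which ones were dropped. The snippet below is a minimal sketch that does not query BioMart. The `bm_result` data frame is a stand-in for real getBM() output, `submitted` is a hypothetical name for your ID vector, and the last ID is made up to play the role of one that BioMart does not recognise.

```r
# IDs sent to getBM(), with version suffixes already removed.
# The last one is a made-up placeholder for an ID with no match.
submitted <- c("ENSG00000000938", "ENSG00000001461",
               "ENSG00000001561", "ENSG99999999999")

# Stand-in for the data frame returned by getBM()
bm_result <- data.frame(
  ensembl_gene_id    = c("ENSG00000000938", "ENSG00000001461", "ENSG00000001561"),
  external_gene_name = c("FGR", "NIPAL3", "ENPP4")
)

# IDs that were submitted but did not come back
missing_ids <- setdiff(submitted, bm_result$ensembl_gene_id)
print(missing_ids)

# Left join: keep every submitted row, with NA where no gene name was found
expr <- data.frame(ensembl_gene_id = submitted)
annotated <- merge(expr, bm_result, by = "ensembl_gene_id", all.x = TRUE)
print(annotated)
```

Using `all.x = TRUE` keeps the original number of observations, so unmatched genes show up as NA instead of silently disappearing.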

23 months ago
Mike Smith ★ 2.1k

This is because your Ensembl IDs have version numbers appended to them (that's the .13 etc. at the end). To search BioMart with versioned IDs you have to use the "ensembl_gene_id_version" filter, e.g.

library(biomaRt)

ensembl_ids_vesion <- c("ENSG00000000938.13", "ENSG00000001461.17", "ENSG00000001561.7")

ensembl_con <- useEnsembl("genes", dataset = "hsapiens_gene_ensembl")

getBM(attributes = c("ensembl_gene_id","external_gene_name"),
      filters = "ensembl_gene_id_version",
      values = ensembl_ids_vesion,
      mart = ensembl_con )

#>   ensembl_gene_id external_gene_name
#> 1 ENSG00000000938                FGR
#> 2 ENSG00000001461             NIPAL3
#> 3 ENSG00000001561              ENPP4

In that example you get 3 matches, which is great. However, sometimes the versioning can be too specific: if you happen to have an older version of an ID, you won't get a hit. I normally strip the version numbers from the IDs and then run the code you originally had. One way to do that is with the stringr package, e.g.

library(stringr)
## remove the version numbers
## (the dot is escaped so it only matches a literal ".")
ensembl_ids <- str_replace(ensembl_ids_vesion,
                        pattern = "\\.[0-9]+$",
                        replacement = "")

## search BioMart using the non-versioned ensembl IDs
getBM(attributes = c("ensembl_gene_id","external_gene_name"),
      filters = "ensembl_gene_id",
      values = ensembl_ids ,
      mart = ensembl_con )

#>   ensembl_gene_id external_gene_name
#> 1 ENSG00000000938                FGR
#> 2 ENSG00000001461             NIPAL3
#> 3 ENSG00000001561              ENPP4
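
If you would rather not load stringr, base R's sub() can do the same job. This is a minimal sketch using the three versioned IDs from this answer plus one unversioned ID, added here only to illustrate why the dot in the pattern should be escaped. The variable names are hypothetical.

```r
# Three versioned IDs from the example, plus one ID without a version suffix
ids <- c("ENSG00000000938.13", "ENSG00000001461.17",
         "ENSG00000001561.7", "ENSG00000000938")

# "\\." matches a literal dot; "[0-9]+$" matches the trailing version digits.
# IDs without a version suffix are left unchanged.
stripped <- sub("\\.[0-9]+$", "", ids)
print(stripped)

# With an unescaped dot, "." matches any character, so an ID that has
# no version suffix gets mangled instead of being left alone
mangled <- sub(".[0-9]+$", "", "ENSG00000000938")
print(mangled)
#> [1] "ENS"
```

The resulting `stripped` vector can then be passed as `values` to getBM() with the "ensembl_gene_id" filter.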