biomaRt returns several Ensembl IDs for one gene
3.6 years ago
fifty_fifty ▴ 70

I have to convert the gene names in my scRNA-seq data into Ensembl IDs for downstream analyses. I used the biomaRt package, which converted some of the gene names:

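library(biomaRt)

# Connect to the Ensembl human gene dataset
ensembl <- useMart("ensembl", dataset = "hsapiens_gene_ensembl")

# Map the gene symbols (row names of the count matrix) to Ensembl gene IDs
biomart_hgnc <- getBM(attributes = c("hgnc_symbol", "ensembl_gene_id"),
                      filters = "hgnc_symbol",
                      values = rownames(LeeCRCtumor),
                      bmHeader = TRUE,
                      mart = ensembl)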

However, it returns several Ensembl IDs for a single gene, as in this example:

[image not preserved: screenshot of the biomaRt output]

Should I specify the gene location or chromosome in this case?

scrna-seq ensembl biomart r • 2.0k views

Yes, I understand that one gene can have several Ensembl IDs. In my case, though, I have a single-cell RNA-seq count matrix with gene names, which I got from the NCBI database. I don't have FASTQ files or any other raw data. I am trying to find a way to convert the gene names to Ensembl IDs, so I think I need to restrict the biomaRt mapping so that genes in haplotypic regions are excluded. Does biomaRt have that functionality?


For the two examples above:

197953 is the main gene.
261846 is the alternate sequence gene.

So you could filter your lists to keep only the genes located on the main chromosomes.
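As a rough sketch of that filtering step, assuming the mapping table also carries a "chromosome_name" column (in biomaRt this would mean adding "chromosome_name" to the attributes passed to getBM). The table below is a toy stand-in with made-up symbols and IDs, and the `CHR_...` names only illustrate how Ensembl labels alternate sequences:

```r
# Toy stand-in for a getBM() result that also requested "chromosome_name".
# Symbols and IDs are placeholders for illustration, not real annotations.
mapping <- data.frame(
  hgnc_symbol     = c("GENE_A", "GENE_A", "GENE_B", "GENE_C"),
  ensembl_gene_id = c("ENSG_MAIN_A", "ENSG_ALT_A", "ENSG_MAIN_B", "ENSG_ALT_C"),
  chromosome_name = c("3", "CHR_HSCHR3_1_CTG2_1", "X", "CHR_HSCHR6_MHC_COX_CTG1"),
  stringsAsFactors = FALSE
)

# Primary-assembly chromosomes as Ensembl names them (no "chr" prefix)
main_chromosomes <- c(as.character(1:22), "X", "Y", "MT")

# Keep only the rows located on the primary assembly
on_main <- mapping[mapping$chromosome_name %in% main_chromosomes, ]

# Symbols that still map to more than one Ensembl ID after filtering
still_ambiguous <- unique(on_main$hgnc_symbol[duplicated(on_main$hgnc_symbol)])

print(on_main)
print(still_ambiguous)
```

Note that a symbol annotated only on an alternate sequence (GENE_C in the toy table) drops out entirely, so it is worth checking which symbols are lost by the filter.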


Yes, I filtered out the genes that are not on the main chromosomes, using ensembldb followed by several biomaRt filters. However, some genes remain that none of those methods recognized. Many of them start with RP11, and a few I couldn't find at all, e.g. CH17-212P11.4. Do you know how to convert them into Ensembl IDs?
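To see what is left over, the unmatched names could be collected and grouped by prefix, roughly as below. This is only a sketch: `all_symbols` and `mapped_symbols` are hypothetical vectors standing in for the row names of the count matrix and the symbols already converted, and "RP11-000X0.0" is a made-up placeholder name.

```r
# Hypothetical inputs: every gene name in the count matrix, and the subset
# that has already been mapped to an Ensembl ID.
all_symbols    <- c("GENE_A", "RP11-000X0.0", "GENE_B", "CH17-212P11.4")
mapped_symbols <- c("GENE_A", "GENE_B")

# Names that no mapping step has recognized so far
unmapped <- setdiff(all_symbols, mapped_symbols)

# Group the leftovers by the text before the first "-" (e.g. "RP11", "CH17")
prefix    <- sub("-.*$", "", unmapped)
by_prefix <- split(unmapped, prefix)

print(by_prefix)
```

Grouping this way makes it easier to see which families of names need a different lookup strategy.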
