I encountered a problem while using bitr() for gene ID conversion.
4 weeks ago
Pallondyle • 0

Some ENSEMBL IDs seem to match more than one gene ID. It doesn't make sense. My code:

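## Merge gene ID columns
library(clusterProfiler)  # provides bitr()
library(org.Mm.eg.db)     # mouse annotation database
gene_entrezid <- bitr(geneID = rownames(P_change), 
                      fromType = "ENSEMBL", 
                      toType = "SYMBOL", # use "ENTREZID" here for Entrez ID conversion
                      OrgDb = "org.Mm.eg.db"
)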
ID_bind<-function(a){
    a2<-cbind(a,rownames(a))
    colnames(a2)[7]<-"ENSEMBL"
    a2<-merge(a2,gene_entrezid,by="ENSEMBL",all.y =F)
    rownames(a2)<-a2$ENSEMBL
    a2<-a2[,-1]
    return(a2)
}
P_change<-ID_bind(P_change)

Error:

#Error in `.rowNamesDF<-`(x, value = value) : 
#  duplicate row.names are not allowed
#In addition: Warning message:
#non-unique values when setting 'row.names': ‘ENSMUSG00000000486’, ‘ENSMUSG00000000562’, ‘ENSMUSG00000001768’, ‘ENSMUSG00000002250’, ‘ENSMUSG00000003271’, ‘ENSMUSG00000003680’, ‘ENSMUSG00000003812’, ‘ENSMUSG00000004455’, ‘ENSMUSG00000005983’, ‘ENSMUSG00000015290’, ‘ENSMUSG00000015341’, ‘ENSMUSG00000015882’, ‘ENSMUSG00000018378’, ‘ENSMUSG00000019865’, ‘ENSMUSG00000019868’, ‘ENSMUSG00000021557’, ‘ENSMUSG00000021846’, ‘ENSMUSG00000021983’, ‘ENSMUSG00000022820’, ‘ENSMUSG00000023156’, ‘ENSMUSG00000024571’, ‘ENSMUSG00000025194’, ‘ENSMUSG00000025646’, ‘ENSMUSG00000027022’, ‘ENSMUSG00000028700’, ‘ENSMUSG00000029089’, ‘ENSMUSG00000029592’, ‘ENSMUSG00000029723’, ‘ENSMUSG00000030337’, ‘ENSMUSG00000031167’, ‘ENSMUSG00000032750’, ‘ENSMUSG00000032872’, ‘ENSMUSG00000035171’, ‘ENSMUSG00000036381’, ‘ENSMUSG00000037747’, ‘ENSMUSG00000038209’, ‘ENSMUSG0000004019 [... truncated]

I searched for the corresponding gene information on the official Ensembl website, but strangely, searching for one of these gene IDs also returns entries that seem to belong to genes with completely different ENSEMBL IDs.

[Image: screenshot of the search results]

I am currently confused about two things: Firstly, how did this happen? Secondly, if I still want to associate gene IDs with transcriptome data annotated with ENSEMBL, what is a reasonable approach?

bioMart • 352 views
4 weeks ago
Pallondyle • 0

I solved this problem by downloading data directly from the ENSEMBL website. But I am still curious about how such a thing happened.


The top result in your example is the gene summary entry for ENSMUSG00000035171 - the Ensembl gene id for the gene.

The other result in your example is a specific link to the gene tree view for ENSMUSG00000035171.

As for why duplicate results are returned, you can examine the duplicate rows for ENSMUSG00000035171 in your gene_entrezid table to see what the discrepancy is. It is most likely down to annotation changes between versions of the different annotation sources.
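
A minimal sketch of that check in base R, using a made-up toy table in place of your real gene_entrezid (the IDs and symbols below are placeholders, not real annotations). It lists the rows that share an ENSEMBL ID, then shows two possible ways to get back to one row per ID before you set row names. Which of the two is appropriate depends on your analysis.

```r
# Toy stand-in for the data frame returned by bitr().
# The IDs and symbols are invented for illustration only.
gene_entrezid <- data.frame(
  ENSEMBL = c("ENSMUSG_A", "ENSMUSG_A", "ENSMUSG_B", "ENSMUSG_C"),
  SYMBOL  = c("GeneA1", "GeneA2", "GeneB", "GeneC"),
  stringsAsFactors = FALSE
)

# 1. List every row whose ENSEMBL ID occurs more than once.
dup_ids  <- unique(gene_entrezid$ENSEMBL[duplicated(gene_entrezid$ENSEMBL)])
dup_rows <- gene_entrezid[gene_entrezid$ENSEMBL %in% dup_ids, ]
print(dup_rows)

# 2a. Keep only the first symbol per ENSEMBL ID (an arbitrary choice).
first_only <- gene_entrezid[!duplicated(gene_entrezid$ENSEMBL), ]

# 2b. Or keep all symbols, collapsed into one string per ENSEMBL ID.
collapsed <- aggregate(SYMBOL ~ ENSEMBL, data = gene_entrezid,
                       FUN = paste, collapse = ";")
print(collapsed)
```

Either first_only or collapsed has unique ENSEMBL values, so merging it into the expression table and then assigning rownames from the ENSEMBL column should no longer hit the duplicate row.names error.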


Thx, I will check it later
