What is a good way to get all Gene Ontology terms for a certain species in R?
5.1 years ago
lihe.liu ▴ 30

Hi community,

What is a good way (or package) to get all the GO terms for a given species in R?

I work with Bos taurus and tried both Ensembl (via biomaRt) and the org.Bt.eg.db package; however, they give me quite different numbers of GO terms.

Ensembl seems to have far more GO terms than org.Bt.eg.db.

org.Bt.eg.db has its own unique ones, though.

library(biomaRt)
library(org.Bt.eg.db)

# Connect to the Ensembl BioMart and select the Bos taurus gene dataset
database = useMart("ensembl")
genome = useDataset("btaurus_gene_ensembl", mart = database)
gene = getBM(attributes = c("ensembl_gene_id", "go_id", "name_1006"), mart = genome)

# All GO IDs from BioMart. Genes without annotation typically come back with an
# empty go_id, so filter on the value instead of dropping the first element.
all_go1 = unique(gene$go_id[!is.na(gene$go_id) & gene$go_id != ""])
length(all_go1) # total 15118

# All GO IDs from org.Bt.eg.db
all_go2 = AnnotationDbi::keys(org.Bt.eg.db, keytype = "GO")
length(all_go2) # total 9032

# Overlap in both directions
table(all_go2 %in% all_go1)
table(all_go1 %in% all_go2)

Thank you so much!

Best.

R gene ontology • 1.0k views

The differences between Ensembl and org.Bt.eg.db have already been explained here. Because of these differences, mixing references is generally not recommended: pick one or the other for your project and stick with it. If you start mixing references, you are likely to run into various kinds of trouble down the line.
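
If you want to see how far the two sources diverge before committing to one, a minimal base-R sketch along these lines may help. Note that `compare_go_sets` is a hypothetical helper written for this illustration (it is not part of biomaRt or AnnotationDbi), and the toy vectors only stand in for `all_go1` and `all_go2` from the question.

```r
# Hypothetical helper (not from any package): summarise the overlap between
# two character vectors of GO IDs.
compare_go_sets <- function(a, b) {
  # Drop missing and empty IDs, then deduplicate
  a <- unique(a[!is.na(a) & nzchar(a)])
  b <- unique(b[!is.na(b) & nzchar(b)])
  list(
    n_a    = length(a),      # distinct IDs in the first source
    n_b    = length(b),      # distinct IDs in the second source
    shared = intersect(a, b),
    only_a = setdiff(a, b),
    only_b = setdiff(b, a)
  )
}

# Toy vectors standing in for all_go1 (Ensembl) and all_go2 (org.Bt.eg.db)
ensembl_go <- c("", "GO:0005515", "GO:0005634", "GO:0016020", NA)
orgdb_go   <- c("GO:0005515", "GO:0008150", "GO:0005634")

res <- compare_go_sets(ensembl_go, orgdb_go)

# Number of shared IDs and of IDs unique to each source
lengths(res[c("shared", "only_a", "only_b")])
```

With the real data you would call `compare_go_sets(all_go1, all_go2)` and inspect `only_a` and `only_b` to see which terms each source contributes on its own.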
