I used to do GO analysis with these R Bioconductor packages: biomaRt and clusterProfiler.
First, I build the GO map for the bacterial genome.
Then I use clusterProfiler to run the enrichment analysis (GO enrichment, KEGG enrichment, ...).
My problem is that biomaRt no longer supports bacterial genomes, so I cannot get the GO database needed for the rest of the analysis.
The following is my R script, which worked in 2012.
R code:
# load libraries
library(clusterProfiler)
# Build specific GO map using GFF file
library(biomaRt)
Gff2GeneTable("NC_000962.gff")
load("geneTable.rda")
Mtb <- useMart(biomart = "bacteria_mart_16",
               dataset = "myc_30_gene")
gomap <- getBM(attributes = c("entrezgene", "go_accession"),
               filters = "entrezgene",
               values = geneTable$GeneID,
               mart = Mtb)
#dim(gomap)
#head(gomap)
buildGOmap(gomap)
# Load the genes (differentially expressed, or other)
input_genes <- read.table("input_list.txt") # geneName geneID
input_IDs <- as.character(input_genes$geneID)
GOe <- enrichGO(input_IDs, organism = "H37Rv", ont = "BP",
                pvalueCutoff = 0.05, qvalue = 0.1, readable = TRUE)
# make a plot
p1 <- plot(GOe, type = "bar", order = TRUE, showCategory = 15)
print(p1)
# write the results to a file
write.table(summary(GOe), file = "out_GOenrichment.txt", sep = "\t")
Thanks dago,
I am trying QuickGO to retrieve the GO annotations for the genes, and will then use the R packages for the GO enrichment analysis.
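For what it's worth, once a gene-to-GO table is available from any source, the enrichment step itself no longer depends on biomaRt. Below is a minimal base-R sketch of a per-term over-representation (hypergeometric) test on such a table. The `annot` data frame, its column names `gene` and `go`, the toy IDs, and the helper `go_enrich()` are all hypothetical placeholders for illustration. They are not the actual QuickGO export format and not what `enrichGO` does internally.

```r
# Toy gene-to-GO table standing in for a downloaded annotation file.
# All gene and GO IDs here are made up for illustration.
annot <- data.frame(
  gene = c("g1", "g2", "g3", "g4", "g5", "g6", "g1", "g2", "g7", "g8"),
  go   = c("GO:0000001", "GO:0000001", "GO:0000001",
           "GO:0000002", "GO:0000002", "GO:0000002", "GO:0000002",
           "GO:0000003", "GO:0000003", "GO:0000003"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one-sided hypergeometric test for each GO term.
go_enrich <- function(genes, annot, p_adjust = "BH") {
  annot    <- unique(annot[, c("gene", "go")])
  universe <- unique(annot$gene)          # background: all annotated genes
  genes    <- intersect(genes, universe)  # drop input genes without annotation
  N <- length(universe)
  n <- length(genes)
  term_genes <- split(annot$gene, annot$go)
  res <- do.call(rbind, lapply(names(term_genes), function(term) {
    M <- length(term_genes[[term]])          # background genes carrying the term
    k <- sum(genes %in% term_genes[[term]])  # input genes carrying the term
    # P(X >= k) when drawing n genes from N, of which M carry the term
    p <- phyper(k - 1, M, N - M, n, lower.tail = FALSE)
    data.frame(go = term, k = k, M = M, n = n, N = N, pvalue = p,
               stringsAsFactors = FALSE)
  }))
  res$padj <- p.adjust(res$pvalue, method = p_adjust)
  res[order(res$pvalue), ]
}

result <- go_enrich(c("g1", "g2", "g3"), annot)
print(result)
```

In practice it is probably easier to stay within clusterProfiler: its generic `enricher()` function takes a user-supplied `TERM2GENE` data frame (term ID in the first column, gene ID in the second), so a QuickGO-derived table reshaped that way should work, assuming the gene IDs in the input list match those in the annotation.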