Hello,
I am attempting to use topGO to perform GO enrichment analysis on a list of differentially accessible genes. I have p-values associated with each of those genes, but I am not sure how to incorporate them during data preparation so that they are used in the GO term enrichment analysis. Below is the code I am currently using to perform the analysis without the p-values. The p-values are contained in the data frame "anno_P23_Male_vs_P23_Female.EDGER", in the same row as the gene to which they correspond. I imagine the solution is pretty simple, but after thoroughly reading through the vignettes I am still struggling.
library(biomaRt)
library(topGO)
# Connect to the Ensembl zebra finch (Taeniopygia guttata) gene dataset
mart <- useMart("ensembl", dataset = "tguttata_gene_ensembl")
# Get Ensembl gene IDs and GO terms
GTOGO <- biomaRt::getBM(attributes = c("ensembl_gene_id", "go_id"), mart = mart)
# Examine the result
head(GTOGO)
# Remove blank entries
GTOGO <- GTOGO[GTOGO$go_id != '',]
# Convert from table format to list format:
# each gene ID now maps to the GO IDs associated with it
geneID2GO <- by(GTOGO$go_id,
                GTOGO$ensembl_gene_id,
                function(x) as.character(x))
# Check the resulting list; these are all of the genes that have GO annotation
head(geneID2GO)
# Generate a sorted vector of all unique annotated gene IDs
all.genes <- sort(unique(as.character(GTOGO$ensembl_gene_id)))
# Factor marking each annotated gene as differentially accessible (1) or not (0)
P23_Male_vs_P23_Female.EDGER <- factor(as.integer(all.genes %in% anno_P23_Male_vs_P23_Female.EDGER$geneId))
names(P23_Male_vs_P23_Female.EDGER) <- all.genes
P23_Male_vs_P23_Female.EDGER.GO.Obj <- new("topGOdata",
                                           ontology = "BP",
                                           allGenes = P23_Male_vs_P23_Female.EDGER,
                                           annot = annFUN.gene2GO,
                                           gene2GO = geneID2GO)
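From my reading of the vignette, I think the p-values would need to go in as a named numeric vector together with a selection function (the geneSel argument), roughly as sketched here. This is only my guess at the setup, not something I have confirmed: the column name PValue, the 0.05 cutoff, giving genes missing from the edgeR table a p-value of 1, and the toy gene IDs and the helper names selSig and buildGOdata are all assumptions made for illustration.

```r
# Toy stand-ins for all.genes and the edgeR table (IDs and values are made up)
toy.genes <- c("GENE_A", "GENE_B", "GENE_C", "GENE_D")
toy.da <- data.frame(geneId = c("GENE_B", "GENE_D"),
                     PValue = c(0.001, 0.2),  # column name is an assumption
                     stringsAsFactors = FALSE)

# Named numeric vector of p-values covering every annotated gene;
# genes absent from the edgeR table get p = 1 (an assumption on my part)
gene.pvals <- setNames(rep(1, length(toy.genes)), toy.genes)
hits <- match(toy.da$geneId, toy.genes)
gene.pvals[hits[!is.na(hits)]] <- toy.da$PValue[!is.na(hits)]

# Selection function: TRUE for genes counted as significant (cutoff assumed)
selSig <- function(p) p < 0.05

# Build the topGOdata object from scores instead of a 0/1 factor
# (not run on the toy data; calling it requires library(topGO))
buildGOdata <- function(pvals, gene2GO) {
  new("topGOdata", ontology = "BP", allGenes = pvals, geneSel = selSig,
      annot = topGO::annFUN.gene2GO, gene2GO = gene2GO)
}
```

With my real objects this would be called as buildGOdata(gene.pvals, geneID2GO), with gene.pvals built from all.genes and anno_P23_Male_vs_P23_Female.EDGER instead of the toy data. I assume a score-based test such as runTest(obj, algorithm = "classic", statistic = "ks") would then make use of the p-values, but I would appreciate confirmation.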
I hope this explanation makes sense. If further detail would be helpful, I am more than happy to provide it. Any assistance with this question would be greatly appreciated.
All the Best!