Hi,
This problem is in R/Bioconductor, topGO and GOstats packages. I want to perform HyperGeometric test for OVER representation against GO and KEGG. I have go two text files locally available: Back.txt (background with only entrez id and more then 2000 ID's- Illumina HT12 v3. microarray) and **genes.txt** (differently expressed genes- Illumina HT12 v3. microarray). The result should be in R data.frame with following fileds (GO_term_id, Go_term_name, p_value and number of associated genes from my genes list(Genes.txt file)) and visualizing in the Gograph or KEGG pathway.
I have perform hyperGTest and got the result but I am not sure whether the generated result is correct or not as per my OVER represented need against GO and KEGG. How can I check my result with other databases for consistency? I have tried with DAVID but couldn't figure out.
I have following code
library(topGO)
library(GOstats)
universe=read.table("Back.txt", sep=",") # Background files where only entrez id's are listed without heading column
tbl <- read.table ("genes.txt", sep=",") # selected genes with following header Probes_id,entrez_gene_id,symbols,P.Value and F.C
selected=<-tbl$V2 # Selecting only second column of tbl vector where entrez_gene_id is present
param<- new ("GOHyperGParams", geneIds = selected,
universeGeneIds=universe, annotation="org.Hs.eg.db",
ontology="BP",pvalueCutoff=0.1, conditional=FALSE,testDirection="over")
hyp<-hyperGTest(param)
summary(hyp)
For the trial I wrote summary(hyp) which gives me GOBPID, Pvalue, OddsRatio,ExpCount,Count, Size and Term.
But I want to get result in data.frame with following fields consisting GO_term_id, Go_term_name, p_value and number of associated genes from my genes list(Genes.txt file)
I have provided the genes.txt sample link for convinence and Back.txt contains only entrez_id
*Note- This is a follow up question.
Your question is unclear. You appear to be doing your test against the Biological Process GO field in the annotation db. So why would this return the KEGG pathway in any subsequent data frame? If you want to extract the information from the annotation database to integrate with the data, then you need to learn to look stuff up in the annotation package. The data summary(hyp) returns all the relevant information for your query, just as analysing it in DAVID would.
Hi Thaman, there are only few possible reason for not getting an answer: nobody knows, nobody cares, or nobody understood the question. Mainly here its a mixture, caused by point # three. People have asked you repeatedly for a reproducible example. Something that I could paste in my R session and run it. Make the sample data downloadable (all your inputs required + genes.txt) and even ppl not totally familiar with the package might help you out. I guess I could help but if you keep ignoring comments..., so please stop whining and re-opening and put an reproducible example
My problem with this question is that it seems to be asking about 3 different things. It's very unclear what you are trying to achieve, particularly without sample data.
@ Daniel- You are right about the confusion, but I have mentioned this in my previous question. I have two options either to do it from GO or KEGG so that's why result also changes according to annotation packages I have been using. I tried to look GOstats and topGO but I din't find extracting specific information in result sections like my requirement (GO_term_id, Go_term_name, p_value and number of associated genes from my genes list(Genes.txt file)).
@ Micheal and newilfws- Okei I will edit questions again and make it more precise with sample data.
@ Daniel- You are right about the confusion, but I have mentioned this in my previous question. I have two options either to do it from GO or KEGG so that's why result also changes according to annotation packages I have been using. I tried to look GOstats and topGO but I din't find extracting specific information in result sections like my requirement (GO_term_id, Go_term_name, p_value and number of associated genes from my genes list(Genes.txt file)).
@ Micheal and newilfws- Okei I will edit questions again and make it more precise with sample data.
@ Daniel, Michael, neilfws- I have edited my questions in the most precise way. Hope now it's clear
It's highly appreciated :)
@ Michael, so will you guide me how to solve my probelm now?
If Brad hadn't been faster, I would, let us know if that solves your problem.