In a typical ChIP-Seq experiment, we found that the transcription factor "ABC" has peaks on 590 genes. Of these 590 genes, 242 are classified as "ORFs". The genome contains 6226 genes in total, of which 5420 are "ORFs". Are the "ABC"-bound genes enriched in ORFs, or could the association of "ABC" with ORFs be due to chance? Which statistical test should be used? Can it be done in R, and if so, how? Is there any other way to do this?
I learned from a forum that the usual test for enrichment of a gene list is a hypergeometric test or, equivalently, a one-sided Fisher's exact test. Although I am not very familiar with R, I adapted other examples to run Fisher's Exact Test for Count Data, and this is my output:
> fisher.test(matrix(c(242,5178,348,458),nrow=2,ncol=2),alternative="greater")
Fisher's Exact Test for Count Data
data: matrix(c(242, 5178, 348, 458), nrow = 2, ncol = 2)
p-value = 1
alternative hypothesis: true odds ratio is greater than 1
95 percent confidence interval:
0.05224986 Inf
sample estimates:
odds ratio
0.06156519
If my analysis is correct, it suggests that factor "ABC" is present on ORFs only by chance. Is that the right conclusion? If not, what should the conclusion be? Please help.
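To make the orientation of the test explicit, here is a minimal sketch that rebuilds the same 2x2 table from the four counts given in the question, with labelled rows and columns. It runs both one-sided alternatives and the matching hypergeometric tail probabilities. The variable names (`tab`, `res_greater`, `res_less`, and so on) are illustrative and not from the original post; `fisher.test` and `phyper` are base R functions.

```r
# Counts taken from the question
total_genes <- 6226   # all genes in the genome
total_orfs  <- 5420   # genes classified as ORFs
bound_genes <- 590    # genes with an ABC peak
bound_orfs  <- 242    # ABC-bound genes that are ORFs

# Derive the remaining cells of the 2x2 table
bound_non_orfs   <- bound_genes - bound_orfs                      # 348
unbound_orfs     <- total_orfs - bound_orfs                       # 5178
unbound_non_orfs <- (total_genes - total_orfs) - bound_non_orfs   # 458

# matrix() fills column by column: first the ORF column, then the non-ORF column
tab <- matrix(
  c(bound_orfs, unbound_orfs, bound_non_orfs, unbound_non_orfs),
  nrow = 2,
  dimnames = list(ABC = c("bound", "unbound"),
                  class = c("ORF", "non-ORF"))
)
print(tab)

# One-sided tests: "greater" asks about enrichment, "less" about depletion
res_greater <- fisher.test(tab, alternative = "greater")
res_less    <- fisher.test(tab, alternative = "less")

# Equivalent hypergeometric tail probabilities
# (5420 ORFs and 806 non-ORFs in the "urn", 590 genes drawn)
p_enrich  <- phyper(bound_orfs - 1, total_orfs, total_genes - total_orfs,
                    bound_genes, lower.tail = FALSE)   # P(X >= 242)
p_deplete <- phyper(bound_orfs, total_orfs, total_genes - total_orfs,
                    bound_genes)                       # P(X <= 242)

# Number of ORFs expected among the bound genes if binding ignored gene class
expected_bound_orfs <- bound_genes * total_orfs / total_genes

cat("Expected bound ORFs:", round(expected_bound_orfs, 1), "\n")
cat("Fisher p (greater):", res_greater$p.value, " phyper:", p_enrich, "\n")
cat("Fisher p (less):   ", res_less$p.value, " phyper:", p_deplete, "\n")
```

With these counts, roughly 514 of the 590 bound genes would be expected to be ORFs if binding were independent of gene class, compared with the 242 observed. That is consistent with the odds ratio below 1 and the p-value of 1 from the "greater" test; the "less" alternative is the one that addresses whether ORFs are under-represented among the bound genes.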
Thank you very much for your suggestions. I will do as advised.