Hi All,
I need to perform a Fisher's exact test to evaluate the overlap between drug targets obtained from a database and a list of genes from a coexpression analysis. However, I am unsure how to choose the size of the universe (the background gene set). It should be the genome size or the total number of genes, but different experiments measure different numbers of genes; for example, microarray platforms have different probe set sizes. Can I simply use the 22,000 genes in the human genome, or should I set it to the total number of genes measured in that particular experiment? Another thing I am confused about is that only around 5,000 of the 22,000 genes are assigned to coexpression modules. I test the overlap between the drug targets and the genes belonging to each coexpression module. Do I also need to account for those 5,000 genes somehow?
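
To make the setup concrete, here is a minimal sketch of the one-sided (over-representation) test as I understand it, written with only the Python standard library. The function name `overlap_pvalue` and all counts below are hypothetical placeholders, not my real data. It only illustrates that the same overlap gives different p-values depending on the universe size, which is why I am asking.

```python
from math import comb


def overlap_pvalue(overlap, n_targets, n_module, universe):
    """One-sided Fisher's exact test for over-representation.

    Returns P(X >= overlap), where X is hypergeometric: the number of
    drug targets seen when drawing n_module genes from a universe that
    contains n_targets drug targets.

    Implied 2x2 table:
                     in module              not in module
      target         overlap                n_targets - overlap
      non-target     n_module - overlap     the rest of the universe
    """
    if not 0 <= overlap <= min(n_targets, n_module):
        raise ValueError("overlap must be between 0 and min(n_targets, n_module)")
    if max(n_targets, n_module) > universe:
        raise ValueError("gene lists cannot be larger than the universe")

    non_targets = universe - n_targets
    # Sum the upper tail of the hypergeometric distribution with exact integers.
    tail = sum(
        comb(n_targets, k) * comb(non_targets, n_module - k)
        for k in range(overlap, min(n_targets, n_module) + 1)
    )
    return tail / comb(universe, n_module)


# Placeholder counts (assumptions for illustration only).
n_targets = 200   # drug targets from the database
n_module = 300    # genes in one coexpression module
overlap = 15      # genes that are both

# Counts are held fixed here purely to show the effect of the universe size.
for universe in (22000, 5000):
    p = overlap_pvalue(overlap, n_targets, n_module, universe)
    print(f"universe = {universe}: p = {p:.3g}")
```

With these placeholder counts the p-value is smaller for the 22,000-gene universe than for the 5,000-gene one, so the choice clearly matters for the result.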
Thanks for your help!