I'm trying to do gene set enrichment analysis using the set of constrained genes from ExAC (pli >= 0.9 || mis-z >= 0.3) as the reference list (attached). I chose this reference because my analysis did not look at missense variants in genes with mis-z < 0.3 or undefined, and did not look at PTVs in genes with pli < 0.9 or undefined. That's 10328 genes. None of my earlier enrichments are significant.
There's a problem here, though. If the reference list includes only 3 genes that map to a process and the expected count is 0.16 genes, then even if all 3 of those genes are observed (18.38-fold enrichment), the p-value is still reported as 1. So I have a lot of enrichments greater than 2-fold, for example, but no significant p-values. In fact, virtually all my p-values are 1.
Is there a better way to go about doing this? I was using the service at http://geneontology.org/
Thanks, Robert
Hi Robert,
1) We cannot see (or at least, I can't) the attached list.
2) 3 is a really small number, and the lack of statistical significance for that comparison is expected. Enrichment analyses are more informative for Ontology terms containing relatively large numbers of genes.
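To make point 2 concrete, here is a rough sketch of the arithmetic in Python. It uses a plain hypergeometric test, which may not be the exact test the service runs. The study list size of 562 is not stated in the question; it is back-calculated from the reported numbers (3 * 562 / 10328 is about 0.16). The number of terms tested (10000) and the use of a Bonferroni correction are assumptions for illustration only.

```python
from math import comb

def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) when n genes are drawn from N, of which K map to the term."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return hits / total

N = 10328   # reference list size (from the question)
K = 3       # reference genes mapping to the process (from the question)
n = 562     # ASSUMED study list size, inferred from the expected count of 0.16
k = 3       # all 3 genes observed

expected = K * n / N                        # expected count under the null
fold = k / expected                         # fold enrichment
raw_p = hypergeom_upper_tail(N, K, n, k)    # uncorrected p-value

m = 10000                                   # ASSUMED number of terms tested
adjusted_p = min(1.0, raw_p * m)            # Bonferroni correction, capped at 1

print(f"expected={expected:.2f} fold={fold:.2f} raw_p={raw_p:.2e} adjusted_p={adjusted_p}")
```

Under these assumptions the uncorrected p-value is on the order of 1e-4, so once it is multiplied by several thousand tests the corrected value is capped at 1. A term with only 3 genes in the reference cannot do better than this even when every gene is observed, which is why larger terms are more informative.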