2.1 years ago
Adrian Pelin
I have protein interaction data telling me which host preys/proteins are targeted by a virus. I also have additional data that divides the list of targeted proteins into those responsible for a phenotype (n=319) and those that do not contribute to the phenotype (n=2118). How can I find out which pathways (e.g. KEGG) are enriched in the phenotype-associated gene list compared to the gene list not contributing to the phenotype? I have found a similar post here which suggests setting the union of both lists as the background. Do I then test both lists for enrichment against that background? Thanks for any advice.
You should be able to run a regular hypergeometric enrichment test with a few modifications to the standard setup. Your universe of genes will be the 2437 interacting genes (319 + 2118), your success cases the 319 contributing genes, and your failure cases the 2118 non-contributing genes. You also need to remove from each gene set you are testing any gene that is not among the 2437 interacting genes.
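To make the setup concrete, here is a minimal sketch of that test in plain Python. It is an illustration under stated assumptions, not a reference implementation: the function name `hypergeom_enrichment` and the toy gene IDs are made up, and the p-value is the one-sided upper tail of the hypergeometric distribution computed directly with `math.comb`. The pathway is trimmed to the universe before counting, as described above.

```python
from math import comb

def hypergeom_enrichment(pathway_genes, hits, universe):
    """One-sided hypergeometric test for over-representation.

    Hypothetical helper: returns (k, K, p) where
    k = pathway genes among the hits,
    K = pathway genes left after trimming to the universe,
    p = P(X >= k) when drawing len(hits) genes from the universe.
    """
    universe = set(universe)
    hits = set(hits) & universe
    # Drop pathway genes that are not in the universe of interacting genes.
    pathway = set(pathway_genes) & universe

    N = len(universe)        # e.g. 2437 interacting genes
    n = len(hits)            # e.g. 319 phenotype-contributing genes
    K = len(pathway)
    k = len(pathway & hits)

    # Upper tail: sum the probability of seeing k, k+1, ... pathway genes.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1))
    return k, K, tail / comb(N, n)

# Toy data (made up): 10 interacting genes, 4 of them contribute to the phenotype.
universe = [f"g{i}" for i in range(10)]
hits = ["g0", "g1", "g2", "g3"]
# "g10" is in the pathway annotation but not in the universe, so it is dropped.
pathway = ["g0", "g1", "g2", "g10"]

k, K, p = hypergeom_enrichment(pathway, hits, universe)
print(k, K, round(p, 4))
```

With the real data you would pass the 2437 interacting genes as `universe` and the 319 contributing genes as `hits`, then call the function once per pathway. If SciPy is available, `scipy.stats.hypergeom.sf(k - 1, N, K, n)` should give the same tail probability.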