Hello,
Let's say I did an RNA-seq experiment with control and drug, and I am interested in KEGG pathways that significantly respond to the drug.
I do the KEGG analysis with the kegga function (https://tinyurl.com/y7eszpdo) by passing a list of genes significantly affected by the drug as the de argument, indicating my species with the species argument, and passing a list of all detected genes as the universe argument. From this I get my list of significantly affected pathways, no problem.
What I don't get is how the significance of pathway expression is determined, and I have had a hard time googling the answer or finding a formula.
I am most interested in understanding how the number of significantly affected genes affects the significance of pathway expression.
For instance: if I have 100 genes significantly affected by the drug and 50 of them belong to my favorite KEGG pathway, would the expression of my favorite pathway be more significant than in the scenario where I have 1000 genes significantly affected by the drug and 50 of them belong to my favorite pathway?
Thank you for taking the time to read this.
I think it mostly comes down to a hypergeometric test, the same way GO term enrichment is done.
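To make the 100 vs. 1000 question concrete, here is a rough sketch of an upper-tail hypergeometric test in plain Python. It is not kegga's own code, only an illustration of the kind of test I believe it runs by default. The universe size (15000 detected genes) and the pathway size (200 genes) are numbers I made up, since the question doesn't give them, and the function name hypergeom_upper_tail is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, universe, pathway, de):
    """P(X >= k): chance of seeing at least k pathway genes among `de`
    genes drawn without replacement from `universe` genes, of which
    `pathway` belong to the pathway."""
    max_overlap = min(pathway, de)
    # Count the draws that give each possible overlap i >= k.
    favourable = sum(
        comb(pathway, i) * comb(universe - pathway, de - i)
        for i in range(k, max_overlap + 1)
    )
    # Divide by the total number of ways to draw `de` genes.
    return favourable / comb(universe, de)


# Assumed values, not taken from the question.
UNIVERSE = 15000  # all detected genes (hypothetical)
PATHWAY = 200     # genes annotated to the favorite pathway (hypothetical)

# Scenario 1: 100 DE genes, 50 of them in the pathway.
p_100 = hypergeom_upper_tail(50, UNIVERSE, PATHWAY, 100)
# Scenario 2: 1000 DE genes, 50 of them in the pathway.
p_1000 = hypergeom_upper_tail(50, UNIVERSE, PATHWAY, 1000)

print("100 DE genes, 50 in pathway: p =", p_100)
print("1000 DE genes, 50 in pathway: p =", p_1000)
```

With these assumed numbers, the first scenario gives the smaller p-value: about 1.3 pathway genes are expected by chance among 100 DE genes, versus about 13 among 1000, so 50 hits is a far bigger surprise in the shorter list. The exact p-values depend on the universe and pathway sizes you actually have.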