It's fairly trivial to check the significance of the overlap between two gene lists with Fisher's exact test in R (fisher.test()), which is widely accepted. Is there a good alternative that would also be able to incorporate ranks or weights for each gene?
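For reference, the unweighted baseline mentioned in the question can be sketched as below. The gene identifiers, list sizes, and the `universe` background are made up for illustration; only `fisher.test()` and `table()` are real base R calls.

```r
# Hypothetical background and gene lists; replace with your own identifiers.
universe <- paste0("gene", 1:1000)
list_a   <- universe[1:100]
list_b   <- universe[51:200]

# Membership of every background gene in each list.
in_a <- universe %in% list_a
in_b <- universe %in% list_b

# 2x2 contingency table: membership in A versus membership in B.
tab <- table(in_a = in_a, in_b = in_b)

# One-sided test: is the overlap larger than expected by chance?
res <- fisher.test(tab, alternative = "greater")
res$p.value
```

Note that every gene counts equally here, which is exactly the limitation the question is about.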
You want a number that measures the overlap between two lists but also accepts a weight for each gene? Or for each gene list? Why do you want it? Maybe there is another problem you are trying to solve the hard way...
To my knowledge you cannot, unless you are doing something along the lines of over-representation analysis of a gene set, which ideally needs a weight; that is what is used in gene set enrichment or GO analysis. Simply comparing two sets of genes drawn from a universe does not involve any other attribute associated with the genes, so it is a simple hypergeometric test. However, if you add attributes such as gene length or some other bias in the genes, then you will also have to perform some other test of your null hypothesis. So, to my knowledge, it will not be ideal unless you are doing some enrichment-based gene set analysis or over-representation analysis.
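To make the "simple hypergeometric test" concrete, here is a minimal sketch in base R. The counts (`N`, `m`, `n_b`, `k`) are invented numbers, not from any real data set. It also checks that the hypergeometric tail matches the one-sided Fisher's exact test on the corresponding 2x2 table.

```r
# Hypothetical counts; replace with your own.
N   <- 1000  # size of the background (universe)
m   <- 100   # genes in list A
n_b <- 150   # genes in list B
k   <- 50    # genes in both lists

# P(overlap >= k) when list B is drawn at random from the background.
p_hyper <- phyper(k - 1, m, N - m, n_b, lower.tail = FALSE)

# The same question as a 2x2 table for fisher.test().
tab <- matrix(c(k, m - k, n_b - k, N - m - n_b + k), nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

all.equal(p_hyper, p_fisher)
```

The result depends on `N`: with the same lists and overlap, a larger assumed background makes the overlap look less likely by chance.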
Sure. Maybe what I am asking for is over-representation. Is there a simple way to do that?
If your gene lists depend on an arbitrary cutoff, then weights would be a good compromise. For example, you can take the top 100 genes or the top 500 genes. If you take the top 500, the top 100 still have more confidence, so they should count for more.
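One way to let top-ranked genes count for more is a weighted overlap score with a permutation p-value. This is only a rough sketch of that idea, not an established method from this thread: the 1/rank weighting, the simulated lists, and the helper `weighted_overlap()` are all assumptions made for illustration.

```r
set.seed(1)

# Hypothetical background and a ranked list A (best gene first).
universe <- paste0("gene", 1:2000)
ranked_a <- universe[1:500]

# Assumed weighting: 1 / rank, so the top genes dominate the score.
weights <- setNames(1 / seq_along(ranked_a), ranked_a)

# Hypothetical list B: shares the top 40 genes of A, plus unrelated genes.
list_b <- c(universe[1:40], sample(universe[501:2000], 160))

# Score: sum of the weights of A's genes that also appear in B.
weighted_overlap <- function(b) sum(weights[intersect(names(weights), b)])

obs <- weighted_overlap(list_b)

# Null distribution: B replaced by random gene sets of the same size.
n_perm <- 2000
null <- replicate(n_perm, weighted_overlap(sample(universe, length(list_b))))

# Permutation p-value with the usual +1 correction.
p_value <- (1 + sum(null >= obs)) / (1 + n_perm)
p_value
```

The choice of weights is arbitrary and changes the result, so it is worth trying more than one scheme before trusting the p-value.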
For example, GSEA can analyze a full ranked gene list, but that's a full standalone package. I just want a simpler function that I could integrate into my workflow.
GSEA can be performed with a single function: see the fgsea function of the fgsea package in Bioconductor. Alternatively, to take the values of the list into account, have a look at the roast function of the limma package. See this recent post about different GSEA
If you select fewer genes but the same pathway is enriched, it will have a lower p-value, but the confidence is the same. In an over-representation test in particular, one important aspect is which background population you are considering.
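As a rough illustration of the fgsea suggestion, the call could look like the sketch below. The ranked statistics and the two gene sets are simulated placeholders, and the `minSize`/`maxSize` values are arbitrary. The call is guarded so the snippet still runs when the Bioconductor package is not installed.

```r
set.seed(42)

# Hypothetical ranked statistics: a named numeric vector, one value per gene.
gene_stats <- setNames(sort(rnorm(1000), decreasing = TRUE),
                       paste0("gene", 1:1000))

# Hypothetical gene sets: one concentrated at the top of the ranking, one random.
pathways <- list(
  top_heavy = paste0("gene", 1:30),
  random    = sample(names(gene_stats), 30)
)

# Run fgsea only if the package is available.
if (requireNamespace("fgsea", quietly = TRUE)) {
  res <- fgsea::fgsea(pathways = pathways, stats = gene_stats,
                      minSize = 15, maxSize = 500)
  print(as.data.frame(res)[, c("pathway", "pval", "padj", "NES")])
}
```

Because the whole ranked vector is used, no cutoff such as "top 100" or "top 500" has to be chosen.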