This is probably more of a philosophical question. I'm sure you could choose whichever library you want (e.g. Hallmark, Reactome, KEGG), but I'm curious whether there are known benefits of using one (or a few) over others. Are there circumstances where one (or a combination) would be better than another? For instance, I'd imagine that if you are answering a question relating to cancer, you would probably use Hallmark. Another question is how to interpret a process coming up as enriched with one library but not with another. I know the gene sets are not identical, but is there a better way to interpret this?
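To make the "gene sets are not identical" point concrete, one option is to measure how much the similarly named sets from two libraries actually overlap before comparing their enrichment results. The following is only a minimal sketch: the function names are hypothetical, and the two gene lists are made-up placeholders, not the real contents of any Hallmark or KEGG set.

```python
def jaccard(a, b):
    """Jaccard index of two gene collections (0 = disjoint, 1 = identical)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def compare_gene_sets(set_a, set_b):
    """Summarise shared and library-specific genes for two gene sets."""
    a, b = set(set_a), set(set_b)
    return {
        "shared": sorted(a & b),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "jaccard": jaccard(a, b),
    }


# Placeholder gene lists standing in for "the same" process in two libraries.
# ASSUMPTION: these memberships are invented for illustration only.
hallmark_like = ["TP53", "CDKN1A", "MDM2", "BAX"]
kegg_like = ["TP53", "MDM2", "ATM", "CHEK2", "BAX"]

result = compare_gene_sets(hallmark_like, kegg_like)
print("Jaccard:", result["jaccard"])
print("Shared:", result["shared"])
print("Only in first:", result["only_a"])
print("Only in second:", result["only_b"])
```

A low overlap would suggest that the two libraries are effectively testing different gene lists under a similar name, so a disagreement in enrichment is not necessarily a contradiction.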
I am using GSEA to identify which processes are active in my cells. GO BP is too broad and returns too many terms to be specific. Currently I'm using KEGG, Hallmark and maybe Reactome, but Reactome also returns a lot of terms. This is what ultimately led me to ask this question. Do you just pull out the results that help you tell your story? That seems very arbitrary to me.
By the way, I often see the H, C2 and C5 collections from MSigDB being tested in cancer papers. But in general, gene set enrichment tests are more of a smoke-and-mirrors situation than hard science, and the results should always be validated in the wet lab. There are lots of discussions on this forum regarding that, but of particular note in my opinion would be this one