Forum:Methodological problems are extremely common for enrichment analysis - beware the pitfalls before you publish
2.7 years ago
mark.ziemann ★ 1.9k

In my recent experience as a peer reviewer, I've noticed that enrichment analyses tend to suffer from problems in the statistics and in the reporting of what was actually done. So together with Kaumadi Wijesooriya and my team, we analysed a whole bunch of published articles to see how common these problems are.

The article was published online this week and the results were pretty staggering: fewer than 20% of articles were free of statistical problems, and very few described their methods in enough detail to be independently repeated.

So please be aware of these issues when you're using enrichment tools like DAVID, KOBAS, etc, as these pitfalls could lead to unreliable results.

GSEA DAVID enrichment-analysis -geneontology • 5.4k views

Looks great. I've been banging on about this for years!


Just adding this discussion from Gordon Smyth over at Bioconductor from this week (GSEA-related).

https://support.bioconductor.org/p/9142651/#9142680


"We find that 95% of analyses using over-representation tests did not implement an appropriate background gene list or did not describe this in the methods"

Scary... It's hard to believe that (up to) 95% of the researchers using an enrichment method were ignorant of best practice regarding background correction... This looks somewhat like an "I used the tool/parameters which gave me results" issue.


Thanks for sharing -- it was a good read.

2.7 years ago
dsull ★ 6.9k

"In the case of ORA for differential expression (eg: RNA-seq), a whole genome background is inappropriate because in any tissue, most genes are not expressed and therefore have no chance of being classified as DEGs. A good rule of thumb is to use a background gene list consisting of genes detected in the assay at a level where they have a chance of being classified as DEG"

This depends. I frequently use whole genome background and it is appropriate in many (albeit not all) cases. If you're doing a differential gene expression RNA-seq (or other type of whole-genome assay) study and are seeing one tissue type differentiating into another tissue type, I'd argue to use a whole-genome background. In this case, all genes have the capacity to be detected and if my, say, embryonic, tissue is differentiating into, say, kidney tissue, I'd want my pathway results to show that. If my background were just the union of genes that have TPM >= 1 in any samples, I may not see this effect because my background will contain genes expressed in either tissue type. This is especially true in tumorigenesis or cancer therapy studies where almost anything can happen.

On the other hand, if you're working on liver tissue and just making a small perturbation that affects the expression of a few liver enzymes, then yes, you'd want a "liver tissue" background, otherwise all your enriched pathways are going to be liver pathways (since your "list of genes that could potentially change" is completely biased towards liver-specific genes, whereas you should mostly be interested in which metabolic pathways those enzymes belong to). With a whole-genome background, your results aren't wrong per se (indeed, liver pathways are enriched when you perturb the expression of liver genes -- duh!), they're just not what you're looking for.

It's very situation-dependent and the choice of background is a lot trickier than one might think: it involves thinking carefully about your biological question, what exactly you're looking for, what you want your null model to be, how you want to interpret your results, etc.
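To make the effect of the background concrete, here is a minimal sketch of an over-representation test run twice on the same DE list, once against a whole-genome background and once against a detected-genes background. Everything in it is made up for illustration (the gene IDs, the list sizes, the `ora_pvalue` helper), and it uses a plain one-sided hypergeometric test rather than the exact statistic of any particular tool. It does not say which background is the right one, only how much the answer can move.

```python
from math import comb


def ora_pvalue(de_genes, gene_set, background):
    """One-sided hypergeometric p-value for over-representation.

    Genes outside the background are dropped from both lists first.
    """
    background = set(background)
    de = set(de_genes) & background
    members = set(gene_set) & background
    n_bg, n_members, n_de = len(background), len(members), len(de)
    overlap = len(de & members)
    # P(X >= overlap) when n_de genes are drawn without replacement
    # from n_bg genes, of which n_members belong to the gene set.
    tail = sum(
        comb(n_members, i) * comb(n_bg - n_members, n_de - i)
        for i in range(overlap, min(n_members, n_de) + 1)
    )
    return tail / comb(n_bg, n_de)


# Toy data (hypothetical): 20,000 annotated genes, 5,000 detected in the assay.
genome = [f"g{i}" for i in range(20000)]
detected = genome[:5000]
# A 200-gene set: 150 members are detected, 50 are not.
gene_set = genome[:150] + genome[5000:5050]
# 100 DE genes, all detected, 6 of them in the gene set.
de_genes = genome[:6] + genome[150:244]

p_genome = ora_pvalue(de_genes, gene_set, genome)
p_detected = ora_pvalue(de_genes, gene_set, detected)
print(f"whole-genome background:   p = {p_genome:.2g}")
print(f"detected-genes background: p = {p_detected:.2g}")
```

In this toy case the gene set is mostly made of detected genes, so it looks strongly enriched against the whole genome and much less so against the detected genes. Whether that difference is signal or artefact is exactly the biological judgment call described above.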

2.7 years ago

Great paper! We need to pay more attention to these things. Indeed, I may have been guilty of not describing the background used myself in the past (although I do always use one).

One thing that isn't mentioned here is correction for known biases, for example gene length bias in RNA-seq and variant burden analysis. If you are doing enrichment analysis, then you really should be using a method that corrects for gene length, like goseq. Another big bias is expression level: backgrounds in differential expression analyses aren't as simple as expressed or not, because different genes have different power to be detected as DE by nature of their expression level. Ever noticed how terms like "RNA metabolism" seem to come up in so many analyses? It's because these genes are so highly expressed.
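A rough sketch of the idea, assuming per-gene "detection weights" are already known: build the null by resampling DE lists with probability proportional to those weights instead of uniformly. This is not the goseq implementation. The weights, gene counts and helper names below are all hypothetical, and in practice the weights would have to be estimated from gene length or expression level.

```python
import random


def weighted_sample(weights, k, rng):
    """Draw k distinct indices with probability proportional to weight.

    Uses Efraimidis-Spirakis keys: the k largest u**(1/w) are selected.
    """
    keys = [(rng.random() ** (1.0 / w), i) for i, w in enumerate(weights)]
    keys.sort(reverse=True)
    return [i for _, i in keys[:k]]


def empirical_pvalue(observed, in_set, k, draw, n_resamples=1000):
    """Share of resampled DE lists whose overlap is >= observed (+1 correction)."""
    hits = 0
    for _ in range(n_resamples):
        overlap = sum(in_set[i] for i in draw(k))
        if overlap >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)


rng = random.Random(0)
n_genes, set_size, n_de, observed = 1000, 100, 100, 30
in_set = [i < set_size for i in range(n_genes)]
# Hypothetical weights: gene-set members are highly expressed and
# therefore 10x more likely to be called DE than other genes.
weights = [10.0 if flag else 1.0 for flag in in_set]

p_uniform = empirical_pvalue(
    observed, in_set, n_de, lambda k: rng.sample(range(n_genes), k)
)
p_weighted = empirical_pvalue(
    observed, in_set, n_de, lambda k: weighted_sample(weights, k, rng)
)
print(f"uniform null:  p = {p_uniform:.3g}")
print(f"weighted null: p = {p_weighted:.3g}")
```

With a uniform null, 30 of 100 DE genes falling in the set looks wildly significant. Once the null knows that those genes are easier to detect, the same overlap is unremarkable.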

On the other hand, FDR correction is not simple in enrichment analysis, and just applying BH seems to be overly conservative. This is because it assumes that gene sets are independent and, at least in GO, they are anything but. The nested hierarchical structure of GO makes a complete mockery of the independence assumption in most FDR corrections. Not that not correcting is really a solution either.
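For reference, this is roughly what "just applying BH" means in code: a minimal pure-Python sketch of the Benjamini-Hochberg step-up adjustment. The function name and the p-values are made up, and nothing here accounts for the overlap or nesting between gene sets.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for eight gene sets.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
for p, q in zip(pvals, benjamini_hochberg(pvals)):
    print(f"p = {p:<6} adjusted = {q:.4f}")
```

Each gene set is treated as a separate test here, so a parent GO term and its children count as several tests even though they share most of their genes.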

In the end because of the reasons in the article, and the ones above, I never draw conclusions from enrichment analysis, only generate hypotheses to be tested, and tend to regard conclusions drawn solely from GO analysis in the work of others with extreme skepticism, unless validated by a downstream experiment.


Let's also not forget about the fact that genesets contributed to MSigDB are most likely generated using vastly different filtering/data-QC as compared to your own methods. Comparability is, thus, limited at least in this fashion. It's all smoke and mirrors at this point :(

2.7 years ago

I believe that there are two interconnected issues that feed on one another. This is similar to what i.sudbery and dsull mention, but I'll reformulate it to be clearer, and I believe the problem is even more serious.

The first uncomfortable truth is that statistical methods cannot be relied upon in the first place. Classical statistics is not appropriate for enrichment studies. The known information deposited as ontology annotations is not a result of random sampling or natural processes - instead, it was driven by what scientists found interesting and "publishable". Just because we have a lot of data does not mean it is fair data. It is akin to wanting to create a fair die throw when all we have is a whole bunch of systematically unfair dice, and we don't know in what direction the unfairness goes.

Basically, it is about how wrong is still tolerable to you. This then ties into the second problem: there seems to be little to no penalty for doing an incorrect enrichment analysis. The person sticking to accuracy hamstrings themselves. So the "beware before you publish" is more of wishful thinking of how it should be but isn't.

i.sudbery has pointed out the correct course of action: one should only generate hypotheses that can then be tested again - basically, ensuring that the only person they can fool is themselves - but of course that costs twice (or more) as much.


Often, with a good handle on the biology and the question at hand, one can devise tests that are way cheaper (in both time and money) than the original experiment. For example: find that genes activated by long-term exposure to a drug activate the Wnt pathway, and hypothesise that this might be what is leading to resistance. Treat cells with a Wnt inhibitor - treated cells die, but untreated ones don't - takes at most a week, costs little to nothing, job's a good 'un.


That is a really good point that I forgot about: checking/validating is much cheaper/faster than de-novo discovery.


"Basically, it is about how wrong is still tolerable to you. This then ties into the second problem: there seems to be little to no penalty for doing an incorrect enrichment analysis. The person sticking to accuracy hamstrings themselves. So the 'beware before you publish' is more of wishful thinking of how it should be but isn't."

That is probably the best
