Do DESeq2 and edgeR account for increased total RNA production in one of the experimental groups?
3.7 years ago
cwwong13 ▴ 40

I know that the normalization schemes in both DESeq2 and edgeR account for RNA composition across samples. Recently, I revisited the seminal paper from Richard A. Young's group arguing that spike-in controls are necessary for normalization. I know that back in 2012 more people used FPKM or similar methods that account only for library depth during analysis, so I wonder whether the issue raised by the paper has been resolved by the DESeq2 or edgeR algorithms. I heard that both algorithms assume that only a small portion of the RNA is significantly perturbed. Is that true? Are there any algorithms that can computationally normalize for a shift in RNA composition like the one seen in Young's paper (c-Myc-induced overexpression of virtually all genes)?

Thanks!

RNA-Seq DESeq2 EdgeR • 1.6k views

See this Nature (rebuttal) paper https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4110711/ arguing why (from a biological standpoint) the DESeq method should still be used when there's global transcriptional amplification by c-Myc.

3.7 years ago

The standard normalization in both edgeR and DESeq2 assumes an average log fold change of zero. RNA-seq is inherently a relative approach: it tells you what percentage of the RNA in a cell belongs to any given transcript. It is thus not possible to detect concerted changes that apply equally across all genes. In the case of perfect global regulation, detecting the change is a mathematical impossibility. Where the changes are not completely global, there might be some things you can do.
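To make the "relative approach" point concrete, here is a minimal sketch in plain Python. The gene amounts are made-up numbers and `proportions` is a hypothetical helper, not part of edgeR or DESeq2. It shows that doubling every gene leaves the relative composition unchanged, whereas changing a single gene does not.

```python
def proportions(amounts):
    """Convert absolute transcript amounts into the relative fractions
    that sequencing effectively samples from."""
    total = sum(amounts)
    return [a / total for a in amounts]

# Hypothetical absolute transcript amounts per cell for four genes.
control = [100.0, 200.0, 300.0, 400.0]

# Perfect global amplification: every gene doubles.
amplified = [2.0 * a for a in control]

# Non-global shift: only the first gene doubles.
partial = [200.0, 200.0, 300.0, 400.0]

# The relative composition is identical after a perfectly global change.
same_as_control = all(
    abs(p - q) < 1e-12
    for p, q in zip(proportions(control), proportions(amplified))
)
print(same_as_control)

# A non-global change alters the fractions of every gene, including unchanged ones.
print(proportions(partial))
```

Note that in the `partial` case the three unchanged genes also get smaller fractions, which is the composition effect that the normalization methods try to correct.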

For example, DESeq2 allows you to nominate a set of genes you believe haven't changed, and it will base its normalization on those.
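In DESeq2 itself this is done, as far as I know, through the `controlGenes` argument of `estimateSizeFactors`. The idea can be sketched in plain Python as below. This is a simplified median-of-ratios calculation written for illustration, not the DESeq2 implementation; `size_factors` and the toy counts are assumptions.

```python
import math
from statistics import median

def size_factors(counts, control_genes=None):
    """Simplified median-of-ratios size factors.

    counts[g][s] is the count for gene g in sample s.
    If control_genes (a list of row indices) is given, only those genes are used.
    """
    rows = range(len(counts)) if control_genes is None else control_genes
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for g in rows:
        row = counts[g]
        if min(row) <= 0:
            continue  # a zero count has no finite log, so skip the gene
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for s, c in enumerate(row):
            log_ratios[s].append(math.log(c) - log_geo_mean)
    return [math.exp(median(r)) for r in log_ratios]

# Toy data: genes 0-2 are truly up 4-fold in sample 2, genes 3-4 are unchanged.
counts = [
    [10, 40],
    [20, 80],
    [30, 120],
    [40, 40],
    [50, 50],
]

default_sf = size_factors(counts)                        # uses all genes
control_sf = size_factors(counts, control_genes=[3, 4])  # uses the nominated genes only
print(default_sf, control_sf)
```

With all genes, the majority that changed drives the size factors, so the 4-fold increase would be normalized away. Restricting the calculation to the two nominated genes gives equal size factors, and the increase survives normalization.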

There are some papers on this and attempts to deal with (non-perfect) global shifts:

https://academic.oup.com/biostatistics/article/19/2/185/3949169#113055100

https://genomebiology.biomedcentral.com/articles/10.1186/s13059-015-0679-0

EDIT (see comment below): edgeR assumes that EITHER the median log fold change is zero OR most genes are not DE, and it can handle up to ~15% deviation from this assumption with the default normalisation, or ~20% with an alternative normalisation.


I can't speak for DESeq2, but edgeR does not make any assumption about the average fold change. edgeR can, for example, accommodate unbalanced DE whereby 15% of genes are up-regulated while none are down-regulated. TMM works on the log-fold-change scale and computes robust means rather than arithmetic means. The assumptions of TMM are that either

  1. most (~80%) of the genes are not DE, or
  2. the differential expression is balanced

Only one of these conditions needs to be satisfied rather than both. As a very rough rule, TMM should work reasonably well if abs( %up - %down) is less than about 15%, where %up and %down are the global proportions of up and down DE genes.
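The robustness of a trimmed mean to unbalanced DE can be illustrated with a toy sketch. This is not the TMM implementation: real TMM uses precision weights and also trims on average abundance, and the simulated log-ratios here ignore the composition effect on the unchanged genes. The 30% trim is an assumption chosen to mirror what I believe is edgeR's default `logratioTrim`; `trimmed_mean` and `toy_log_ratios` are hypothetical helpers.

```python
def trimmed_mean(values, trim=0.3):
    """Mean of the values left after dropping the fraction `trim` from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

def toy_log_ratios(n_genes, frac_up, log2_fc=2.0):
    """Log2 ratios for a toy experiment: a fraction of genes is up, none are down."""
    n_up = round(n_genes * frac_up)
    return [log2_fc] * n_up + [0.0] * (n_genes - n_up)

mild = toy_log_ratios(100, 0.15)    # 15% up, 0% down
severe = toy_log_ratios(100, 0.40)  # 40% up, 0% down

# Arithmetic mean is pulled away from zero even by mild one-sided DE.
print(sum(mild) / len(mild))

# The trimmed mean discards the DE genes in the mild case ...
print(trimmed_mean(mild))

# ... but not all of them when the imbalance exceeds the trimmed fraction.
print(trimmed_mean(severe))
```

In the mild case every up-regulated gene falls inside the trimmed upper tail, so the estimate stays at zero. In the severe case some DE genes survive trimming and bias the estimate, which is the kind of breakdown the rule of thumb warns about.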

We showed long ago (Oshlack et al 2007) that loess normalization can tolerate up to 20% unbalanced DE well but starts to break down at the 25% point. TMM is not quite so aggressive as loess but makes the same sort of assumptions.

Reference

Oshlack, A., Emslie, D., Corcoran, L., and Smyth, G. K. (2007). Normalization of boutique two-color microarrays with a high proportion of differentially expressed probes. Genome Biology 8, R2. https://pubmed.ncbi.nlm.nih.gov/17204140/


I apologize - I was indeed relaying conversations I'd had with Mike Love, so that presumably applies to DESeq rather than edgeR. Funny thing was, before my conversation with Mike, I had known about the "most genes are not DE" assumption, but he told me that DESeq doesn't have this assumption!
