Question

normalizing multiple conditions together or not

8 months ago

markus.glass ▴ 40

Hi everyone,

I'm having trouble deciding whether I should normalize samples from multiple conditions together or separately.

The situation is as follows: I've got 9 RNA-seq libraries from 3 conditions, a, b and c (each in triplicate). I'm interested in differential gene expression for a vs b and a vs c.

Usually, I normalize all samples together using edgeR and the TMM method, then apply exactTest(a,b) and exactTest(a,c). That way, both comparisons share the same normalized expression values, which I can also use for, e.g., cluster analyses. Furthermore, I thought that variances are better estimated using more samples.

However, FDR (BH) adjusted p-values tend to be far worse when I normalize all samples together, even though for DE testing I only use the samples from the respective 2 groups in a pairwise test.

Could anyone briefly explain why this is the case or if I made a logical mistake here? Thanks in advance!

normalization edgeR • 658 views


Can we just get a bit more information on the nature of the data?

Are there different tissues in the data set?
Is the data coming from different labs?
Are there batch effects or other sources of unwanted variation?

From what you have given so far, this might help: within/across sample/dataset normalisation. You may already know it all, but it helped me. If you want to compare absolute values between two samples, you need across-sample normalisation, e.g. TMM (edgeR) or vst/rlog (DESeq2).

8 months ago by BioinfGuru ★ 2.1k

Hi BioinfGuru

regarding your questions:

no different tissues
all from the same lab
no obvious batch effects, all samples prepared and sequenced together

My question is: why do adjusted p-values differ so much when I normalize all 9 samples together and then test, e.g., a vs b, compared to normalizing only the 6 a and b samples together and then performing the DE test? And which strategy should be used?

8 months ago by markus.glass ▴ 40
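
For reference, the two strategies compared in this thread could be sketched in edgeR roughly as below. This is a minimal sketch, not the poster's actual script: the count matrix is simulated purely for illustration, the object names (counts, y_all, y_ab, res_all, res_ab) are made up here, and the use of estimateDisp() for dispersion estimation is an assumption, since the thread does not say which dispersion function was used.

```r
library(edgeR)

set.seed(1)

# Simulated counts for illustration only: 1000 genes x 9 libraries
# (3 conditions a, b, c in triplicate, as described in the question).
n_genes <- 1000
group <- factor(rep(c("a", "b", "c"), each = 3))
counts <- matrix(
  rnbinom(n_genes * 9, mu = 100, size = 10),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0(group, rep(1:3, times = 3)))
)

# Strategy 1: TMM normalization and dispersion estimation on all 9 libraries,
# then a pairwise exact test for a vs b.
y_all <- DGEList(counts = counts, group = group)
y_all <- calcNormFactors(y_all, method = "TMM")
y_all <- estimateDisp(y_all)
et_all <- exactTest(y_all, pair = c("a", "b"))
res_all <- topTags(et_all, n = Inf, adjust.method = "BH", sort.by = "none")$table

# Strategy 2: the same steps, restricted to the 6 libraries from a and b.
keep <- group %in% c("a", "b")
y_ab <- DGEList(counts = counts[, keep], group = droplevels(group[keep]))
y_ab <- calcNormFactors(y_ab, method = "TMM")
y_ab <- estimateDisp(y_ab)
et_ab <- exactTest(y_ab, pair = c("a", "b"))
res_ab <- topTags(et_ab, n = Inf, adjust.method = "BH", sort.by = "none")$table

# Side-by-side diagnostics: normalization factors for the a and b libraries,
# common dispersion, and the number of genes below FDR 0.05 in each strategy.
cbind(all9 = y_all$samples$norm.factors[keep],
      ab_only = y_ab$samples$norm.factors)
c(all9 = y_all$common.dispersion, ab_only = y_ab$common.dispersion)
c(all9 = sum(res_all$FDR < 0.05), ab_only = sum(res_ab$FDR < 0.05))
```

Note that in the first strategy the dispersions are estimated from all 9 libraries, so the c samples contribute to the dispersion used in the a vs b test even though only a and b are compared. Comparing the normalization factors and dispersions between the two runs on the real data may help show where the difference in adjusted p-values comes from.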