Question

RNA-seq differential expression masked by pathway dysregulation

0


22 months ago

Gama313 ▴ 130

I am working on a 20-sample dataset and need to identify DE genes. 10 samples were collected at collection center (CC) 1 and 10 at CC2. Each center has 5 samples of each condition (condition1 and condition2). Unfortunately, when I tested DE for CC1 vs CC2 only, several survival pathways were upregulated in the CC2 samples. This is because CC2 sent us its samples later than CC1 (in terms of days). When I tested DE for condition1 vs condition2 (using CC as a covariate), I could not detect any DE genes (FDR 0.05). From the PCA I see that the differences between CCs are far stronger than those between condition1 and condition2. To my knowledge, both ComBat and removeBatchEffect (limma) destroy biological variability, so I am not confident about using them.

My question is: what to do in this setup?

normalization batch RNA-seq • 1.1k views

ADD COMMENT • link updated 4 months ago by Ram 44k • written 22 months ago by Gama313 ▴ 130

score 0 · Answer 1 · 2023-02-02

0


22 months ago

ATpoint 85k

Tools like DESeq2, edgeR and limma recommend in their manuals including covariates such as batch (here, the center) in the model. Try that and see how it goes. If you want more feedback, please show code, data and plots.
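To illustrate why putting the center in the model matters, here is a minimal sketch in plain Python. It is not the DESeq2, edgeR or limma implementation. It uses made-up log-expression values for a single hypothetical gene in a balanced 2 x 2 layout like the one described in the question (5 samples per center and condition). The effect sizes, the noise pattern and the helper name `condition_t` are all assumptions for illustration. The shortcut of building fitted values from marginal means is only valid because the toy design is balanced.

```python
from math import sqrt
from statistics import mean

# Hypothetical toy data for ONE gene (log2 scale): a large center effect (3.0),
# a small condition effect (0.5), and a fixed noise pattern within each cell.
noise = [-0.2, -0.1, 0.0, 0.1, 0.2]
samples = []
for center, center_effect in (("CC1", 0.0), ("CC2", 3.0)):
    for condition, condition_effect in (("condition1", 0.0), ("condition2", 0.5)):
        for e in noise:
            value = 8.0 + center_effect + condition_effect + e
            samples.append((center, condition, value))


def condition_t(samples, adjust_for_center):
    """t-statistic for condition2 vs condition1 (balanced design only)."""
    y = [v for _, _, v in samples]
    grand = mean(y)
    cond_mean = {k: mean(v for _, c, v in samples if c == k)
                 for k in ("condition1", "condition2")}
    center_mean = {k: mean(v for c, _, v in samples if c == k)
                   for k in ("CC1", "CC2")}
    effect = cond_mean["condition2"] - cond_mean["condition1"]

    if adjust_for_center:
        # Additive model: expression ~ center + condition (3 coefficients).
        fitted = [center_mean[c] + cond_mean[k] - grand for c, k, _ in samples]
        n_params = 3
    else:
        # Model ignoring center: expression ~ condition (2 coefficients).
        fitted = [cond_mean[k] for _, k, _ in samples]
        n_params = 2

    rss = sum((v - f) ** 2 for v, f in zip(y, fitted))
    sigma2 = rss / (len(y) - n_params)       # residual variance
    n_per_condition = len(y) / 2
    se = sqrt(sigma2 * 2 / n_per_condition)  # standard error of the difference
    return effect / se


t_without_center = condition_t(samples, adjust_for_center=False)
t_with_center = condition_t(samples, adjust_for_center=True)
print(f"t without center: {t_without_center:.2f}")
print(f"t with center:    {t_with_center:.2f}")
```

In this toy example the estimated condition effect is the same in both fits. Only the residual variance changes, because the center term absorbs the between-center differences that would otherwise be counted as noise.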

ADD COMMENT • link 22 months ago by ATpoint 85k

0


Thanks for the answer. However, I am not sure this can be considered a batch effect, since the CC2 samples show a true upregulation of specific pathways that could (in principle) buffer the variation when condition1 vs condition2 is tested.

ADD REPLY • link 22 months ago by Gama313 ▴ 130

0


If this is the case, then you cannot correct for anything, as this effect is nested within center.
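One way to read the nesting argument is in terms of the design matrix. The sketch below assumes a hypothetical `delay` indicator that is 1 for every CC2 sample and 0 for every CC1 sample. That column is identical to the center column, so the matrix loses rank and the two effects cannot be estimated separately. The `matrix_rank` helper is a small illustrative Gaussian elimination, not a library function.

```python
def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix given as a list of rows, via Gaussian elimination."""
    m = [[float(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry here.
        pivot = next((r for r in range(rank, len(m)) if abs(m[r][col]) > tol), None)
        if pivot is None:
            continue  # this column adds no new information
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Balanced layout from the question: 2 centers x 2 conditions x 5 samples.
layout = [(center, condition)
          for center in (0, 1)
          for condition in (0, 1)
          for _ in range(5)]

# Columns: intercept, center (1 = CC2), condition (1 = condition2).
design = [[1, center, condition] for center, condition in layout]

# Same design plus a hypothetical 'delay' column that equals the center column.
design_with_delay = [[1, center, condition, center] for center, condition in layout]

rank_design = matrix_rank(design)
rank_with_delay = matrix_rank(design_with_delay)
print(rank_design, "of", len(design[0]), "columns are independent")
print(rank_with_delay, "of", len(design_with_delay[0]), "columns are independent")
```

The center and condition columns on their own are full rank here, so a center covariate is estimable. The rank only drops once a second effect that coincides exactly with center is added.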

ADD REPLY • link 22 months ago by ATpoint 85k

0


If you really do have a reason to suspect that the biology of the samples from CC1 is different from the biology of the samples from CC2, and that CC1 is the correct biology and CC2 the wrong biology, then there may be an argument for discarding the samples from CC2 and conducting the analysis on the samples from CC1 only.

ADD REPLY • link 22 months ago by i.sudbery 20k

0


That's exactly what I've done. However, the dispersion is really high (primary samples), and the total number of samples seems too low to obtain results with good confidence.

ADD REPLY • link 22 months ago by Gama313 ▴ 130