Running DESeq2 with all samples vs. only the samples for each comparison
10 months ago

Hello,

I am currently using DESeq2 to perform differential expression analysis on my data. I have feature count data for 24 samples and 4 comparisons to do (for each comparison I have 3 control samples and 3 knockout samples).

The issue is: when I run DESeq2 on all of the samples, specifying all 4 comparisons, I get very different fold changes and p-values than when I split the feature count data into 4 files, one per comparison, and run DESeq2 on each comparison separately.

One thing I noticed is that the p-value distribution peaks at around 0 when running with the data for a single comparison, but peaks at around 1 when using the data for all of the samples (4 comparisons).

Why does this cause such a big difference, and what is the best practice I should follow for these analyses?

Thank you

Deseq2 DGE
10 months ago

In general, it's preferable to include all your samples in the dds object, for better size normalization and dispersion estimates. But if PCA shows that your sample groups differ widely then sometimes this doesn't make sense.


Thank you for your reply. The cells in each group differ in both cell type and tissue of origin, so the groups are very far apart in the PCA. Do you think this could explain the huge discrepancy I am seeing? Can you recommend any literature on this topic? Thank you.


I would not include samples of different tissue types together. The assumptions underlying library size normalization might be violated.
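The normalization concern can be sketched with a median-of-ratios size factor calculation, the method DESeq2 uses by default. It relies on most genes not differing between samples. This is a minimal pure-Python illustration, not DESeq2's implementation; the function name `size_factors`, the sample names, and the counts are all hypothetical.

```python
from math import exp, log
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping sample name -> list of raw counts (same gene order).
    Genes with a zero count in any sample are skipped, because their
    geometric mean across samples would be zero.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Log geometric mean per usable gene, taken across all samples.
    log_geo_means = []
    for g in range(n_genes):
        vals = [counts[s][g] for s in samples]
        if all(v > 0 for v in vals):
            log_geo_means.append((g, sum(log(v) for v in vals) / len(vals)))
    # Size factor = median ratio of a sample's counts to the geometric means.
    return {
        s: exp(median(log(counts[s][g]) - lgm for g, lgm in log_geo_means))
        for s in samples
    }


# Case 1: same biology, second sample sequenced twice as deeply.
depth_only = size_factors({
    "rep1": [100, 200, 300, 400, 500],
    "rep2": [200, 400, 600, 800, 1000],
})

# Case 2: equal depth on two unchanged genes, but three of five genes are
# genuinely 4x higher in the second tissue (made-up numbers).
mixed_tissue = size_factors({
    "tissueA": [100, 100, 100, 100, 100],
    "tissueB": [100, 100, 400, 400, 400],
})

print(depth_only)
print(mixed_tissue)
```

In the first case the ratio of the two factors is 2, matching the depth difference. In the second case the ratio is 4: because most genes differ, the median treats the biological shift as a depth effect, and the two unchanged genes would look 4x lower in tissue B after normalization.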
