Pairwise comparison in DESeq2
7.7 years ago
EVR ▴ 610

Hi,

I have RNA-seq data collected at different time points. For every time point I have 4 samples (Control, Knock-down_1, Knock-down_2, Knock-down_3), and I want to compare each knock-down sample to its control. Since DESeq2 compares two sets of samples to identify differentially expressed genes, how should the analysis be carried out?

 a) Include all samples from this particular time point and then use the contrast function to find the differentially expressed genes between specific samples
                                   OR
 b) Include only the two samples to be compared and run the analysis on those alone.

Thanks in advance

RNA-Seq DESeq2 • 2.2k views
7.7 years ago

The correct answer is "c) Include all samples from all time points", because it will give you the best gene-level dispersion estimate.

There is a great tutorial here that explains how to do a time-course analysis with mutants in DESeq2.


Thanks for your comment. But I am not comparing the samples of one time point with those of another; I am comparing samples within a time point, so why include samples from other time points? Won't they influence the values of the other samples? For example, is it reasonable for the counts of the day 7 samples to influence the counts of the day 1 samples?


As Carlo said, "because it will give you the best gene-level dispersion estimate". Fit the model using all of your data and you will get a better estimate of the mean-variance trend for any given gene. With that estimate, you can assess differences between your experimental arms at any given time point more reliably than if you analysed only the samples from that time point.
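To make the point about noise estimation concrete, here is a minimal Python sketch. It is not DESeq2 and does not use its negative binomial model: it assumes normally distributed values with one shared variance, and it assumes 3 replicates per group, which the original question does not state. All means and sizes are made up for illustration. It compares how much a pooled within-group variance estimate fluctuates when it is computed from one time point versus five.

```python
import random
import statistics


def pooled_variance(groups):
    """Residual variance pooled over groups; each group keeps its own mean."""
    sum_sq = 0.0
    dof = 0
    for values in groups:
        centre = statistics.mean(values)
        sum_sq += sum((v - centre) ** 2 for v in values)
        dof += len(values) - 1
    return sum_sq / dof


def simulate_gene(n_timepoints, rng, n_conditions=4, n_reps=3, sigma=1.0):
    """One gene: every (time point, condition) group has its own mean (signal)
    but all groups share the same noise level sigma. Values are hypothetical."""
    groups = []
    for t in range(n_timepoints):
        for c in range(n_conditions):
            group_mean = 5.0 + 2.0 * t + 0.5 * c  # arbitrary illustrative means
            groups.append([rng.gauss(group_mean, sigma) for _ in range(n_reps)])
    return groups


def estimator_spread(n_timepoints, n_sims=2000, seed=0):
    """Standard deviation of the variance estimate across repeated experiments."""
    rng = random.Random(seed)
    estimates = [
        pooled_variance(simulate_gene(n_timepoints, rng)) for _ in range(n_sims)
    ]
    return statistics.stdev(estimates)


spread_one = estimator_spread(1)   # only the time point of interest
spread_five = estimator_spread(5)  # all time points in one model
print(f"spread of variance estimate, 1 time point:  {spread_one:.3f}")
print(f"spread of variance estimate, 5 time points: {spread_five:.3f}")
```

Because each group is centred on its own mean before the residuals are pooled, the extra time points add degrees of freedom for estimating the noise without shifting any group's mean.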


Thank you, russhh. I understand that using all samples from all time points gives a better gene-level estimate. But there is one thing I still don't understand. For example, won't the actual expression (raw counts) of gene x at 3 hours be affected by the actual expression (raw counts) of the same gene x at day 7?


The expression estimates will be unaffected if you include all time points, but you will be more accurate when assessing the significance of a difference in expression.

That holds as long as your model takes into account time and the interaction between time and strain. If you don't include time in the model, your time points will be treated as replicates and the "expression" will be affected.
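A small Python sketch of this distinction, under stated assumptions: the numbers are invented, there are two replicates per group (not stated in the question), and plain arithmetic means stand in for the fitted values of a model that has a separate term for every time and strain combination. This is an illustration of the idea, not what DESeq2 computes internally.

```python
from statistics import mean

# Hypothetical normalized expression of one gene: (time, strain) -> replicates
data = {
    ("3h", "control"): [10.0, 12.0],
    ("3h", "kd1"): [20.0, 22.0],
    ("day7", "control"): [50.0, 54.0],
    ("day7", "kd1"): [51.0, 55.0],
}


def effect_with_time(table, time):
    """Time and time:strain in the model: each group is fitted by its own
    mean, so the knock-down effect at one time point uses only that time
    point's values."""
    return mean(table[(time, "kd1")]) - mean(table[(time, "control")])


def effect_ignoring_time(table):
    """Strain-only model: samples from every time point are pooled as if
    they were replicates of the same condition."""
    kd = [v for (_, strain), vals in table.items() if strain == "kd1" for v in vals]
    ctrl = [v for (_, strain), vals in table.items() if strain == "control" for v in vals]
    return mean(kd) - mean(ctrl)


# Analysing the 3h samples on their own gives the same effect as the full model
only_3h = {key: vals for key, vals in data.items() if key[0] == "3h"}

print("3h effect, all time points in the model:", effect_with_time(data, "3h"))
print("3h effect, 3h samples only:", effect_with_time(only_3h, "3h"))
print("effect when time is left out of the model:", effect_ignoring_time(data))
```

With time in the model, the day 7 values never enter the 3 h comparison; leaving time out blends the two time points into a single estimate that describes neither.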


Admittedly, the counts at day 3 and day 7 will be statistically and biologically dependent, but how to account for that dependence is not the question you originally posed. Data from any quantitative experiment can be viewed as comprising signal and noise. You'd hope that, although there may be dependence between the fitted values for your different samples, the noise is uncorrelated between those samples. It is your ability to estimate the amount of noise that improves when you include all of your time points.
