Hi,
I have an experiment in which I treated a cell line with two drugs, A and B, their combination AB, and a control C. So I have 4 treatments and also 3 time points: 3, 9 and 24 hours. For each treatment and time point (e.g. A_3) I have 3 replicates. My goal is a dynamic analysis: at each time point (3, 9, 24) and for each treatment (A, B, AB), I want to find DE genes with respect to control C at that time point.
I have two options:
(1) Run the whole voom-limma pipeline (including calcNormFactors and the mean-variance trend) separately for each comparison (i.e. A_3 - C_3 etc.) and get DE genes.
(2) Combine all the data into one matrix, run the voom-limma pipeline on the whole matrix once, then run contrasts.fit for each comparison.
When I run both ways and compare the results, the correlation between the t-statistics (column 3 of the topTable output) is 0.98. However, the magnitude varies significantly: the same gene has a t-statistic of 39 in (2) vs 25 in (1), and the p-values differ similarly. Rank-wise they look OK, but effect-wise they are different. I am not sure whether I gain or lose power by applying (2), or which way is preferable. Any help is appreciated.
Thanks
You have raised some important points. The ranking does not change much, but the magnitude of the statistics has changed. That's because pooling all of the data increases the residual degrees of freedom for fitting the linear model and makes the variance estimation used for shrinkage more accurate.
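To make the degrees-of-freedom point concrete, here is a minimal sketch in plain Python. It is not limma: it uses ordinary (unmoderated) t-statistics on simulated log-expression values for a single hypothetical gene, with no voom weights, and all names and numbers are made up for illustration. It only shows that the A_3 vs C_3 comparison keeps the same numerator in both approaches, while the variance estimate and its residual degrees of freedom change.

```python
import math
import random
from statistics import mean, variance

random.seed(0)  # simulated data only; values are illustrative assumptions

treatments = ["C", "A", "B", "AB"]
times = [3, 9, 24]
n_rep = 3  # 3 replicates per treatment/time group

# One hypothetical gene: baseline 5.0, shifted by +2 in A at 3 h
data = {}
for trt in treatments:
    for t in times:
        shift = 2.0 if (trt, t) == ("A", 3) else 0.0
        data[(trt, t)] = [random.gauss(5.0 + shift, 0.5) for _ in range(n_rep)]

def pooled_variance(groups):
    """Residual variance of a group-means model and its degrees of freedom."""
    ss = sum((len(g) - 1) * variance(g) for g in groups)
    df = sum(len(g) - 1 for g in groups)
    return ss / df, df

def t_stat(x, y, s2):
    """Ordinary t-statistic for mean(x) - mean(y) given a variance estimate."""
    se = math.sqrt(s2 * (1 / len(x) + 1 / len(y)))
    return (mean(x) - mean(y)) / se

a3, c3 = data[("A", 3)], data[("C", 3)]

# Option (1): variance estimated from the two groups being compared
s2_sep, df_sep = pooled_variance([a3, c3])
# Option (2): variance estimated from all 12 groups in one fit
s2_all, df_all = pooled_variance(list(data.values()))

t_sep = t_stat(a3, c3, s2_sep)
t_all = t_stat(a3, c3, s2_all)

print(f"separate fit: t = {t_sep:.2f} on {df_sep} residual df")
print(f"combined fit: t = {t_all:.2f} on {df_all} residual df")
```

The separate fit has 4 residual degrees of freedom and the combined fit has 24, so the two t-statistics differ only through the variance estimate, and either one can come out larger for a given gene.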
Thanks for the reply. My intuition was the same, but what confused me more is that the t-statistics in case (2) are much higher than in (1), when I expected the opposite. What's your intuition on that?
It is hard to say, because increasing n does not necessarily lead to a decrease in the moderated t-statistics.
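For reference, the form of limma's moderated t-statistic helps show why the direction is not predictable. The notation below is the usual one from the limma documentation: g indexes genes, j the contrast, s_g^2 is the residual variance with d_g degrees of freedom, and s_0^2 and d_0 are the prior variance and prior degrees of freedom estimated from all genes.

```latex
\tilde{t}_{gj} = \frac{\hat{\beta}_{gj}}{\tilde{s}_g \sqrt{v_{gj}}},
\qquad
\tilde{s}_g^{2} = \frac{d_0 s_0^{2} + d_g s_g^{2}}{d_0 + d_g}
```

Under the null hypothesis this statistic follows a t-distribution with d_0 + d_g degrees of freedom. Moving from separate fits to one combined fit changes s_g^2 and d_g, and the prior values s_0^2 and d_0 are re-estimated as well, so the denominator can shrink or grow for any particular gene. With voom, the precision weights also enter through v_{gj}, which is another place the two pipelines can differ.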
Hello ea1402!
It appears that your post has been cross-posted to another site: https://support.bioconductor.org/p/88905/
This is typically not recommended as it runs the risk of annoying people in both communities.