Hi,
I am analyzing publicly available microarray data, using the log2 fold-change values already uploaded to GEO for GSE9776. These are two-channel experiments with 17 isolates and 6 conditions; I'm interested in the first 6 isolates under 2 conditions: antibiotic at 2 hours ("INH 2hr") and at 6 hours ("INH 6hr"). All arrays have the same control (water) in Cy3.
I created a design matrix with INH2hr as the intercept and INH6hr as the first coefficient, with the remaining coefficients assigned to conditions I'm not interested in. My understanding is that I should leave these other conditions in the model, as the variance in those samples is important in the calculation.
group <- factor(GSE9776@phenoData@data$source_name_ch1)
design <- model.matrix(~group)
design
colnames(design) <- c("INH2hr", "INH6hr", "KatG_ko", "INH_nutrient", "INH_O2", "hollow_fbr")
fit_GSE9776 <- lmFit(GSE9776_filtered, design)
Based on my design, INH2hr is going to be the intercept, and each of the other conditions will be assigned a coefficient.
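To check what this parameterization actually estimates, here is a minimal base-R sketch. The labels and toy values are hypothetical stand-ins (the real source_name_ch1 strings on GEO may differ), and only three conditions are shown. It assumes the expression values are log-ratios against the common reference.

```r
# Hypothetical labels standing in for source_name_ch1; the real GEO strings may differ.
labels <- c("INH 2hr", "INH 2hr", "INH 6hr", "INH 6hr", "KatG ko", "KatG ko")
group <- factor(labels)

# Levels are sorted alphabetically by default; the first level becomes the intercept.
levels(group)

# Set the baseline explicitly instead of relying on the sort order.
group <- relevel(group, ref = "INH 2hr")
design <- model.matrix(~group)
colnames(design)

# Toy log-ratios for one gene (made-up values): group means are 1, 3 and 0.
y <- c(1, 1, 3, 3, 0, 0)
coefs <- lm.fit(design, y)$coefficients
coefs
```

With these toy values the intercept comes out as the INH 2hr mean (1), and the second coefficient as the INH 6hr mean minus the INH 2hr mean (2). So in a design with an intercept, the INH6hr column estimates a difference from the baseline condition rather than INH 6hr on its own.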
I'd like to look at the differences in gene expression across the following comparisons:
1. INH 2hr vs control
2. INH 6hr vs control
3. INH 6hr vs INH 2hr
Based on my reading of the vignette, to obtain INH6hr vs control I should use coefficient = "INH6hr". How do I obtain INH6hr vs INH2hr and INH2hr vs ref? Should I be using a contrast matrix? If so, how?
Thanks in advance,
Regards,
Husain
What is your control group in
colnames(design) <- c("INH2hr", "INH6hr", "KatG_ko", "INH_nutrient", "INH_O2", "hollow_fbr")
Is it INH2hr?
Since these are two-channel experiments, the control is the other channel. There is a common reference control on all the arrays, and that's my source of confusion. 10.1 in the limma vignette recommends treating these as a single-channel experiment.
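One way to set this up for a common-reference design, sketched here on simulated data rather than the real GSE9776 matrix: if the values are log-ratios against the reference, a design without an intercept makes each coefficient the condition-vs-reference effect, and a contrast matrix gives the comparison between conditions. The group labels, the "other" level (standing in for the conditions kept only for variance estimation) and the matrix M are assumptions made for illustration.

```r
library(limma)

set.seed(1)
# Simulated log-ratios (sample vs common reference): 100 genes x 6 arrays.
# Hypothetical stand-in for the GSE9776 expression matrix.
group <- factor(rep(c("INH2hr", "INH6hr", "other"), each = 2))
M <- matrix(rnorm(100 * 6), nrow = 100,
            dimnames = list(paste0("gene", 1:100), paste0("array", 1:6)))

# No-intercept design: each coefficient is that condition's mean log-ratio,
# i.e. condition vs reference.
design <- model.matrix(~0 + group)
colnames(design) <- levels(group)

fit <- lmFit(M, design)

# The three comparisons of interest expressed as contrasts.
cont <- makeContrasts(
  INH2hr_vs_ref    = INH2hr,
  INH6hr_vs_ref    = INH6hr,
  INH6hr_vs_INH2hr = INH6hr - INH2hr,
  levels = design
)

fit2 <- eBayes(contrasts.fit(fit, cont))
topTable(fit2, coef = "INH6hr_vs_INH2hr", number = 5)
```

Swapping the coef argument to "INH2hr_vs_ref" or "INH6hr_vs_ref" would give the other two tables. Whether the sign of each log-ratio matches "treatment over control" depends on how the GEO values were computed, so that is worth checking against the series description.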
Can you update your targets file here?