First, let me say that I am relatively new to differential expression analysis, so please bear with me. I have read the limma user guide many times and have looked at previous posts on similar questions, but I am not convinced that my design matrix truly captures the intention of my study.
My exploratory study uses an Affymetrix U133A microarray dataset of eight subjects measured pre and post sleep deprivation (16 samples in total). Two psychological evaluations (SSS and PVT) were administered at both time points, and I want to use them as continuous response variables representing each subject's sleep deprivation. My end goal is to determine which genes are differentially expressed in response to sleep deprivation in my paired samples.
Does anyone have thoughts on whether my design matrix will yield only genes found in responders (those with higher PVT/SSS scores)? What would be my primary coefficient of interest? Additionally, is there any way to remove the effect of gender in my design matrix?
This is the data frame I am using to construct the design matrix:
patient <- factor(rep(c(1,2,3,4,5,6,7,8), each=2)) #patient ID
condition <- factor(rep(c('Post', 'Pre'), 8)) #Pre or Post Treatment
gender <- factor(c(rep('F', 8), rep('M', 8))) #gender
PVT <- c(339.67,254.56,423.33,...) #Response Variable 1
SSS <- c(6,2,3,1,3,2,5,2,5,1,3,2,2,1,5,3) #Response Variable 2
data.frame(patient, condition, gender, PVT, SSS)
   patient condition gender    PVT SSS
1        1      Post      F 339.67   6
2        1       Pre      F 254.56   2
3        2      Post      F 423.33   3
4        2       Pre      F 316.09   1
5        3      Post      F 640.13   3
6        3       Pre      F 358.82   2
7        4      Post      F 321.15   5
8        4       Pre      F 491.67   2
9        5      Post      M 338.99   5
10       5       Pre      M 288.09   1
11       6      Post      M 261.96   3
12       6       Pre      M 246.69   2
13       7      Post      M 276.48   2
14       7       Pre      M 250.11   1
15       8      Post      M 267.14   5
16       8       Pre      M 249.67   3
This is my proposed design matrix:
design <- model.matrix(~patient + condition*PVT+SSS)
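On the gender question, a minimal base-R sketch (not a full limma analysis) may help. It assumes the same patient, condition, and gender layout as the data frame above. Setting 'Pre' as the reference level is my own assumption; with the default alphabetical ordering, 'Post' would be the baseline and the condition coefficient would be Pre minus Post. The names design_paired and design_gender are only illustrative. Because each patient has a single gender, the patient blocking terms already absorb any gender main effect, so adding gender produces a rank-deficient matrix rather than a further adjustment.

```r
# Same layout as the data frame above: 8 patients, 2 samples each.
patient   <- factor(rep(1:8, each = 2))
# Assumption: make 'Pre' the reference so the coefficient reads Post minus Pre.
condition <- factor(rep(c("Post", "Pre"), 8), levels = c("Pre", "Post"))
gender    <- factor(rep(c("F", "M"), each = 8))

# Paired design: patient is a blocking factor, condition is the effect of interest.
design_paired <- model.matrix(~ patient + condition)

# Adding gender on top of patient: the genderM column is the sum of the
# patient5..patient8 columns, so it carries no new information.
design_gender <- model.matrix(~ patient + gender + condition)

c(columns = ncol(design_paired), rank = qr(design_paired)$rank)  # full rank
c(columns = ncol(design_gender), rank = qr(design_gender)$rank)  # rank < columns

# Name of the Post-vs-Pre coefficient in the paired design.
grep("^condition", colnames(design_paired), value = TRUE)
```

Under these assumptions, the coefficient to test for the paired sleep-deprivation effect would be conditionPost, and gender would not need its own term as long as patient is in the model.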
Any input would be greatly appreciated. Thank you in advance.
Kevin, thank you so much for your response. It is exactly what I need to move forward with my analysis. You are correct: PVT and SSS are independent evaluations and can therefore be tested independently. Do you think that stratifying by gender would significantly reduce the power of the analysis, compared with designs that include both genders, because of the smaller sample size?
I am not sure that it will be problematic to stratify by gender. The total sample size is already fairly small. Is sex/gender a known confounding factor in these types of studies?
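To put a rough number on the sample-size concern, here is a small base-R sketch that counts the residual degrees of freedom of a paired design (~ patient + condition) for the pooled data and for each gender separately. It assumes the sample layout from the original post; the helper resid_df is hypothetical and written only for this illustration. Residual degrees of freedom are not the same thing as power, but they show how little information is left for estimating variance after stratifying.

```r
# Sample layout from the original post: 8 patients, 2 samples each.
patient   <- factor(rep(1:8, each = 2))
condition <- factor(rep(c("Post", "Pre"), 8), levels = c("Pre", "Post"))
gender    <- factor(rep(c("F", "M"), each = 8))

# Hypothetical helper: residual degrees of freedom of the paired design
# fitted to the subset of samples selected by the logical vector 'keep'.
resid_df <- function(keep) {
  d <- droplevels(data.frame(patient, condition)[keep, ])
  design <- model.matrix(~ patient + condition, data = d)
  nrow(design) - qr(design)$rank
}

df_pooled <- resid_df(rep(TRUE, 16))   # all 16 samples
df_female <- resid_df(gender == "F")   # 8 samples, 4 patients
df_male   <- resid_df(gender == "M")   # 8 samples, 4 patients

c(pooled = df_pooled, female = df_female, male = df_male)
```

With this layout the pooled paired design leaves 7 residual degrees of freedom, while each gender-specific analysis leaves only 3.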