Hello,
I am working with RNAseq analysis.. and I would like to get confirm for my model design before doing the analysis.
I have 8 samples from 4 patients (each patient has two samples before treatment and after treatment.. ) I would like to find DEG genes which are differentially expressed by treatment.. However, I suspect that it also be affected by sex factor. so I would like to consider the sex factor (but among 4 patient, I only have one male, three female)
Here is the exact description for data model.
sample1 patient1 pre_treatment female
sample2 patient1 post_treatment female
sample3 patient2 pre_treamment male
sample4 patient2 post_treament male
sample5 patient3 pre_treamment female
sample6 patient3 post_treament female
sample7 patient4 pre_treamment female
sample8 patient4 post_treament female
df_enzyme <- factor(rep(c("pre", "post"), len=8))
df_patient <- rep(1:4, each=2, len=8)
df_sex <- factor(c("F","F","M","M","F","F","F","F"))
df_enzyme <- relevel(df_enzyme, ref="pre")
df_meta <- data.frame(row.names=colnames(data), patient = df_patient, enzyme = df_enzyme, sex = df_sex)
Since I don't have replication, for the dispersion estimation, I used blind..
d_fabry <- estimateDispersions(d_fabry, method="blind", sharingMode="fit-only")
I made the following model.
d_fit0a = fitNbinomGLMs(d_fabry, count ~ patient)
d_fit0b = fitNbinomGLMs (d_fabry, count ~ enzyme)
d_fit0c = fitNbinomGLMs (d_fabry, count ~ sex)
d_fit1a = fitNbinomGLMs(d_fabry, count ~ patient + enzyme)
d_fit1b = fitNbinomGLMs(d_fabry, count ~ patient + sex)
d_fit2 = fitNbinomGLMs(d_fabry, count ~ patient + enzyme + sex)
d_fit3 = fitNbinomGLMs(d_fabry, count ~ patient + enzyme + sex + enzyme:sex
for the treatment(enzyme) effect, which one is right?? maybe the first one?? ( it assume that sex would have effect.. right?)
dh_pvalsGLM = nbinomGLMTest( dh_fit2, dh_fit1b )
or
dh_pvalsGLM = nbinomGLMTest( dh_fit1a, dh_fit0a)
for the treatment and sex interaction effct, (since I suspect that depending on different sex, effect of treatment might be different or not), the following modle is right??
dh_pvalsGLM = nbinomGLMTest( dh_fit3, dh_fit2 )
Again, I would like to find DEG which are responsive to treament(enzyme), but I suspect that depending on sex, the responsiveness might be different (maybe not) I also, test with this assumption. Does my model do the right thing for this??
Thank you for your helps. if then (as you say, gender is consumed by the patient effect), is there any way to see if there is any different reaction depending on gender?? As you suggest, I can test with
~patient+ enzyme
(full) and~ patient
(reduced) to see the effect of enzyme.. ( I mean the DEG which are responsive to the enzyme.)But, I also would like to see whether there is any different reaction over gender? (I know I only have one female and three male..) do you think is it possible? or I can only test with enzyme??
I would say, no, there's no good way with the current dataset to look at gender. If you had a couple more male samples then perhaps you could get some meaningful results. Lacking that, don't waste your time.
Oh.. Devon
I have two more quick questions,
Q1) I will have one more male patient. So total of two male, three female.
Do you think in this case, it would be okay ( although not enough), to test with gender??
The reason I am asking this is because, this specific disease are x-linked.. so there should be a difference depending on gender... So I would like to consider the gender as factor even though I could not test it.
Then, do you think
Q2)
case 1)
full mode : ~ patient + gender + enzyme
reduced model : ~ patient + gender
case 2)
full model : ~ patient + enzyme
reduced model : ~ patient
do you think case 1 and case 2 is same?? or still case 1 is meaningful if I would like to consider the fact that disease is x-linked??
You could try, though with the numbers of patients you have you'll be very limited unless the effect is VERY large (human data is noisy).
When you want to look at gender while keeping a patient effect, you have to remember that the patient effect will need to be recoded. Otherwise, the gender and patient effects will be completely confounded. So, you'll end up needing to label all of the patients 1-3 and then give each of them a gender such that there's a male patient 1 and a female patient 1 (and so forth). That will produce a full rank model matrix (well, you'll have to remove a column since you have unbalanced groups, but you get the idea).
Thank you for your response.
If I understand your point correctly, you suggest that I need to make a new column "patient-gender" instead of patient or gender(remove patient and gender column). Is it right? So something like this..
full model : ~ patient-gender + enzyme
reduced model : ~ patient-gender
Is it right? Could you please confirm for this one more time? I really appreciate your comments!!
More like this:
and so one. There's an example where they do this in either the limma or edgeR user's guide (I don't recall which). If you didn't do something like this, the patient effect would suck up too many of the degrees of freedom.