Single sample vs. a control group in limma
2.7 years ago
druggable ▴ 60

Hi All,

I was reading a paper that performed differential expression analysis for each sample vs. samples from a control group. Here was the code used:

library(limma)

# 'controles' and 'patientes' are presumably column indices into 'expression'; 'b' picks one patient
expresion<-expression[,c(controles,patientes[b])]  # subset: all controls plus one patient
control<-colnames(expression)[controles]
case<-colnames(expression)[patientes[b]]
# sample labels and group membership
controls<-cbind(paste("Control_",1:length(control),sep=""),rep("control",length(control)))
cases<-cbind(paste("Patient_",1:length(case),sep=""),rep("case",length(case)))
targets<-rbind(controls, cases)
# design matrix with one indicator column per group
design<-cbind(CONTROL=c(rep(1,length(control)),rep(0,length(case))), CASE=c(rep(0,length(control)),rep(1,length(case))))
rownames(design)<-targets[,1]
# contrast: the single case minus the mean of the controls
cont.matrix<-makeContrasts(CASEvsCONTROL=CASE-CONTROL,levels=design)
fit<-lmFit(expresion,design)  ##getting DEGs from IQR
fit2<-contrasts.fit(fit, cont.matrix)
fit2<-eBayes(fit2)

Here, the case/patient group is basically just one sample, compared against multiple samples in the control group. This would output a "patient-specific" profile: a list of differentially expressed genes for further analysis in a patient-specific manner. Can I apply the same method to my datasets?

Thanks!

2.7 years ago

You could, but no one can guarantee that your results would be good because, as far as I'm aware, no one has benchmarked limma in this sort of situation. The assumption being made here is that the variance in the sample from each patient is well estimated by the inter-patient variance from the controls. This is really two assumptions hiding as one: 1) the variance within a patient is the same as the variance between patients; 2) the variance in patients is the same as the variance in controls.

There is no way of knowing, really, whether either or both of these assumptions are true without a rigorous study designed to test them (probably involving replicates from within individuals), and even then, they might hold in some situations and not others.
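To make the point about where the variance comes from concrete, here is a minimal base-R sketch for a single gene. The data are simulated and the sample sizes are arbitrary assumptions, not taken from the paper. It only illustrates that a group with one sample is fitted exactly, so the residual variance comes entirely from the controls.

```r
# Toy illustration (simulated data, not from the paper): one gene,
# five control samples and a single patient sample.
set.seed(1)
n_ctrl <- 5
y <- c(rnorm(n_ctrl, mean = 8, sd = 0.5), 10)   # last value is the patient
group <- factor(c(rep("control", n_ctrl), "case"),
                levels = c("control", "case"))

# Group-means model, the same kind of model the CONTROL/CASE design encodes.
fit <- lm(y ~ 0 + group)

# The single case sample is fitted exactly, so its residual is zero and
# it adds nothing to the residual variance.
case_residual <- unname(residuals(fit)[n_ctrl + 1])

# Residual degrees of freedom: 6 samples - 2 group means = n_ctrl - 1.
res_df <- df.residual(fit)

# The residual variance therefore equals the variance among the controls.
res_var  <- sum(residuals(fit)^2) / res_df
ctrl_var <- var(y[seq_len(n_ctrl)])
```

In other words, the test for the patient borrows its noise estimate from the between-control spread, which is exactly where the two assumptions above come in.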


Thanks - really nice explanation. If I read the OP's code correctly, s/he is subsetting the expression matrix to retain one patient and all the controls, and presumably s/he's looping through the patients one by one in this way. One could partially address your concerns by not subsetting the data and instead using a design matrix where each patient is assigned to a distinct group and all controls are assigned to the "control" group. Then you set up a contrast matrix comparing each patient vs the controls. In this way the estimates of variation are based on all samples and only the testing is done "1 patient vs all controls". Am I getting it right...?
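A rough sketch of what that single-fit setup could look like in limma is below. The expression matrix is simulated, and the object names (expr, group, contrast_strings) are placeholders made up for illustration, so swap in your own data. It is not a statement about how well the approach performs.

```r
library(limma)

# Simulated log-expression matrix (hypothetical): 100 genes,
# 6 controls and 3 patients. Replace with your own data.
set.seed(42)
n_genes <- 100; n_ctrl <- 6; n_pat <- 3
expr <- matrix(rnorm(n_genes * (n_ctrl + n_pat), mean = 8),
               nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               c(paste0("Control_", seq_len(n_ctrl)),
                                 paste0("Patient_", seq_len(n_pat)))))

# All controls share one level; each patient is its own level.
group <- factor(c(rep("CONTROL", n_ctrl), paste0("Patient_", seq_len(n_pat))),
                levels = c("CONTROL", paste0("Patient_", seq_len(n_pat))))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# One contrast per patient: Patient_i - CONTROL.
contrast_strings <- paste0(levels(group)[-1], "-CONTROL")
cont.matrix <- makeContrasts(contrasts = contrast_strings, levels = design)

fit  <- lmFit(expr, design)
fit2 <- eBayes(contrasts.fit(fit, cont.matrix))

# Top genes for the first patient vs. all controls.
topTable(fit2, coef = "Patient_1-CONTROL", number = 5)
```

Each column of fit2 then corresponds to one "patient vs all controls" comparison, so there is no need to loop over subsets.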


Hmmmm.... I'm not sure how the variance is estimated in that case. I think normally the variance is estimated for each condition separately. Might need to ask @gordonsmyth's opinion.
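One way to check this empirically is to fit both setups on simulated data and compare what lmFit reports. The sketch below assumes one sample per patient, no extra covariates and no weights. All data and object names are made up for illustration. As far as I understand it, lmFit estimates one residual standard deviation per gene from the whole linear model, and a group with a single sample contributes no residual degrees of freedom, so under these assumptions the two fits should agree.

```r
library(limma)

# Simulated data (hypothetical): 200 genes, 6 controls, 3 patients.
set.seed(7)
n_genes <- 200; n_ctrl <- 6; n_pat <- 3
samples <- c(paste0("Control_", 1:n_ctrl), paste0("Patient_", 1:n_pat))
expr <- matrix(rnorm(n_genes * length(samples), mean = 8), nrow = n_genes,
               dimnames = list(paste0("gene", 1:n_genes), samples))

# (a) All samples in one fit, one group per patient.
grp_all <- factor(c(rep("CONTROL", n_ctrl), paste0("Patient_", 1:n_pat)),
                  levels = c("CONTROL", paste0("Patient_", 1:n_pat)))
design_all <- model.matrix(~ 0 + grp_all)
colnames(design_all) <- levels(grp_all)
fit_all <- lmFit(expr, design_all)

# (b) Controls plus Patient_1 only, as in the one-patient-at-a-time loop.
keep <- c(1:n_ctrl, n_ctrl + 1)
design_sub <- cbind(CONTROL = c(rep(1, n_ctrl), 0),
                    CASE    = c(rep(0, n_ctrl), 1))
fit_sub <- lmFit(expr[, keep], design_sub)

# Residual df and per-gene residual SD in the two fits.
same_df    <- all(fit_all$df.residual == fit_sub$df.residual)
same_sigma <- isTRUE(all.equal(fit_all$sigma, fit_sub$sigma))

# Moderated t-statistics for Patient_1 vs. controls in the two fits.
t_all <- eBayes(contrasts.fit(fit_all,
           makeContrasts(Patient_1 - CONTROL, levels = design_all)))$t
t_sub <- eBayes(contrasts.fit(fit_sub,
           makeContrasts(CASE - CONTROL, levels = design_sub)))$t
same_t <- isTRUE(all.equal(as.numeric(t_all), as.numeric(t_sub)))
```

If same_df, same_sigma and same_t all come back TRUE, then in this simple setting the variance is coming from the controls in both cases. Things would differ if some patients had replicates or if the model included other terms.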
