Question

Single sample vs. a control group in limma

1


3.4 years ago

druggable ▴ 60

Hi All,

I was reading a paper that performed differential expression analysis for each sample vs. samples from a control group. Here was the code used:

library(limma)

# Keep all controls plus the single patient with index b
expresion <- expression[, c(controles, patientes[b])]
control <- colnames(expression)[controles]
case <- colnames(expression)[patientes[b]]

# Sample labels and group membership
controls <- cbind(paste("Control_", 1:length(control), sep = ""), rep("control", length(control)))
cases <- cbind(paste("Patient_", 1:length(case), sep = ""), rep("case", length(case)))
targets <- rbind(controls, cases)

# Two-column design: one indicator column per group
design <- cbind(CONTROL = c(rep(1, length(control)), rep(0, length(case))),
                CASE = c(rep(0, length(control)), rep(1, length(case))))
rownames(design) <- targets[, 1]

# Contrast of interest: case minus control
cont.matrix <- makeContrasts(CASEvsCONTROL = CASE - CONTROL, levels = design)
fit <- lmFit(expresion, design)  ## getting DEGs from IQR
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2)

Here, the case/patient is basically a single sample compared to multiple samples in the control group. This would output a "patient-specific" profile: a list of differentially expressed genes for further analysis in a patient-specific manner. Can I apply the same method to my datasets?

Thanks!

expression differential • 1.3k views

updated 3.4 years ago by i.sudbery 22k • written 3.4 years ago by druggable ▴ 60

score 1 · Answer 1 · 2022-03-17

1


3.4 years ago

i.sudbery 22k

You could, but no one can guarantee that your results would be good because, as far as I'm aware, no one has benchmarked limma in this sort of situation. The assumption being made here is that the variance in the sample from each patient is well estimated by the inter-patient variance of the controls. This is really two assumptions hiding as one: 1) the variance within a patient is the same as the variance between patients; 2) the variance in patients is the same as the variance in controls.

There is really no way of knowing whether either or both of these assumptions are true without a rigorous study designed to test them (probably involving replicates from within individuals), and even then, they might hold in some situations and not others.
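
To make the assumption concrete, here is a sketch of the per-gene statistic that the two-group design with a single case sample reduces to. The notation is introduced only for this illustration: for gene g, x_{g1}, ..., x_{gn} are the n control values, y_g is the single patient value, and d_0 and s_0^2 are the prior degrees of freedom and prior variance that eBayes estimates across genes. It assumes equal weights and no missing values.

```latex
\bar{x}_g = \frac{1}{n}\sum_{i=1}^{n} x_{gi},
\qquad
s_g^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(x_{gi} - \bar{x}_g\right)^2

\tilde{s}_g^2 = \frac{d_0 s_0^2 + (n-1)\, s_g^2}{d_0 + n - 1},
\qquad
\tilde{t}_g = \frac{y_g - \bar{x}_g}{\tilde{s}_g \sqrt{1 + \tfrac{1}{n}}}
```

The patient value y_g appears only in the numerator. The single case sample is fitted exactly by its own coefficient, so it contributes nothing to s_g^2, and the spread of the test statistic is set entirely by the controls.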

3.4 years ago by i.sudbery 22k

0

Thanks - really nice explanation. If I read the OP correctly, s/he is subsetting the expression matrix to retain one patient and all the controls, and presumably looping through the patients one by one in this way. One could partially address your concerns by not subsetting the data and instead using a design matrix in which each patient is assigned to a distinct group and all controls are assigned to the "control" group. Then you set up a contrast matrix comparing each patient vs. the controls. In this way the estimates of variation are based on all samples and only the testing is done "1 patient vs all controls". Am I getting it right...?
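
A minimal sketch of the design described in this comment, not taken from the paper. The sample names, group sizes and simulated expression values are all made up for illustration, and the limma part only runs if the package is installed.

```r
# Hypothetical sample layout: 5 controls and 3 patients (names are made up).
sample_ids <- c(paste0("Control_", 1:5), paste0("Patient_", 1:3))

# All controls share one level; each patient gets its own level.
group <- factor(
  c(rep("CONTROL", 5), paste0("Patient_", 1:3)),
  levels = c("CONTROL", paste0("Patient_", 1:3))
)

# One indicator column per group, no intercept.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)
rownames(design) <- sample_ids

# One contrast per patient: that patient minus the control mean.
patients <- setdiff(levels(group), "CONTROL")
contrast_strings <- paste0(patients, " - CONTROL")

if (requireNamespace("limma", quietly = TRUE)) {
  set.seed(1)
  # Simulated log-expression values (100 genes x 8 samples), illustration only.
  expression <- matrix(rnorm(100 * 8), nrow = 100,
                       dimnames = list(paste0("gene", 1:100), sample_ids))
  cont.matrix <- limma::makeContrasts(contrasts = contrast_strings,
                                      levels = design)
  fit <- limma::lmFit(expression, design)
  fit2 <- limma::eBayes(limma::contrasts.fit(fit, cont.matrix))
  # Top genes for the first patient versus the controls.
  print(limma::topTable(fit2, coef = 1, number = 5))
}
```

Each column of `cont.matrix` then corresponds to one "patient vs. controls" comparison, so a single fit replaces the loop over subsets.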

3.4 years ago by dariober 15k

0


Hmmmm.... I'm not sure how the variance is estimated in that case. I think normally the variance is estimated for each condition separately. Might need to ask @gordonsmyth's opinion.
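
One way to probe this without limma is an ordinary least-squares fit for a single gene, which is the kind of per-gene fit `lmFit` performs by default. The sketch below uses simulated values and made-up group sizes. It compares the residual variance from the "every patient is its own group" design with the one from the "one patient plus controls" subset.

```r
set.seed(42)
n_controls <- 6   # hypothetical group sizes
n_patients <- 3
group <- factor(c(rep("CONTROL", n_controls),
                  paste0("Patient_", seq_len(n_patients))))
y <- rnorm(n_controls + n_patients)  # simulated values for one gene

# Full design: every patient is its own group.
fit_full <- lm(y ~ 0 + group)

# Subset design: first patient plus the controls only.
keep <- group %in% c("CONTROL", "Patient_1")
sub_group <- droplevels(group[keep])
y_sub <- y[keep]
fit_sub <- lm(y_sub ~ 0 + sub_group)

# Residual degrees of freedom and residual variance under each design.
df_full <- df.residual(fit_full)
df_sub <- df.residual(fit_sub)
s2_full <- sum(residuals(fit_full)^2) / df_full
s2_sub <- sum(residuals(fit_sub)^2) / df_sub

# Sample variance of the controls alone, for comparison.
s2_controls <- var(y[group == "CONTROL"])

print(c(df_full = df_full, df_sub = df_sub))
print(c(s2_full = s2_full, s2_sub = s2_sub, s2_controls = s2_controls))
```

In this sketch both designs leave n_controls - 1 residual degrees of freedom and give the same residual variance as the controls alone, because a group with a single sample is fitted exactly and has a zero residual. Whether limma's moderated statistics behave the same way on real data would still need checking.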

3.4 years ago by i.sudbery 22k