Hi,
I have data from 3 different conditions (normal, breast cancer, endometrial cancer) and different cell types (cd163_positive or negative) for each condition. I am interested in differentially expressed genes between conditions (e.g Breast_cancer vs Endometrial_cancer) but also between cell populations within the same condition but also across conditions (e.g Breast_cancercd163_positive-Breast_cancercd163_negative). Below is the pheno table and the code for making the contrasts.
- I am not sure how to make the (simple) contrasts : Breast_cancer vs Endometrial_cancer , Breast_cancer vs Normal, Endometrial_cancer vs Normal using the model below.
- Is the model below correct for the population comparisons ?
pData
Condition Population
Normal cd163_positive
Normal cd163_negative
Normal cd163_positive
Normal cd163_negative
Normal cd163_positive
Breast_cancer cd163_positive
Breast_cancer cd163_negative
Breast_cancer cd163_positive
Breast_cancer cd163_negative
Breast_cancer cd163_positive
Endometrial_cancer cd163_positive
Endometrial_cancer cd163_negative
Endometrial_cancer cd163_positive
Endometrial_cancer cd163_negative
Endometrial_cancer cd163_positive
Endometrial_cancer cd163_negative
f <- paste(pData$Condition,pData$:Population,sep="")
f <- factor(f)
design <- model.matrix(~0+f)
colnames(design) <- levels(f)
> colnames(design)
[1] "Breast_cancercd163_negative" "Breast_cancercd163_positive" "Endometrial_cancercd163_negative" "Endometrial_cancercd163_positive"
[5] "Normalcd163_negative" "Normalcd163_positive"
cont.matrix = makeContrasts(
BRC_posVsBRC_neg = Breast_cancercd163_positive-Breast_cancercd163_negative,
Norm_posVsNorm_neg = Normalcd163_positive-Normalcd163_negative,
END_posVsEND_neg = Endometrial_cancercd163_positive-Endometrial_cancercd163_negative,
BRC_posVsNorm_pos = Breast_cancercd163_positive-Normalcd163_positive,
BRC_posVsEND_pos = Breast_cancercd163_positive-Endometrial_cancercd163_positive,
END_posVsNorm_pos = Endometrial_cancercd163_positive-Normalcd163_positive,
BRC_negVsNorm_neg = Breast_cancercd163_negative-Normalcd163_negative,
BRC_negVsEND_neg = Breast_cancercd163_negative-Endometrial_cancercd163_negative,
END_negVsNorm_neg = Endometrial_cancercd163_negative-Normalcd163_negative,
levels=design)
v=voomWithQualityWeights(d, design = design,plot = TRUE)
vfit <- lmFit(v, design)
vfit <- contrasts.fit(vfit, contrasts=cont.matrix)
efit <- eBayes(vfit)
I know this is fairly simple but I am a bit confused. Thank you very much!
Thanks a lot Kevin!!
Is that okay then?
Yes I think the analysis seems fine, but my problem is still how to compare between conditions, regardless of the population column. Would it be ok to create another design and contrast matrix that I look for DEGs between i.e. Breast_cancer vs Normal?
Thanks a lot! Matina
Yes, that would also be fine, but then this says nothing about CD163, of course. Also, after you derive the statistics for the
Breast_cancer-Endometrial_cancer
comparison, you will have to filter the list a 2nd and 3rd time to only include genes that are also significant inBreast_cancer-Normal
andEndometrial_cancer-Normal
There are many ways of doing these differential expression analyses, as you can see. The important thing is in interpreting the results based on how you conducted the analysis.
You can also 'adjust' for covariates (for example CD163) sing Limma. For more information on that, see Gordon's answer here: Question: Removing continuous covariate effects in limma analysis
Thank you Kevin,
The idea is that i dont see that many differences between the CD163 subpopulations and that is why i want to look only at the comparisons between conditions and rather at populations. I have never thought that it is necessary to filter the gene list for the
Breast_cancer-Endometrial_cancer
comparison. Do you mean to double check that for example the significantly up-regulated genes betweenBreast_cancer-Endometrial_cancer
are also significantly different between theBreast_cancer-Normal
comparison?I really appreciate taking the time to reply to me. Thanks a lot!
Yes, my idea regarding
Breast_cancer-Endometrial_cancer
andBreast_cancer-Normal
relates to a comment from a reviewer for a recent manuscript of mine (accepted for publication), which was based on the endometrial cancer TCGA data. The comment/question, applied to your situation, would be something like this:If you find a gene that is statistically significantly different between breast and endometrial cancer, how can you be sure that it has any relation to cancer without also checking it's difference between breast cancer and normal?
I guess that it's just something about which you should be aware! - it may no exactly be relevant to your end goal.