13 months ago
Manuel
Hello,
Is there a way to do differential gene expression between patients instead of between groups of interest?
Let's say I have a subgroup of 102 patients with a disease called LMS/NOS, and this subgroup is highly heterogeneous. I want to do DGE between patients to possibly discover subgroups based on gene expression.
Starting from count data, what pipeline can I use to normalize between patients?
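library(limma)
# Design matrix with one indicator column per disease group
design <- cbind(disease_ddlps, disease_lms)
cont.matrix <- makeContrasts(disease_lms - disease_ddlps, levels = design)
# voom transformation with quantile normalization between samples
Voom <- voom(RNA_data, design, plot = FALSE, normalize.method = "quantile")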
limma-voom builds its contrasts from groups. I want to ignore the groups and normalize between patients.
Best Regards, Manuel Sokolov Ravasqueira
With a large heterogeneous cohort you'll generally want to take a dimension reduction and/or clustering approach. This could be in the form of e.g. PCA, WGCNA, NMF, autoencoders, etc. Find some papers that do a similar analysis and see which ones most closely mirror what you want to do.
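To make the PCA-plus-clustering idea concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the counts are simulated, the log-CPM transform is a simple library-size normalization, and the number of variable genes, principal components and clusters are arbitrary choices you would tune on real data.

```r
# Simulated stand-in data (hypothetical): 200 genes x 20 patients of raw counts
set.seed(1)
counts <- matrix(
  rnbinom(200 * 20, mu = 100, size = 5),
  nrow = 200,
  dimnames = list(paste0("gene", 1:200), paste0("patient", 1:20))
)

# Library-size normalization to log2 counts per million (pseudo-count of 1)
lib_size <- colSums(counts)
log_cpm <- log2(t(t(counts) / lib_size) * 1e6 + 1)

# Keep the most variable genes; the cutoff of 50 is arbitrary
gene_var <- apply(log_cpm, 1, var)
top <- log_cpm[order(gene_var, decreasing = TRUE)[1:50], ]

# PCA with patients as observations (rows), genes as variables (columns)
pca <- prcomp(t(top), center = TRUE, scale. = FALSE)

# Hierarchical clustering of patients on the first 5 principal components
hc <- hclust(dist(pca$x[, 1:5]), method = "ward.D2")
clusters <- cutree(hc, k = 2)  # k = 2 is a placeholder, not a recommendation
table(clusters)
```

No group labels enter at any step, so the clusters come only from the expression data. Any subgroups found this way would still need to be checked for stability and for association with batch before interpreting them.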
Hi rpolicastro! But don't all these methods require the RNA data to be normalized? How can I normalize the data without considering the design matrix?
There are numerous ways to preprocess or normalize the counts and to reduce or remove batch effects. removeBatchEffect from limma and ComBat_seq from sva are two examples for batch effect removal. For normalization, rlog from DESeq2 is often sufficient. Also, depending on the severity of the batch effects, some of the aforementioned dimension reduction methods may capture the batches in one of the reduced dimensions, which can then be ignored in downstream analysis.
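As a rough sketch of normalizing without group information, the snippet below runs rlog with an intercept-only design (`~ 1`) and `blind = TRUE`, then applies removeBatchEffect. The count matrix is simulated and the `batch` factor is hypothetical; on real data you would substitute your own counts and a known batch variable (for example sequencing run), and skip the batch step if there is none.

```r
library(DESeq2)  # also provides assay() via SummarizedExperiment
library(limma)

# Simulated stand-in data (hypothetical): 500 genes x 12 patients
set.seed(1)
n_genes <- 500
n_patients <- 12
counts <- matrix(
  rnbinom(n_genes * n_patients, mu = 200, size = 2),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("patient", seq_len(n_patients)))
)
storage.mode(counts) <- "integer"

# Hypothetical batch labels; replace with the real batch annotation
batch <- factor(rep(c("A", "B"), each = n_patients / 2))
col_data <- data.frame(batch = batch, row.names = colnames(counts))

# Intercept-only design: no disease groups are used anywhere
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = col_data,
                              design = ~ 1)

# blind = TRUE makes the transformation ignore the design entirely
rld <- rlog(dds, blind = TRUE)
expr <- assay(rld)

# Remove the batch effect from the transformed values
# (for visualization and clustering, not for count-based testing)
expr_corrected <- removeBatchEffect(expr, batch = batch)
```

The resulting `expr_corrected` matrix is on a log2-like scale and can be passed to PCA or clustering. For a cohort of around 100 samples, the DESeq2 documentation suggests `vst` as a faster alternative to `rlog`. If you would rather stay within limma, calling `voom` without a design should, as far as I know, fall back to an intercept-only model, which also avoids using the groups.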
If you read papers that employ these methods you'll see the various ways people handle normalization and batch effects.