Performing differential expression analysis after applying transformations on my data
16 months ago
JACKY ▴ 170

I have TPM-normalized RNA-seq data from several different sources. I merged these datasets, then applied a log2 transformation followed by batch effect correction. These steps brought all samples into a similar range, which I consider crucial for consistency.

While I understand that differential expression analysis is typically done on raw counts, I don't have that data. While Limma's voom approach works for normalized data, is it still applicable after log2 transformation and batch effect correction?

Conducting differential expression analysis on my merged TPM data without these transformations might not yield accurate results due to the discrepancies in some sample values. What's the recommended approach?

normalization differential-expression r • 1.4k views

"While Limma's voom approach works for normalized data, is it still applicable after log2 transformation and batch effect correction?"

The recommended approach is to include batch as a covariate in the design rather than perform DE on batch-corrected data, especially TPM values, which are inherently not comparable across samples.
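
To make the "batch as a covariate" idea concrete, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the `meta` table, the `condition` and `batch` columns, and the simulated `log_expr` matrix are hypothetical, and the use of limma-trend (`eBayes(trend = TRUE)`) reflects the usual suggestion for log-scale input rather than anything confirmed in this thread. Whether it suits merged TPM values still needs checking on the real data.

```r
library(limma)

set.seed(1)
n_genes   <- 200
n_samples <- 12

# Hypothetical sample annotation: two conditions balanced across three batches
meta <- data.frame(
  condition = factor(rep(c("ctrl", "case"), times = 6), levels = c("ctrl", "case")),
  batch     = factor(rep(c("b1", "b2", "b3"), each = 4))
)

# Simulated log2 expression matrix (genes x samples) with a per-batch shift
batch_shift <- c(b1 = 0, b2 = 1, b3 = -1)[as.character(meta$batch)]
log_expr <- matrix(rnorm(n_genes * n_samples, mean = 5), nrow = n_genes)
log_expr <- sweep(log_expr, 2, batch_shift, "+")
rownames(log_expr) <- paste0("gene", seq_len(n_genes))

# True condition effect of +2 on the first 10 genes
is_case <- meta$condition == "case"
log_expr[1:10, is_case] <- log_expr[1:10, is_case] + 2

# Batch enters the model as a covariate; the expression values stay uncorrected
design <- model.matrix(~ condition + batch, data = meta)

fit <- lmFit(log_expr, design)
fit <- eBayes(fit, trend = TRUE)  # limma-trend, intended for log-scale values

# Condition effect, adjusted for batch
res <- topTable(fit, coef = "conditioncase", number = Inf)
head(res)
```

The point of the design is that the batch shift is estimated and absorbed by the `batchb2` and `batchb3` coefficients while the test is carried out on the `conditioncase` coefficient, so the degrees of freedom spent on batch are accounted for in the statistics.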


I see, so currently my design looks like this (Benefit has two values only):

design = model.matrix(~ 0 + Benefit, meta_for_train)

I'm removing the batch effect of cancer type (I have 4 cancer types). Are you saying that my design should look like this instead?

design = model.matrix(~ 0 + Benefit + Cancer_Type, meta_for_train)
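
For reference, a small base-R sketch of what a design like the second one expands to. The metadata below is a hypothetical stand-in for `meta_for_train`: the level names `No`/`Yes` and the cancer types `A` to `D` are assumptions, not the real data.

```r
# Hypothetical stand-in for meta_for_train: 2 Benefit levels, 4 cancer types
meta_for_train <- data.frame(
  Benefit     = factor(rep(c("No", "Yes"), times = 8)),
  Cancer_Type = factor(rep(c("A", "B", "C", "D"), each = 4))
)

design <- model.matrix(~ 0 + Benefit + Cancer_Type, meta_for_train)

# With "~ 0 + ...", the first factor gets one column per level, and the
# remaining factors are coded relative to their first level
colnames(design)
# "BenefitNo" "BenefitYes" "Cancer_TypeB" "Cancer_TypeC" "Cancer_TypeD"

# Contrast vector for Yes vs No, adjusted for cancer type
contrast_yes_vs_no <- setNames(numeric(ncol(design)), colnames(design))
contrast_yes_vs_no[c("BenefitYes", "BenefitNo")] <- c(1, -1)
contrast_yes_vs_no
```

In limma, the same contrast would normally be built with `makeContrasts(BenefitYes - BenefitNo, levels = design)` and applied with `contrasts.fit()` before `eBayes()`.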

Your batch variable is Cancer Type?


Yes, it's a pan-cancer study


Why are you batch correcting data where the batch is such a critical biological variable? That makes no sense.


Because I want my model to be able to classify response regardless of cancer type or anatomical location. I could instead use the cancer type as a predictive feature. You're right that this can be an important variable, so it's an option.


You seem to have a really nice machine learning background but not a great cancer background. Cancer is heterogeneous even within a specific cancer type. How do you expect your model to classify response just based on horribly mangled generic RNA-seq data? We use multi-omics on highly specific cancer subtypes and our models are not all that amazing; I don't see how removing critical biological information is going to give you anything better than a crapshoot.


Are you suggesting that I should skip batch correction for cancer type and instead incorporate it as a predictive feature in my model? Additionally, I have other variables such as gender, treatment type, and outputs from various deconvolution algorithms indicating cell abundance for each sample. I use those cell abundances as predictive features as well.


I don't know machine learning, so I can't speak to "incorporate it as a predictive feature". My point is that treating cancer type as a mere batch variable will result in an immense loss of context. Given how narrow your data is, such a broad question will not work in your favor. But I'm no expert on machine learning, so you might stumble upon something. I'd recommend you consult some folks who have experience in cancer RNA-seq and make sure you understand what you're expecting from your data.


I see. You're right, we decided not to correct for cancer type and to use it as a predictor instead. Thanks for the help!
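
A rough sketch of what "use it as a predictor" could look like, using plain logistic regression in base R. This is an illustration only: the thread never names the model being used, and the `train` table, the `gene_x` and `cd8_score` columns, and the simulated effect sizes are all hypothetical.

```r
set.seed(42)
n <- 200

# Hypothetical training table: cancer type plus two expression-derived features
train <- data.frame(
  Cancer_Type = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)),
  gene_x      = rnorm(n),
  cd8_score   = rnorm(n)   # e.g. a deconvolution-derived cell abundance
)

# Simulated response that depends on the features and on the cancer type
type_effect <- c(A = -0.5, B = 0, C = 0.5, D = 1)[as.character(train$Cancer_Type)]
lin_pred    <- 0.8 * train$gene_x + 0.5 * train$cd8_score + type_effect
train$Benefit <- rbinom(n, 1, plogis(lin_pred))

# Cancer type enters as a categorical predictor; glm dummy-codes the factor
fit <- glm(Benefit ~ Cancer_Type + gene_x + cd8_score,
           data = train, family = binomial)

# Predicted probability of benefit for each sample
pred <- predict(fit, newdata = train, type = "response")
summary(fit)$coefficients
```

Here the expression values are left as they are and the model gets to estimate how much of the response is tied to cancer type, instead of having that signal removed beforehand.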

16 months ago
bk11 ★ 3.0k

A valid approach would be to run DE on each dataset separately and then perform a meta-analysis of the DE results using either of the following methods:

  1. https://cran.r-project.org/web/packages/metaRNASeq/vignettes/metaRNASeq.pdf
  2. https://www.bioconductor.org/packages/release/bioc/vignettes/metaSeq/inst/doc/metaSeq.pdf
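
As a rough illustration of the meta-analysis idea, here is Fisher's p-value combination written out in base R. It is a hand-rolled sketch, not the API of either package linked above, and the per-study p-values are simulated. It assumes the studies are independent and that the genes are in the same order in both vectors.

```r
set.seed(7)
n_genes <- 1000

# Hypothetical raw p-values from two separate per-study DE analyses
p_study1 <- runif(n_genes)
p_study2 <- runif(n_genes)

# Make the first 50 genes consistently significant in both studies
p_study1[1:50] <- runif(50, 0, 0.01)
p_study2[1:50] <- runif(50, 0, 0.01)

pvals <- cbind(p_study1, p_study2)

# Fisher's method: -2 * sum(log(p)) follows a chi-square distribution with
# 2k degrees of freedom under the null, where k is the number of studies
fisher_stat <- -2 * rowSums(log(pvals))
p_combined  <- pchisq(fisher_stat, df = 2 * ncol(pvals), lower.tail = FALSE)

# Benjamini-Hochberg adjustment across genes
p_adj <- p.adjust(p_combined, method = "BH")
sum(p_adj < 0.05)
```

Note that Fisher's method ignores the direction of change, so genes that are up in one study and down in another can still come out significant. It is worth checking the signs of the per-study log fold changes afterwards.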