Question

RNAseq normalization for new data

5.5 years ago

sosal ▴ 10

Hello biostars!

I want to build a model that predicts disease from RNA-seq data. The training data were normalized with limma voom, and the model was trained with randomForest.

However, new input data have not been normalized in the same way as the training data. If I normalize the new data together with the training data, I have to build a new prediction model on the re-normalized data.

Please let me know how to normalize new data in the same way as the training data, so that the fixed model can be used for prediction.

RNAseq_train # Training data

RNAseq_test # Test data (but cannot be combined with the RNAseq_train samples)

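library(limma)  # voom()
library(edgeR)  # cpm(), DGEList(), calcNormFactors()

DesignMatrix <- model.matrix(~ 0 + Design)
# Keep genes with CPM >= 1 in at least keep_n training samples
keep <- rowSums(cpm(RNAseq_train) >= 1) >= keep_n
RNAseq_train_filtered <- RNAseq_train[keep, ]

DGE <- DGEList(RNAseq_train_filtered)
DGE <- calcNormFactors(DGE, method = "TMM")
RNAseq_train_normalized <- voom(DGE, DesignMatrix, plot = FALSE)$E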

But how can RNAseq_test be normalized in the same way as RNAseq_train_normalized? (RNAseq_test is data from a new cohort that arrives after the model has been trained.)
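One possible direction, offered only as a sketch and not as an answer confirmed in this thread: the `$E` matrix returned by voom is a per-sample log2-CPM, log2((count + 0.5) / (lib.size + 1) * 1e6), so the same transform can be applied to each new sample without pooling it with the training samples. The helper name `logcpm_like_voom` and the toy data are hypothetical. The sketch reuses the gene set kept during training and, as a simplifying assumption, uses plain library sizes with no TMM factors, because TMM factors are estimated relative to the other samples in a cohort.

```r
# Hypothetical helper (not part of limma or edgeR): voom-style log2-CPM
# computed sample by sample, restricted to the genes kept in training.
logcpm_like_voom <- function(counts, genes) {
  counts <- as.matrix(counts)
  missing_genes <- setdiff(genes, rownames(counts))
  if (length(missing_genes) > 0) {
    stop("Genes absent from the new data: ",
         paste(head(missing_genes), collapse = ", "))
  }
  counts <- counts[genes, , drop = FALSE]  # same genes, same order as training
  lib_size <- colSums(counts)              # assumption: no TMM scaling applied
  # Same formula voom uses for $E, applied to each sample (column)
  t(log2(t(counts + 0.5) / (lib_size + 1) * 1e6))
}

# Toy counts standing in for RNAseq_test (3 genes x 2 samples)
toy_test <- matrix(c(10, 0, 5, 20, 3, 7), nrow = 3,
                   dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))
# In the real workflow this would be rownames(RNAseq_train_filtered)
kept_genes <- c("g1", "g3")
RNAseq_test_normalized <- logcpm_like_voom(toy_test, kept_genes)
```

Because the training values above included TMM factors, results from this sketch would match the training scale only approximately; whether that is acceptable for the randomForest model would need to be checked on held-out data.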

normalization RNA-Seq voom • 1.1k views

ADD COMMENT • link updated 17 months ago by Ram 44k • written 5.5 years ago by sosal ▴ 10

Please use the formatting bar (especially the code option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.

ADD REPLY • link 5.5 years ago by Ram 44k