Hello biostars!
I want to build a model that predicts disease from RNA-seq data. The training data were normalized with limma voom, and the model was trained with randomForest.
However, new input data have not been normalized in the same way as the training data. If I had to normalize the new data together with the training data, the normalized training values would change and I would need to train a new prediction model on the newly normalized data.
Please let me know how to normalize new data in the same way as the training data, so that the fixed model can still be used for prediction.
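
    library(edgeR)  # DGEList, cpm, calcNormFactors
    library(limma)  # voom
    # RNAseq_train: training counts (genes in rows, samples in columns)
    # RNAseq_test:  test counts (cannot be combined with the RNAseq_train samples)
    # Design and keep_n are defined elsewhere (not shown)
    DesignMatrix <- model.matrix(~0 + Design)
    # Keep genes with CPM >= 1 in at least keep_n training samples
    keep <- rowSums(cpm(RNAseq_train) >= 1) >= keep_n
    RNAseq_train_filtered <- RNAseq_train[keep, ]
    # TMM normalization factors, then voom log2-CPM values
    DGE <- DGEList(RNAseq_train_filtered)
    DGE <- calcNormFactors(DGE, method = "TMM")
    RNAseq_train_normalized <- voom(DGE, DesignMatrix, plot = FALSE)$E
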
But how can `RNAseq_test` be normalized in the same way as `RNAseq_train_normalized`? (`RNAseq_test` is a new cohort that only becomes available after the model has been trained.)
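For context, the `E` values that `voom` returns are computed sample by sample as log2((count + 0.5) / (lib.size * norm.factor + 1) * 1e6), so the cohort-dependent pieces in the code above are the gene filter (`keep`) and the TMM factors. Below is a minimal base-R sketch of applying that same transformation to new samples. It assumes the test matrix uses the same gene row names as the training matrix. The helper name `voom_like_logcpm`, the toy matrices, and the simple count-based filter are hypothetical, and the normalization factors default to 1 (no TMM), which is a simplification and not an exact match of the training pipeline.

```r
# Hypothetical helper: voom-style log2-CPM for a count matrix (genes x samples).
# norm_factors defaults to 1 for every sample (assumption: no TMM scaling).
voom_like_logcpm <- function(counts, norm_factors = rep(1, ncol(counts))) {
  counts <- as.matrix(counts)
  # Effective library size = column sum x normalization factor
  lib_size <- colSums(counts) * norm_factors
  # Per-sample formula used for voom's E:
  # log2((count + 0.5) / (lib.size + 1) * 1e6)
  t(log2(t(counts + 0.5) / (lib_size + 1) * 1e6))
}

# Toy data standing in for RNAseq_train / RNAseq_test (made-up counts)
toy_train <- matrix(c(10, 0, 200, 35,
                      12, 1, 180, 40),
                    nrow = 4,
                    dimnames = list(paste0("gene", 1:4), c("s1", "s2")))
toy_test <- matrix(c(8, 3, 150, 20), nrow = 4,
                   dimnames = list(paste0("gene", 1:4), "new1"))

# Gene set decided on the training data only (here: total count >= 5,
# a stand-in for the CPM-based `keep` filter in the post)
keep_genes <- rownames(toy_train)[rowSums(toy_train) >= 5]

# Apply the frozen gene set to the new samples, then transform them
toy_test_normalized <- voom_like_logcpm(toy_test[keep_genes, , drop = FALSE])
print(toy_test_normalized)
```

Each new sample is transformed using only its own counts and the gene set frozen at training time, so the trained model does not need to be refit. How to obtain TMM factors for new samples that are consistent with the training cohort is the part this sketch leaves open.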
Please use the formatting bar (especially the `code` option) to present your post better. You can use backticks for inline code (`text` becomes text), or select a chunk of text and use the highlighted button to format it as a code block. I've done it for you this time.