I want to use removeBatchEffect from limma on my RNA data, but I don't know what form the input has to take.
I know that I need log-expression values but I don't know what type of normalization is accepted.
removeBatchEffect(x, batch=NULL, batch2=NULL, covariates=NULL, design=matrix(1,ncol(x),1), ...)
According to the documentation of the arguments:
x: numeric matrix, or any data object that can be processed by getEAWP containing log-expression values for a series of samples. Rows correspond to probes and columns to samples.
If we go to getEAWP, we see that:
getEAWP(object)
object: any matrix-like object containing log-expression values. Can be an object of class MAList, EList, marrayNorm, PLMset, vsn, or any class inheriting from ExpressionSet, or any object that can be coerced to a numeric matrix.
Object should be a normalized data object. getEAWP will return an error if object is a non-normalized data object such as RGList or EListRaw, because these do not contain log-expression values.
It doesn't say anything about requiring a particular normalization. So, would it be possible to work with log2(TPM counts + 1)?
Thanks very much in advance
I would choose the input depending on which methods you are using for the differential expression pipeline and downstream analyses: edgeR, limma-voom or DESeq2? All of these have different methods to provide values on the log scale, e.g. logCPM(), cpm(), vst() or rlog() (see for example this answer).
Thanks very much for your kind reply.
For downstream analyses I would like to use log2(TPM + 1), so I suppose that I could work directly with the result given by removeBatchEffect if I use the same log2-normalized values (log2(TPM + 1)) as my input? I just wanted to be sure, because the documentation doesn't say anything (and I haven't seen anything elsewhere) about the type of normalization, so I was wondering whether anyone knows if it needs a particular one.
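To make the question concrete, here is a minimal sketch of the call being discussed, assuming the TPM values are already in a genes-by-samples numeric matrix. The toy matrix `tpm`, the sample names and the `batch` labels are all made up for illustration. It only shows that a log2(TPM + 1) matrix is accepted as `x`, not that this is the recommended normalization.

```r
library(limma)

# Hypothetical toy data: 6 genes x 4 samples of TPM values (invented for illustration).
set.seed(1)
tpm <- matrix(rexp(24, rate = 0.01), nrow = 6,
              dimnames = list(paste0("gene", 1:6), paste0("s", 1:4)))

# Hypothetical batch labels, one per column of tpm.
batch <- factor(c("A", "A", "B", "B"))

# removeBatchEffect expects log-expression values, so transform first.
log_tpm <- log2(tpm + 1)

# Fit a per-gene linear model with a batch term and subtract the batch component.
# The result is a numeric matrix with the same dimensions as the input.
log_tpm_corrected <- removeBatchEffect(log_tpm, batch = batch)
```

If there are biological groups that should be preserved, they can be supplied through the `design` argument shown in the usage above, so that the batch term is estimated alongside them rather than absorbing them.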