I am fairly new to processing RNA-seq data. My current pipeline feeds counts per million (CPM) into edgeR and then performs the voom transformation (limma package).
My question is whether this same pipeline can be applied to TPM values from RNA-seq?
Apologies, I feed counts into edgeR and then run voom to get logCPM (limma-trend for differential expression).
I am wondering whether the last steps of this pipeline (limma-trend) can be used for logTPM?
No. Neither TPM nor CPM should be fed into edgeR or voom; both expect raw counts, because the library sizes are used in the normalization process.
For example, the voom manual states:
Description:

     Transform count data to log2-counts per million (logCPM), estimate
     the mean-variance relationship and use this to compute appropriate
     observation-level weights. The data are then ready for linear
     modelling.

Usage:

     voom(counts, design = NULL, lib.size = NULL, normalize.method = "none",
          block = NULL, correlation = NULL, weights = NULL,
          span = 0.5, plot = FALSE, save.plot = FALSE)

Arguments:

  counts: a numeric ‘matrix’ containing raw counts, or an
          ‘ExpressionSet’ containing raw counts, or a ‘DGEList’ object.
          Counts must be non-negative and NAs are not permitted.
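
As a rough illustration of what "raw counts in" looks like in practice, here is a minimal sketch of a counts-based edgeR/limma workflow. The count matrix, the sample names and the two-group `group` factor are simulated placeholders (assumptions for the example, not your data), and the filtering and normalization choices are just common defaults, so adapt them to your own design.

```r
# Minimal sketch: raw counts -> DGEList -> TMM factors -> voom -> limma.
# All data below are simulated placeholders, not real measurements.
library(limma)
library(edgeR)

set.seed(1)
n_genes <- 1000
group <- factor(rep(c("control", "treated"), each = 3))  # hypothetical groups
counts <- matrix(
  rnbinom(n_genes * 6, mu = 100, size = 5),              # fake raw counts
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", 1:6))
)

design <- model.matrix(~ group)

# Build the DGEList from RAW counts, drop low-count genes, then compute
# TMM scaling factors (these rely on the library sizes).
dge <- DGEList(counts = counts, group = group)
keep <- filterByExpr(dge, design)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)

# Option 1: voom turns counts into logCPM plus precision weights.
v <- voom(dge, design, plot = FALSE)
fit <- eBayes(lmFit(v, design))
top <- topTable(fit, coef = 2, number = 10)

# Option 2: limma-trend works on logCPM computed from the same DGEList.
logcpm <- cpm(dge, log = TRUE, prior.count = 3)
fit_trend <- eBayes(lmFit(logcpm, design), trend = TRUE)
top_trend <- topTable(fit_trend, coef = 2, number = 10)
```

In both options the logCPM values are derived inside the workflow from the raw counts and their library sizes, rather than being supplied pre-normalized.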
I think I understand correctly now. I think I will need to log transform the TPM values.
Quick question: can these logTPM values be fed into limma for differential analysis? This is what I currently do for logCPM (from voom).
Many thanks for your help
The answer is no. You cannot use RPKM, FPKM or TPM values for differential expression analysis, whether they are raw, normalized or log-transformed.
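
One way to see part of the problem, tying back to the point above that library sizes are used in the normalization, is a toy calculation in base R. The counts and gene lengths below are made up purely for illustration: sample B has the same composition as sample A but is sequenced ten times deeper.

```r
# Toy example with invented numbers: same composition, 10x different depth.
counts <- matrix(
  c(10, 200, 790,        # sample A
    100, 2000, 7900),    # sample B
  ncol = 2,
  dimnames = list(c("g1", "g2", "g3"), c("A", "B"))
)
gene_length_kb <- c(g1 = 1, g2 = 2, g3 = 4)  # hypothetical gene lengths

# Library sizes differ by a factor of 10.
lib_size <- colSums(counts)

# TPM: divide each gene's count by its length, then rescale every sample
# so that its values sum to one million.
rate <- counts / gene_length_kb              # row i is divided by length i
tpm <- t(t(rate) / colSums(rate)) * 1e6

# Every TPM column sums to 1e6, and here the two columns are identical,
# even though g1 is backed by 10 reads in A and 100 reads in B.
tpm_col_sums <- colSums(tpm)
same_tpm <- isTRUE(all.equal(tpm[, "A"], tpm[, "B"]))
```

Once the data are in TPM, the information about how many reads support each value is no longer there, and that is exactly what the count-based mean-variance modelling relies on.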
Are you giving something other than raw read counts to edgeR?
Apologies, I feed counts into edgeR and then run voom to get logCPM (limma-trend for differential expression). I am wondering whether the last steps of this pipeline (limma-trend) can be used for logTPM?