2.3 years ago
camelest
Hi, I have a question about the edgeR cpm() function. Following the user manual, the standard workflow is
x <- read.delim("TableOfCounts.txt",row.names="Symbol")
group <- factor(c(1,1,2,2))
y <- DGEList(counts=x,group=group)
keep <- filterByExpr(y)
y <- y[keep,,keep.lib.sizes=FALSE]
y <- calcNormFactors(y)  # point 1
design <- model.matrix(~group)
y <- estimateDisp(y,design)  # point 2
When we want a table of log2-CPM values from the cpm() function, for example for clustering or a heatmap, as in
logcpm <- cpm(y, log=TRUE)
should it be called at point 1, just after calculating the normalization factors, or at point 2, after estimating the dispersion? Is it OK as long as it comes after calcNormFactors()?
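One way to check this empirically might be to compute the log-CPM table at both points and compare them. The sketch below is only an illustration: it assumes edgeR is installed, and it uses simulated counts (the `counts` matrix and the names `logcpm1`, `logcpm2`, and `same` are made up for this example) in place of TableOfCounts.txt.

```r
library(edgeR)

set.seed(1)
# Simulated counts: 1000 genes x 4 samples (made-up data, for illustration only)
counts <- matrix(rnbinom(4000, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))
group <- factor(c(1, 1, 2, 2))

y <- DGEList(counts = counts, group = group)
keep <- filterByExpr(y)
y <- y[keep, , keep.lib.sizes = FALSE]
y <- calcNormFactors(y)
logcpm1 <- cpm(y, log = TRUE)   # computed at point 1

design <- model.matrix(~group)
y <- estimateDisp(y, design)
logcpm2 <- cpm(y, log = TRUE)   # computed at point 2

# Compare the two tables (up to numerical tolerance)
same <- isTRUE(all.equal(logcpm1, logcpm2))
print(same)
```

If `same` is TRUE on real data as well, that would suggest the position of the cpm() call relative to estimateDisp() does not matter, as long as it comes after calcNormFactors().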
Thank you so much for the clarification.