4.2 years ago
far.zi
Hi, I've been doing RNA-Seq analysis in R. However, in the final edgeR_result file the logCPM values are very high, which cannot be right for the genes I'm studying, since they are expressed at extremely low levels. I've checked my count table many times and found no mistake there. Can someone please help me figure out why I'm getting such high values? Thanks. I've also included the R script I use, in case it's needed:
cts <- read.csv("normCounts.txt", sep = '\t', row.names = 1, header = FALSE)
group <- (c(1,1,1,1,2,2,2,2))
cts <- DGEList(cts, group)
#cts <- calcNormFactors(y)
design <- model.matrix(~group)
cts <- estimateDisp(cts, design)
fit <- glmQLFit(cts, design)
qlf <- glmQLFTest(fit, coef = 2)
write.table(qlf$table, file = "edgeR_result.txt")
res <- qlf$table
ggplot(res, aes(x=logCPM, y=logFC, color=PValue < 0.05)) +
geom_point() + geom_rug(sides = "l") + theme_minimal()
fit2 <- glmFit(cts, design)
Are you feeding normalized counts into edgeR? Is it correct that you commented out (#-ed) the calcNormFactors call? Please add the plot you created.
Yes, I am feeding the normalized values.
That is not correct. Please read the manual. You must start from raw counts. There are also very few points in this plot. How many in total?
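To make "start from raw counts" concrete, here is a minimal sketch of the same pipeline run on raw integer counts with calcNormFactors left enabled. The count matrix is simulated only so the snippet runs on its own; the feature and sample names and the simulation parameters are assumptions, not the poster's data. In practice the raw count table would be read from file instead.

```r
# Sketch only: simulated raw counts stand in for a real raw count table.
library(edgeR)

set.seed(1)
n_genes <- 478   # feature count mentioned in the thread
raw <- matrix(rnbinom(n_genes * 8, mu = 20, size = 5), nrow = n_genes,
              dimnames = list(paste0("feature", seq_len(n_genes)),
                              paste0("sample", 1:8)))
group <- factor(c(1, 1, 1, 1, 2, 2, 2, 2))   # two groups of four samples

y <- DGEList(counts = raw, group = group)    # raw integer counts, not normalized values
y <- calcNormFactors(y)                      # TMM normalization factors, not commented out
design <- model.matrix(~group)
y <- estimateDisp(y, design)
fit <- glmQLFit(y, design)
qlf <- glmQLFTest(fit, coef = 2)
res <- topTags(qlf, n = Inf)$table           # logFC, logCPM, F, PValue, FDR per feature
```

Treating group as a factor keeps the design matrix a two-group comparison even if the labels are later changed to something other than 1 and 2.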
OMG, I thought I needed the normalized counts. The total is 478.
That is a very low gene number, so this is not standard RNA-seq, right? And yes, use raw counts. I suggest you go through the edgeR manual before continuing.
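A possible contributor to the high logCPM values, offered as an illustration and not a diagnosis: CPM is relative to library size, and DGEList takes the column sums of the supplied table as library sizes by default. If the table holds only a few hundred features, those totals are small and every CPM is inflated. The numbers below are hypothetical, and the calculation is simplified (edgeR's reported logCPM also adds a prior count and uses effective library sizes).

```r
# Illustration with made-up numbers: the same count as CPM under two library sizes.
count <- 5
lib_small <- 1e3   # hypothetical total when only a few hundred features are counted
lib_large <- 2e7   # hypothetical total for a whole-transcriptome library

cpm_small <- count / lib_small * 1e6   # 5000 counts per million
cpm_large <- count / lib_large * 1e6   # 0.25 counts per million

log2(cpm_small)   # large positive value
log2(cpm_large)   # -2
```

If this is what is happening, passing the full per-sample library sizes through the lib.size argument of DGEList may be worth considering.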
Those are "Mirtron" reads. I'll also check the manual again. Thank you :)