Entering edit mode
7.0 years ago
Tania
▴
180
Hello Everyone
I am new to edgeR. I am expecting weird gene expressions. I want to double check my code steps before discovering where else the problem could be. Do these steps make sense to you?
txi <- tximport(files, type = "salmon", tx2gene = tx2gene,ignoreTxVersion = TRUE)
write.csv(txi$counts, "counts.csv")
cts <- txi$counts
First 10 samples are control second 10 samples are tumor:
group = factor(c(rep("Control", 10), rep("Tumor",10)) )
dge = DGEList(counts=cts, genes= rownames(data), group=group)
countsPerMillion <- cpm(dge)
summary(countsPerMillion)
countCheck <- countsPerMillion > 1
summary(countCheck)
keep <- which(rowSums(countCheck) >= 2)
dge <- dge[keep,]
summary(cpm(dge))
dge <- calcNormFactors(dge, method="TMM")
dge <- estimateCommonDisp(dge)
dge <- estimateTagwiseDisp(dge)
dge <- estimateTrendedDisp(dge)
et <- exactTest(dge, pair=c("Control", "Tumor"))
etp <- topTags(et, n=100000)
write.csv(etp$table, "dge.csv")
Thanks
So are you getting normal results instead :)
On a serious note, what is unexpected about the results you are getting?
=D Sorry I mean found weird results :) what I expected to be enriched in tumor are enriched in control. I want to make sure code is right before investigating more.
Why didn't you just use
estimateDisp
? It estimates Common, tagwise and trended dispersions in one run :D, also I would change the value of topTags, since the default value seems to be 1, you would really be getting a list of the first 100,000 genes sorted by pvalue.Ok, Thanks biofalconch :) Is this line correct?
It seems alright to me, it really depends on the counts matrix though. A PCA might be useful in this case