I'm using edgeR to perform differential expression analysis on data from an RNA-seq experiment.
I have 6 tumor cell samples, all from the same tumor type and with the same treatment: 3 patients with a good prognosis and 3 patients with a bad prognosis. I want to compare gene expression between the two groups.
I ran the edgeR package as follows:
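library(edgeR)
# Read the raw count table; gene IDs become the row names
x <- read.delim("my_reads_count.txt", row.names="GENE")
# Columns 1-3 belong to group 1, columns 4-6 to group 2
group <- factor(c(1,1,1,2,2,2))
y <- DGEList(counts=x,group=group)
y <- calcNormFactors(y)       # per-sample normalization factors
y <- estimateCommonDisp(y)    # one dispersion shared by all genes
y <- estimateTagwiseDisp(y)   # gene-wise dispersions
et <- exactTest(y)            # logFC is group 2 relative to group 1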
I obtained very odd results: for some genes the p-value and FDR are very low, but looking at the raw data it is obvious that the difference between the two groups cannot be significant.
This is an example from my_reads_count.txt:
GENE sample1_1 sample1_2 sample1_3 sample2_1 sample2_2 sample2_3
ENSG00000198842 0 3 2 2 6666 3
ENSG00000257017 3 3 25 2002 29080 4
And the corresponding rows from my_edgeR_resulta.txt:
GENE logFC logCPM PValue FDR
ENSG00000198842 9.863211e+00 5.4879462930 5.368843e-07 1.953612e-04
ENSG00000257017 9.500927e+00 7.7139869397 8.072384e-10 7.171947e-07
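To make the problem concrete, here is a minimal sketch in base R (no edgeR needed) that uses only the two example rows above. It compares the per-group mean with the per-group median and computes how much of the group 2 total comes from its single largest sample. The variable names are my own, chosen just for illustration.

counts <- matrix(
  c(0, 3,  2,    2,  6666, 3,
    3, 3, 25, 2002, 29080, 4),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("ENSG00000198842", "ENSG00000257017"),
    c("sample1_1", "sample1_2", "sample1_3",
      "sample2_1", "sample2_2", "sample2_3")
  )
)
group <- factor(c(1, 1, 1, 2, 2, 2))

# Per-group mean and median of the raw counts (rows = genes, columns = groups)
group_mean   <- t(apply(counts, 1, function(v) tapply(v, group, mean)))
group_median <- t(apply(counts, 1, function(v) tapply(v, group, median)))

# Share of the group 2 total contributed by its largest sample
top_share <- apply(counts[, group == 2], 1, function(v) max(v) / sum(v))

print(group_mean)
print(group_median)
print(round(top_share, 3))

For ENSG00000198842 the group medians are 2 and 3, while the group 2 mean is above 2000 because sample2_2 alone accounts for more than 99% of the group 2 counts. So the large logFC seems to come from one sample rather than from a consistent difference between the groups.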
I would like the variance within each group to be taken into account. Can anyone help me? Any suggestions?
Is your raw data normalized?
The raw data are the counts of reads mapping within exons (obtained by running htseq-count). The normalization is performed by calcNormFactors(y). Am I correct?
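One way to check what calcNormFactors(y) does is the minimal sketch below. It assumes edgeR is installed and uses a small simulated count matrix (the toy object is hypothetical, not my real data), so the numbers themselves mean nothing. As far as I understand, the function leaves the counts untouched and only stores one scaling factor per sample in y$samples$norm.factors, which is then combined with the library size.

library(edgeR)

set.seed(1)
# Hypothetical toy counts: 200 genes x 6 samples (NOT the real data)
toy <- matrix(rnbinom(1200, mu = 50, size = 5), nrow = 200,
              dimnames = list(paste0("gene", 1:200), paste0("s", 1:6)))
group <- factor(c(1, 1, 1, 2, 2, 2))

y <- DGEList(counts = toy, group = group)
raw_before <- y$counts
y <- calcNormFactors(y)

# The count matrix is the same before and after normalization
counts_unchanged <- identical(raw_before, y$counts)
print(counts_unchanged)

# Only the per-sample factors are filled in
print(y$samples[, c("lib.size", "norm.factors")])

# Effective library size used downstream = lib.size * norm.factors
eff_lib <- y$samples$lib.size * y$samples$norm.factors
print(eff_lib)

So the counts given to DGEList should stay raw, as they come out of htseq-count, and the normalization enters the later steps through these factors.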