Question

edgeR direction of expression and sign of Log Fold Changes

8.8 years ago

ilyco ▴ 60

Hi,

I used edgeR for differential expression analysis with 5 conditions relative to the baseline condition.

The design was a simple linear model with the condition factor variable re-ordered so that the baseline is the first value. The code was the following:

design <- model.matrix(~ condition, data = y$samples)
y <- estimateDisp(y, design, robust = TRUE)
fit <- glmFit(y, design)
conditionA_minus_base <- glmTreat(fit, coef = "conditionA", lfc = minlfc) ### coefficient corresponds to A - baseline
up_A <- rownames(y)[decideTestsDGE(conditionA_minus_base, p.value = pvalue, adjust.method = "fdr") == 1]
down_A <- rownames(y)[decideTestsDGE(conditionA_minus_base, p.value = pvalue, adjust.method = "fdr") == -1]

However, when I checked the down-regulated genes, they were enriched for many terms that are known to be up-regulated. All in all, the direction seems reversed for a large majority of genes. I checked the labeling and pre-processing steps many times. Could you please let me know if the 1 and -1 values should be the other way around?

I tested two designs against each other:

design1 <- model.matrix(~ condition, data = y$samples)
design2 <- model.matrix(~ 0 + condition, data = y$samples)

Results are the same from:

conditionA_minus_base1 <- glmTreat(fit, coef = "conditionA", lfc = minlfc) ### coefficient corresponds to A - baseline
conditionA_minus_base2 <- glmTreat(fit, contrast = c(-1,1,0,0,0,0), lfc = minlfc)

where contrast = c(-1,1,0,0,0,0) corresponds to -1 * baseline + 1 * condition A
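
For readers following along, the equivalence of the two parameterizations can be checked on toy numbers with base R alone. This is only a sketch: the expression values, the group labels B to E, and the names `expr`, `fit1`, `fit2`, `lfc1` and `lfc2` are made up for illustration, and `lm.fit` stands in for edgeR's GLM fit. Note that the contrast vector has the "A minus baseline" meaning only for a fit made with the no-intercept design, so each design needs its own fit object.

```r
# Toy data: baseline plus five conditions, two samples each (values are hypothetical).
condition <- factor(rep(c("baseline", "A", "B", "C", "D", "E"), each = 2),
                    levels = c("baseline", "A", "B", "C", "D", "E"))
expr <- c(5, 7, 9, 11, 4, 6, 6, 6, 8, 10, 3, 5)  # made-up log-expression for one gene

design1 <- model.matrix(~ condition)      # intercept = baseline; other columns = X - baseline
design2 <- model.matrix(~ 0 + condition)  # one column per group mean

# Each design gets its own fit; lm.fit is used here only as a stand-in for glmFit.
fit1 <- lm.fit(design1, expr)
fit2 <- lm.fit(design2, expr)

# -1 * baseline + 1 * A, in the column order of design2
contrast <- c(-1, 1, 0, 0, 0, 0)

lfc1 <- unname(fit1$coefficients["conditionA"])  # coefficient from the intercept design
lfc2 <- sum(contrast * fit2$coefficients)        # contrast from the no-intercept design

# Both equal mean(A) - mean(baseline) = 10 - 6 = 4
c(lfc1 = lfc1, lfc2 = lfc2)
```

In both cases a positive value means higher expression in A than in baseline. Applying the same contrast vector to the intercept-design fit would instead compute the conditionA coefficient minus the intercept, which is a different quantity.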

Thank you.

RNA-Seq edgeR R LFC

As long as the ordering is correct then what you're doing should work. The most common mistake here is when making the condition column in y$samples. Triple check that nothing is swapped there (hint: if you aren't already, load this from a text file).

Comment written 8.8 years ago by Devon Ryan 105k
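
One way to do the check suggested in this comment is to read the sample sheet from text and align it to the count-matrix columns by name rather than by position. The sketch below is base R only; the sample names, the inline sheet, and `count_columns` (standing in for `colnames(y)`) are hypothetical.

```r
# Hypothetical sample sheet; in practice this would be read from a tab-delimited file.
sample_sheet <- read.delim(text = "sample\tcondition
s3\tA
s1\tbaseline
s2\tbaseline
s4\tA", stringsAsFactors = FALSE)

# Stand-in for colnames(y): the order of samples in the count matrix.
count_columns <- c("s1", "s2", "s3", "s4")

# Reorder the sheet to match the count columns by name, never by position.
idx <- match(count_columns, sample_sheet$sample)
stopifnot(!anyNA(idx))                 # every count column must appear in the sheet
sample_sheet <- sample_sheet[idx, ]

# Eyeball the final pairing before building the design matrix.
print(data.frame(sample = count_columns, condition = sample_sheet$condition))
```

If the sheet had been used in its original row order here, samples s1 and s3 would have received each other's labels, which is exactly the kind of swap that flips the apparent direction of change.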

score 0 · Answer 1 · 2016-07-24

8.7 years ago

Gordon Smyth ★ 7.9k

Your code looks correct. Your up_A does contain genes up-regulated in condition A vs whatever you set as the reference level of 'condition', and down_A does contain genes down-regulated in condition A.
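
Since the sign depends entirely on the reference level, it can be worth confirming which level R actually treats as the reference. A minimal base-R sketch, with made-up group labels:

```r
# Made-up labels; factor() orders levels alphabetically unless told otherwise.
condition <- factor(c("baseline", "baseline", "A", "A", "B", "B"))
levels(condition)[1]                      # the current reference level

# Make "baseline" the reference explicitly.
condition <- relevel(condition, ref = "baseline")
design <- model.matrix(~ condition)

# The reference level has no column of its own; it is absorbed into the intercept.
# Every other column is "that level minus the reference".
colnames(design)
```

With this setup the `conditionA` coefficient is A minus baseline, so a positive log fold change means higher expression in A. If the reference were left at its alphabetical default, there would be no `conditionA` column at all and the remaining coefficients would be relative to A.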
