Question

Using offsets in edgeR to correct GC content

0

Entering edit mode

8.9 years ago

Dawe ▴ 270

Hi all, I'm performing some differential analysis with edgeR and I need to take into account GC content in my regions. From edgeR manual I see I should set GC percentage as "offset" in linear model. Unfortunately there is very little documentation about this, so I need some hint.

Offsets can be passed both to estimateDisp and glmFit functions, but it is not clear how they are handled in the two steps. Setting offset in estimateDisp, i.e.

y = DGEList(…)
y = calcNormFactors(y)
y$offset = gc_content #this is a vector with %GC for every region in y 
y = estimateDisp(y, design)

increases the dispersion values (common, tagwise and trended) in the DGEList object, with negative effect on final pvalues and FDR. Setting it at GLM, i.e.

y = DGEList(…)
y = calcNormFactors(y)
y = estimateDisp(y, design)
fit = glmFit.default(y, design, offset=gc_content, dispersion=y$tagwise.dispersion)

does exactly the opposite: all p-values and FDR become "incredibly" significant, also average logCPM for each region is increased 10-fold (which, I suspect, is the reason why I have lower p-values).

What would be the proper way to include GC information in edgeR?

d

edger gc linear models R • 3.5k views

ADD COMMENT • link updated 8.6 years ago by javiergutierreza ▴ 40 • written 8.9 years ago by Dawe ▴ 270

0

Entering edit mode

Do you have a notable GC bias between some of the samples? If not, don't bother doing this.

ADD REPLY • link 8.9 years ago by Devon Ryan 105k

0

Entering edit mode

Influence of GC in standard differential expression analysis is limited, since you are comparing gene A versus gene A. Even if it's a high GC gene, it's the same for both samples and therefore no need to correct for.

ADD REPLY • link 8.9 years ago by WouterDeCoster 48k

score 1 · Answer 1 · 2016-09-21

1

Entering edit mode

8.6 years ago

javiergutierreza ▴ 40

Check the EDASeq manual (http://www.bioconductor.org/packages/release/bioc/html/EDASeq.html). You will find a perfect example for your question.

ADD COMMENT • link 8.6 years ago by javiergutierreza ▴ 40