sva + edgeR - differential expression analysis - RNA-seq data
8.9 years ago

Dear list,

I am performing an RNA-seq analysis for differential gene expression and I have a question regarding the use of the package sva for the estimation of unknown batch effects.

The sva vignette shows examples of estimating surrogate variables and then performing the DE analysis with the limma package (I am referring to Section 6 of the sva vignette: "Adjusting for surrogate variables using the limma package").

Is it possible to do the same using edgeR instead of limma?

Or is sva not compatible with edgeR?

Sorry if this is a dumb question. I am a little new to the bioinformatics world.

Thank you!

RNA-Seq edgeR sva • 7.9k views

For RNA-seq data, you should use the svaseq() function instead of sva(). That's true whether you're using limma voom, edgeR or DESeq. The author also recommends scaling and normalizing the counts before running SVA.

Basic example, assuming counts is your count matrix and clin is your clinical data file:

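library(edgeR)
library(sva)

y <- DGEList(counts)
y <- calcNormFactors(y)                      # TMM normalization factors
mod <- model.matrix(~ Condition, data=clin)  # full model
mod0 <- model.matrix(~ 1, data=clin)         # null model (intercept only)
svobj <- svaseq(cpm(y), mod, mod0)           # estimate surrogate variables from normalized CPM
des <- cbind(mod, svobj$sv)                  # append the surrogate variables to the design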

You can now proceed with des as your design matrix.


Hi Watson, thanks for the reply. I have a question about the next steps. Now that we have des as the design matrix, what should we pass as the data argument in disp = estimateDisp(???, design)?

8.9 years ago
h.mon 35k

In my (shallow) understanding, no: the sva manual suggests a log( g[ij] + c ) transformation, whereas edgeR models read counts with the negative binomial distribution and specifically states that only read counts should be used. You may use sva + voom + limma, or include the batch effects in your GLM and proceed with edgeR.
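For the second option, when the batch is actually recorded, a minimal sketch could look like the following. The simulated counts, the clin table, and the Batch and Condition column names are assumptions made up for illustration; only the edgeR calls come from the package itself.

```r
library(edgeR)

set.seed(2)
# Hypothetical example data: 500 genes x 8 samples
counts <- matrix(rnbinom(500 * 8, mu = 50, size = 5), nrow = 500,
                 dimnames = list(paste0("gene", 1:500), paste0("s", 1:8)))
clin <- data.frame(
  Condition = factor(rep(c("ctrl", "trt"), each = 4)),
  Batch     = factor(rep(c("b1", "b2"), times = 4))   # known batch label
)

# The batch enters the GLM as an additive term next to the condition
design <- model.matrix(~ Batch + Condition, data = clin)

y <- DGEList(counts)
y <- calcNormFactors(y)
y <- estimateDisp(y, design)   # raw counts stay in y; the batch lives in the design
fit <- glmQLFit(y, design)

# Test the condition effect, adjusted for batch
cond_coef <- which(colnames(design) == "Conditiontrt")
res <- glmQLFTest(fit, coef = cond_coef)
topTags(res)
```

The counts are never batch-corrected here; edgeR still sees raw counts, and the adjustment happens entirely through the design matrix.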

8.9 years ago

You end up just adding columns to your model matrix in edgeR. Here's a similar discussion about SVA and DESeq2: Batch effect in DESeq2 - multiple factor or SVA?

The same principles apply.
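As a rough sketch of what "adding columns to your model matrix" can look like end to end: the surrogate variables go into the design, and the DGEList of raw counts (not the CPM matrix) is what gets passed to estimateDisp(). The simulated counts, the hidden batch, and the object names below are assumptions for illustration, and the guard for zero surrogate variables is a precaution rather than something taken from the sva vignette.

```r
library(edgeR)
library(sva)

set.seed(1)
# Hypothetical data: 1000 genes x 12 samples with an unrecorded batch
n_genes   <- 1000
condition <- factor(rep(c("ctrl", "trt"), each = 6))
hidden    <- rep(c(0, 1), times = 6)          # balanced across conditions
mu <- matrix(100, n_genes, 12)
mu[1:200, hidden == 1] <- 300                 # batch shifts the first 200 genes
counts <- matrix(rnbinom(n_genes * 12, mu = mu, size = 10), nrow = n_genes)
rownames(counts) <- paste0("gene", seq_len(n_genes))
clin <- data.frame(Condition = condition)

y <- DGEList(counts)
y <- calcNormFactors(y)

mod  <- model.matrix(~ Condition, data = clin)
mod0 <- model.matrix(~ 1, data = clin)
svobj <- svaseq(cpm(y), mod, mod0)            # SVs estimated from normalized CPM

# Append the surrogate variables only if svaseq found any
n_sv <- svobj$n.sv
if (n_sv > 0) {
  sv <- as.matrix(svobj$sv)
  colnames(sv) <- paste0("SV", seq_len(n_sv))
  des <- cbind(mod, sv)
} else {
  des <- mod
}

# edgeR works on the raw counts in y; the SVs enter only through the design
y   <- estimateDisp(y, des)
fit <- glmQLFit(y, des)
cond_coef <- which(colnames(des) == "Conditiontrt")
res <- glmQLFTest(fit, coef = cond_coef)
topTags(res)
```

Testing only the Condition coefficient keeps the surrogate variables as nuisance covariates, which mirrors how the limma example in the sva vignette treats them.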
