Dear list,
I am performing an RNA-seq analysis for differential gene expression and I have a question regarding the use of the package sva for the estimation of unknown batch effects.
The sva vignette shows examples of estimating surrogate variables and then performing DE analysis with the limma package (I am referring to section 6 of the sva vignette: "Adjusting for surrogate variables using the limma package").
Is it possible to do the same using the edgeR package instead of limma?
Or is sva not compatible with edgeR?
Sorry if this is a dumb question. I am a little new to the bioinformatics world.
Thank you!
For RNA-seq data, you should use the svaseq() function instead of sva(). That's true whether you're using limma-voom, edgeR or DESeq. The author also recommends scaling and normalizing the counts before running SVA.
Basic example, assuming counts is your count matrix and clin is your clinical data file:
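A minimal sketch of such an example, not necessarily the exact code the answer had in mind. It assumes clin has a column named group (a hypothetical name) holding the condition of interest. The simulated counts and clin are placeholders so that the block runs on its own; replace them with your own objects.

```r
library(edgeR)
library(sva)

# Placeholder data (assumption): 1000 genes, 12 samples, with a hidden
# batch effect on the first 300 genes. Replace with your own counts and clin.
set.seed(1)
batch <- rep(c(1, 2), times = 6)
mu <- matrix(100, nrow = 1000, ncol = 12)
mu[1:300, batch == 2] <- 300
counts <- matrix(rnbinom(length(mu), mu = mu, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))
clin <- data.frame(group = factor(rep(c("control", "treated"), each = 6)),
                   row.names = colnames(counts))

# Filter low-count genes, then scale and normalize before running SVA.
dge <- DGEList(counts = counts)
keep <- filterByExpr(dge, group = clin$group)
dge <- dge[keep, , keep.lib.sizes = FALSE]
dge <- calcNormFactors(dge)
norm_counts <- cpm(dge)

# Full model (variable of interest) and null model (intercept only).
mod1 <- model.matrix(~ group, data = clin)
mod0 <- model.matrix(~ 1, data = clin)

# Estimate surrogate variables; the number of SVs is left for svaseq() to choose.
svobj <- svaseq(norm_counts, mod1, mod0)

# Append the surrogate variables to the design matrix.
# If none were found, svobj$sv is not a usable matrix, so keep mod1 as is.
if (svobj$n.sv > 0) {
  sv <- svobj$sv
  colnames(sv) <- paste0("SV", seq_len(ncol(sv)))
  des <- cbind(mod1, sv)
} else {
  des <- mod1
}
```
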
You can now proceed with des as your design matrix.
Hi Watson, thanks for the reply. I have a question about the next steps. Now that we have des as the design matrix:
disp = estimateDisp (???, design)
What should we use as data for estimateDisp?
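One way to approach this, as a hedged sketch rather than a definitive answer: in edgeR, estimateDisp() takes the DGEList of raw counts as its first argument (not the normalized matrix that was passed to svaseq()), and the design matrix goes in the design argument. The simulated counts and the stand-in des below are assumptions so that the block runs on its own; in practice des would be the design matrix that includes the surrogate variables.

```r
library(edgeR)

# Placeholder inputs (assumption): replace with your own counts and des.
set.seed(1)
counts <- matrix(rnbinom(1000 * 12, mu = 100, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))
group <- factor(rep(c("control", "treated"), each = 6))
des <- model.matrix(~ group)  # stand-in for the design with SV columns appended

# Build the DGEList from raw counts and compute normalization factors.
dge <- DGEList(counts = counts)
dge <- calcNormFactors(dge)

# Dispersion estimation uses the DGEList plus the design matrix.
dge <- estimateDisp(dge, design = des)

# Fit the model and test the group coefficient (column 2 of des here).
fit <- glmQLFit(dge, design = des)
res <- glmQLFTest(fit, coef = 2)
topTags(res)
```

Because the surrogate variables are appended after the original columns, the coefficient of interest keeps its position in des, so coef = 2 would still refer to the group effect in that layout.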