Question

DESeq and Limma+Voom Normalization for Rna-Seq Data Using Ercc Spike-In

0

Entering edit mode

10.0 years ago

komal.rathi ★ 4.1k

Hello everyone,

I have some RNASeq data, that has been spiked-in with exogenous ERCC controls. I have the counts from htseq-count and now I want to normalize + test for differential expression. For this, I am using two methods: DESeq and limma+voom. Below is my code for both the tests. Can someone tell me if what I am doing is correct or not? I do think that I am slightly mistaken in the voom code.

##### Using DESeq
library(DESeq)
# get the size factors from the spiked-ins
# dat.spike.in is count data of only spike ins
cds <- newCountDataSet(countData = dat.spike.in, conditions = as.character(condition))
cds <- estimateSizeFactors(cds)
nf <- sizeFactors(cds) 

# add the size factors to data without spike-ins
# dat.without.spike.in is count data where spike-ins are removed
cds <- newCountDataSet(countData = dat.without.spike.in, conditions = as.character(condition))
sizeFactors(cds) <- nf 

# differential expression test
cds <- estimateDispersions( cds, method= 'pooled', sharingMode='maximum', fitType='local') 
deseq.res <- nbinomTest( cds, "A", "B")

##### Using Limma + Voom
library(limma)
library(edgeR)
groups = as.factor(condition) # conditions 
design <- model.matrix(~0+groups)
colnames(design) = levels(groups)
nf <- calcNormFactors(dat.spike.in) # calculation norm factors using spike ins only
voom.norm <- voom(dat.without.spike.in, design, lib.size = colSums(dat.spike.in) * nf) # add that to voom

Awaiting your input and suggestions!

Thanks!

voom RNA-Seq spike-in deseq limma • 7.9k views

ADD COMMENT • link updated 3.3 years ago by Ram 45k • written 10.0 years ago by komal.rathi ★ 4.1k

1

Entering edit mode

I can at least say that the DESeq code is correct (though you should be using DESeq2 instead). It's been long enough since I've used limma/voom that I'll wait for someone else to comment.

ADD REPLY • link 10.0 years ago by Devon Ryan 105k

0

Entering edit mode

Thank you, I still don't know if this is the best way i.e. normalizing with ERCC spike-ins. But at least this gives me some starting point.

ADD REPLY • link updated 6.0 years ago by Ram 45k • written 10.0 years ago by komal.rathi ★ 4.1k

0

Entering edit mode

In general, I would only normalize against the spike-ins if absolutely needed (there are definite cases where this is needed). They're generally less robust.

ADD REPLY • link 10.0 years ago by Devon Ryan 105k

1

Entering edit mode

I am using Limma and Voom and I think I also have similar workflow like yours.

ADD REPLY • link 10.0 years ago by bharata1803 ▴ 580

0

Entering edit mode

Thank you! Do you know of any reference papers that have used the same?

ADD REPLY • link updated 6.0 years ago by Ram 45k • written 10.0 years ago by komal.rathi ★ 4.1k

0

Entering edit mode

Why calcNormFactors only use dat.spike.in?

ADD REPLY • link updated 6.0 years ago by Ram 45k • written 10.0 years ago by Zhilong Jia ★ 2.2k

0

Entering edit mode

Because the ERCC spike-ins are being used for normalization. This is needed in cases like single-cell sequencing or whenever else there might be transcriptional amplification.

ADD REPLY • link 10.0 years ago by Devon Ryan 105k

score 4 · Accepted Answer · 2015-11-20

4

Entering edit mode

10.0 years ago

Gordon Smyth ★ 8.5k

https://support.bioconductor.org/p/74870/

ADD COMMENT • link 10.0 years ago by Gordon Smyth ★ 8.5k