DESeq and Limma+Voom Normalization for Rna-Seq Data Using Ercc Spike-In
1
0
Entering edit mode
9.1 years ago
komal.rathi ★ 4.1k

Hello everyone,

I have some RNASeq data, that has been spiked-in with exogenous ERCC controls. I have the counts from htseq-count and now I want to normalize + test for differential expression. For this, I am using two methods: DESeq and limma+voom. Below is my code for both the tests. Can someone tell me if what I am doing is correct or not? I do think that I am slightly mistaken in the voom code.

##### Using DESeq
library(DESeq)
# get the size factors from the spiked-ins
# dat.spike.in is count data of only spike ins
cds <- newCountDataSet(countData = dat.spike.in, conditions = as.character(condition))
cds <- estimateSizeFactors(cds)
nf <- sizeFactors(cds) 

# add the size factors to data without spike-ins
# dat.without.spike.in is count data where spike-ins are removed
cds <- newCountDataSet(countData = dat.without.spike.in, conditions = as.character(condition))
sizeFactors(cds) <- nf 

# differential expression test
cds <- estimateDispersions( cds, method= 'pooled', sharingMode='maximum', fitType='local') 
deseq.res <- nbinomTest( cds, "A", "B")

##### Using Limma + Voom
library(limma)
library(edgeR)
groups = as.factor(condition) # conditions 
design <- model.matrix(~0+groups)
colnames(design) = levels(groups)
nf <- calcNormFactors(dat.spike.in) # calculation norm factors using spike ins only
voom.norm <- voom(dat.without.spike.in, design, lib.size = colSums(dat.spike.in) * nf) # add that to voom

Awaiting your input and suggestions!

Thanks!

voom RNA-Seq spike-in deseq limma • 6.8k views
ADD COMMENT
1
Entering edit mode

I can at least say that the DESeq code is correct (though you should be using DESeq2 instead). It's been long enough since I've used limma/voom that I'll wait for someone else to comment.

ADD REPLY
0
Entering edit mode

Thank you, I still don't know if this is the best way i.e. normalizing with ERCC spike-ins. But at least this gives me some starting point.

ADD REPLY
0
Entering edit mode

In general, I would only normalize against the spike-ins if absolutely needed (there are definite cases where this is needed). They're generally less robust.

ADD REPLY
1
Entering edit mode

I am using Limma and Voom and I think I also have similar workflow like yours.

ADD REPLY
0
Entering edit mode

Thank you! Do you know of any reference papers that have used the same?

ADD REPLY
0
Entering edit mode

Why calcNormFactors only use dat.spike.in?

ADD REPLY
0
Entering edit mode

Because the ERCC spike-ins are being used for normalization. This is needed in cases like single-cell sequencing or whenever else there might be transcriptional amplification.

ADD REPLY
4
Entering edit mode
ADD COMMENT

Login before adding your answer.

Traffic: 4332 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6