18 months ago
Assa Yeroslaviz
Hi,
I would like to make sure I understand how to handle a bulk RNA-seq data set that contains ERCC spike-ins.
After adding the ERCC sequences to the genome in question (both the FASTA and the GTF), I can map the FASTQ files as usual and then quantify the resulting BAM files with, e.g., featureCounts to get my raw count table.
To get the size factors from the ERCCs, do I need to subset the count table to only the ERCC "genes" and then calculate them separately?
Or can I simply use the following?
dds <- estimateSizeFactors(dds, controlGenes= <names or numeric index of my ERCC features> )
dds <- DESeq(dds, ...)
Thanks
It's the same, see https://github.com/mikelove/DESeq2/blob/devel/R/core.R#L559-L577
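If you want to convince yourself, here is a minimal sketch that compares the two routes on simulated data. The data set comes from makeExampleDESeqDataSet(), and treating rows 1 to 50 as the "ERCC" features is purely an assumption for illustration; with real data you would pass the names or indices of your actual spike-in rows.

```r
library(DESeq2)

set.seed(1)
# Simulated counts; rows 1:50 stand in for ERCC features (illustrative assumption).
dds <- makeExampleDESeqDataSet(n = 1000, m = 6)
ercc_idx <- 1:50

# Route 1: let estimateSizeFactors() restrict itself to the control features.
dds_control <- estimateSizeFactors(dds, controlGenes = ercc_idx)
sf_control <- sizeFactors(dds_control)

# Route 2: subset to the controls first, then estimate size factors on the subset.
dds_subset <- estimateSizeFactors(dds[ercc_idx, ])
sf_subset <- sizeFactors(dds_subset)

# The two sets of size factors should agree.
all.equal(sf_control, sf_subset)
```

If you go with the manual route, the size factors still have to be assigned back to the full object, e.g. with sizeFactors(dds) <- sf_subset, before calling DESeq().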
controlGenes internally subsets the dds object to the controls and derives the size factors from that subset. As a side note, you can do the same in edgeR: calculate the TMM factors on any subset of genes and then put them back into the main DGEList. You just have to do it manually, whereas DESeq2 offers the convenience argument controlGenes.
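For the edgeR side, the manual route could look roughly like the sketch below. The count matrix is simulated and the choice of rows 1 to 50 as spike-ins is a made-up placeholder; the subset keeps the original library sizes so that the TMM factors are computed relative to the full libraries. Take it as one possible way to do it rather than an official recipe.

```r
library(edgeR)

set.seed(1)
# Simulated count matrix; rows 1:50 stand in for ERCC features (illustrative assumption).
counts <- matrix(rnbinom(1000 * 4, mu = 50, size = 5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:4)))
ercc_idx <- 1:50

y <- DGEList(counts = counts)

# Subset to the controls, keeping the library sizes of the full data set.
y_ctrl <- y[ercc_idx, , keep.lib.sizes = TRUE]

# TMM factors computed on the control features only.
y_ctrl <- calcNormFactors(y_ctrl, method = "TMM")

# Put the factors back into the main DGEList.
y$samples$norm.factors <- y_ctrl$samples$norm.factors
y$samples
```

From here the usual edgeR workflow continues on y, which now carries the spike-in-derived normalization factors.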