Hi,
I am having some difficulty with data transformations of my single-cell RNA-Seq data, which I analyzed using DESeq2. This data set is different from typical RNA-Seq experiments. For example, there is a subset of genes that are present in one group and totally absent in the other, unlike typical data sets where downregulated genes are still expressed at a lower level. The variation between replicates is also high, so we have at least ten replicates for each condition. Even with all these limitations, I am able to get a meaningful result from this analysis.
But the rlog transformation is not optimal for my analysis. I get a warning that more than 10% of the genes have outliers, and it suggests using vst instead. The vst runs without any warning, but I am still not sure whether it is optimal, or whether there is a better transformation for making heat maps and PCA plots.
Also, is there a way to extract the normalized values without any transformation? DESeq used to output this, but this function is not in the DESeq2 vignette.
Thanks for your help!
You can get the normalized counts with the function: counts(dds, normalized=TRUE)
All the other values computed by DESeq2 can be accessed with: mcols(dds)
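A minimal sketch of both calls, assuming DESeq2 is installed. The simulated data set from makeExampleDESeqDataSet() is only a stand-in for your own dds, and the names norm_counts and gene_info are made up for illustration:

```r
library(DESeq2)

# Simulated stand-in for a real data set: 1000 genes, 12 samples
dds <- makeExampleDESeqDataSet(n = 1000, m = 12)

# Size factors must be estimated before normalized counts can be requested
dds <- estimateSizeFactors(dds)

# Raw counts divided by the per-sample size factors; no log or
# variance-stabilizing transformation is applied
norm_counts <- counts(dds, normalized = TRUE)

# Run the full pipeline so that per-gene values (baseMean, dispersion
# estimates, ...) are stored alongside the counts
dds <- DESeq(dds)
gene_info <- mcols(dds)

head(norm_counts)
colnames(gene_info)
```

Note that these normalized counts are on the original count scale, so for heat maps and PCA a transformed version (such as the vst output) is usually still the better input.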
When you run the analysis, genes that are present in one group but not in the other will get NA in the dataset. This could generate warnings later. Remove the rows that contain NA within padj with:
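The snippet that followed is not included here. A minimal sketch of that filtering step in base R, using a small made-up data frame in place of a real results table (the gene names and values are invented for illustration):

```r
# Hypothetical results table; a real one would come from the DESeq2 analysis
res <- data.frame(
  log2FoldChange = c(2.1, -0.4, 5.3, 0.0),
  padj = c(0.001, NA, 0.03, NA),
  row.names = c("geneA", "geneB", "geneC", "geneD")
)

# Keep only the rows whose adjusted p-value is not NA
res_clean <- res[!is.na(res$padj), ]
res_clean
```

The same row subsetting should also work on the object returned by results(dds).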