I am running DESeq on RNAseq data. I have two conditions, three replicates for each.
I would like to find the top 100 differentially expressed genes and look at them in a heat map.
I have done estimateDispersions with the default settings and then did nbinomTest on the resulting data. I have a list of genes with significant p-values that I was thinking were my genes of interest. However, when I look at the DESeq vignette, I see that to look at the data in the heatmap they used estimateDispersions with method="blind".
My question is, should I be running nbinomTest with data from estimateDispersions with method "pooled", "blind", or another option? Is it fair to use the default settings, choose my top 100 genes, and THEN get variance stabilized data and look at the genes in the heat map?
Thanks for any insight you may have.
Thanks for your answer. I've used both for DE testing and I seem to get different statistically significant genes between the methods. Is it reasonable to choose my genes, then do the blind dispersion, VST and heatmap? Or should the blind dispersion method determine which genes I'm calling of interest?
Definitely use pooled dispersions for the actual DE testing; you'll get more reliable results that way. For the VST, you could use either pooled or blind dispersions. The transformed values, and so the heatmap, will differ somewhat because the variance estimates for those genes will be a bit different, but the overall look of the heatmap will likely be pretty similar.
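To make the difference concrete, here is a minimal sketch in plain Python (not R, and not DESeq's actual estimator, which fits negative binomial dispersions along a mean-dispersion trend). It only compares a within-condition pooled variance against a "blind" variance that ignores the condition labels. The function names and the counts are made up for illustration.

```python
from statistics import variance


def pooled_variance(groups):
    """Within-condition variance, pooled across conditions by degrees of freedom."""
    dof = sum(len(g) - 1 for g in groups)
    return sum((len(g) - 1) * variance(g) for g in groups) / dof


def blind_variance(groups):
    """Variance across all samples, treating them as replicates of one condition."""
    return variance([x for g in groups for x in g])


# Hypothetical normalized counts: two conditions, three replicates each.
unchanged_gene = [[100, 110, 90], [105, 95, 100]]
changed_gene = [[100, 110, 90], [400, 420, 380]]

for name, gene in [("unchanged", unchanged_gene), ("changed", changed_gene)]:
    print(name, "pooled:", pooled_variance(gene), "blind:", blind_variance(gene))
```

For the unchanged gene the two estimates are of similar size. For the changed gene the blind estimate also absorbs the difference between the conditions, so it is far larger than the pooled one. That is the intuition behind testing with pooled dispersions: a blind estimate makes truly changed genes look noisier than they are.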
Thanks so much, this makes sense; they do look fairly similar.
Another follow-up question: is the "blind" method designed mostly for experiments in which there are no replicates? Or what is its purpose?
Yup, mostly for experiments lacking replicates and also for heatmaps and the like.
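As a rough illustration of the no-replicate case, here is a small plain-Python sketch (again not DESeq itself; the sample names and counts are hypothetical). With one sample per condition there is nothing to estimate a within-condition variance from, while a blind estimate that ignores the labels can still be computed.

```python
from statistics import StatisticsError, variance

# Hypothetical normalized counts for one gene, one sample per condition.
samples = {"untreated": [100], "treated": [130]}


def within_condition_variances(groups):
    """Per-condition sample variance, or None where there are too few samples."""
    result = {}
    for name, values in groups.items():
        try:
            result[name] = variance(values)
        except StatisticsError:  # raised when fewer than two data points
            result[name] = None
    return result


def blind_variance(groups):
    """Variance across all samples, ignoring the condition labels."""
    return variance([x for values in groups.values() for x in values])


print(within_condition_variances(samples))
print(blind_variance(samples))
```

The per-condition estimates come back as None, whereas the blind estimate is defined. The price is that any real difference between the conditions is counted as noise.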