I'm looking into the differentially expressed genes in heat-stressed animals versus normal ones. I have a data matrix of 40 samples from different conditions, and I use contrasts to compare each condition to its control.
I want to look at the heat-stressed group; the other 3 groups are non-stressed, old versus young.

    dds <- DESeqDataSetFromMatrix(countData = data, colData = samples, design = ~ 0 + condition)
    dds_lrt <- DESeq(dds, test = "LRT", reduced = ~ 1)

and in my DESeq2 contrasts:

    stress <- results(dds_lrt, contrast = c("condition", "heat-stressed", "young"))
    ageing <- results(dds_lrt, contrast = c("condition", "5.months.old", "young"))

With this I got hundreds of significant genes. Is this the right way to do it? The volcano plot also looks weird: ![enter image description here][1]
However, if I use the Wald test:

    dds_lrt <- DESeq(dds, test = "Wald")

and in my DESeq2 contrasts:

    stress <- results(dds_lrt, contrast = c("condition", "heat-stressed", "young"))
    ageing <- results(dds_lrt, contrast = c("condition", "5.months.old", "young"))

I get far fewer significant genes, but the volcano plot looks realistic.
Could you explain what the mistake is here?
Can you explain exactly why you are doing LRT as opposed to Wald?
I wasn't sure which one to use. I usually use the LRT, but I found that the two tests give hugely different results, especially in the padj values, and I'm not sure which is the correct one to use.
If you don't understand the difference, you should stick to what the vignette and other tutorials use in your situation.
Adding to that, using `~0+...` is non-standard in DESeq2, and I am unsure how this plays with the LRT in combination with the reduced model. My recommendation would likewise be to stick strictly to the vignette unless you have the expert knowledge to deviate from it. The models an LRT compares should be nested; here you compare two different models, which could be very different.
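
To make the model comparison concrete, here is a minimal base-R sketch (no DESeq2 needed) of the design matrices involved. The sample layout is an assumption: the post only says there are 40 samples in 4 groups, so 10 replicates per group and the fourth level name `old` are hypothetical. It only counts the coefficients in each design, to show how many are tested at once when the reduced model is `~ 1`.

```r
# Hypothetical sample table: 4 groups x 10 replicates (assumed layout;
# the level "old" is made up, the other three names come from the post).
condition <- factor(rep(c("young", "heat-stressed", "5.months.old", "old"),
                        each = 10))
condition <- relevel(condition, ref = "young")  # make "young" the reference
samples <- data.frame(condition = condition)

# Full design as used in the question: one coefficient per group, no intercept.
full_no_int <- model.matrix(~ 0 + condition, data = samples)

# Full design in the form the vignette uses: intercept + differences vs "young".
full_std <- model.matrix(~ condition, data = samples)

# Reduced design passed to the LRT: intercept only.
reduced <- model.matrix(~ 1, data = samples)

# Number of coefficients the LRT drops when going from full to reduced.
lrt_df <- ncol(full_no_int) - ncol(reduced)

print(colnames(full_no_int))
print(colnames(full_std))
print(lrt_df)
```

With a reduced model of `~ 1`, the LRT removes all the condition effects together, so its p-value answers "does this gene differ between any of the groups?". As the DESeq2 vignette notes, passing `contrast` to `results()` on an LRT fit only changes the reported log2 fold change, not the p-value, which would explain why both contrasts return the same large set of significant genes and an odd-looking volcano plot. The Wald test, in contrast, tests the single requested comparison.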