I have a metagenomic dataset crossing three time points from which I have mined CAZymes and am using DESeq2 to identify differentially abundant CAZymes from using trimmed mean depth generated my CoverM (very similar to Q2Q3 contig coverage). From this I am see results I do not understand in the form of enzymes that look differentially abundant but DESeq2 says the are not. below is an example. I would appriciate anyone who could enlighten me as to why this is the case.
When looking at a relative abundance* heatmap of a subset of the CAZyme data with those marked as differentially abundant across any time point I notice on the far right several enzymes which appear to the naked eye to be differentially abundant. Endo−1,4−beta−mannanase for example on the far right
When exploring the DESeq2 output for that given enzyme I see the following:
#Between time point 1 and 2
PreVsP_df[rownames(PreVsP_df) == "3.2.1.78",]
baseMean log2FoldChange lfcSE stat pvalue padj
3.2.1.78 0.5565868 0.2389833 0.8940182 0.2673137 0.7892277 0.8563372
#Between time point 1 and 3
PreVsF_df[rownames(PreVsF_df) == "3.2.1.78",]
baseMean log2FoldChange lfcSE stat pvalue padj
3.2.1.78 0.5565868 -1.132044 0.9063071 -1.249074 0.2116381 0.3301718
As we see here there is no significant difference. However when I explore relative abundance values of this enzyme individually I do not understand why this is the case. Pre = time point 1, Post = time point 2 and Field = time point 3:
Can you show an MA-plot? Use the
plotMA()
function. The baseMean looks very low, that might be an issue here.Apologies for delayed response. Here is the plotMA for the whole DESeq2 results:
Here is the plotMA results
results(dds_cazyme, contrast = c("Timepoint", "PRE", "POST"))
. That being timepoint 1 and 2:here is also the
plotCounts()
of the gene:It seems to be low counts and quite some variability, so not enough that DESeq2 considers this as differential. By the way, I have no idea why this MA-plot looks so weird and asymmetric, but I recall mutliple threads in which the DESeq2 developer expressed serious concerns on even using DESeq2 for abundance analysis as he (afair) doubts that this is a good fit for a negative binomial that DESeq2 builds on.