5.1 years ago
ariel
I've seen other folks use DESeq2 for differential abundance analysis after pathway annotation with PICRUSt2, Piphillin, Tax4Fun, etc.
However, when I look at my KO (KEGG Ortholog) abundances or mapped pathway abundances, they definitely do NOT follow a negative binomial distribution, not even a zero-inflated one (ZINB), and the negative binomial is the model DESeq2 is based upon.
So, I'm curious what others feel about DESeq2 being the right approach, and/or if there are other normalization/analysis algorithms that may be superior.
As a note, I found that with both PICRUSt2 and Piphillin my abundance distributions tend to be bimodal.
You won't find a consensus. DESeq2 assumes a negative binomial distribution, which should be correct for metagenomic read counts, but I don't know how PICRUSt2 etc. compute pathway abundance. The problem with DESeq2 (and all other RNA-seq analysis software) is that its normalization assumes most rows (features) do not differ between samples, which is usually incorrect for metagenomics.
Your best shot is using a simple non-parametric test, although its power will be low.
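For a two-group comparison of a single pathway, the non-parametric route could look roughly like the sketch below. It is a hand-rolled Mann-Whitney U test using midranks and a normal approximation with no tie or continuity correction, so treat the p-value as approximate. The function name and the toy abundances are made up for illustration.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (midranks, normal approximation,
    no tie or continuity correction). Returns (U for x, p-value)."""
    pooled = sorted((v, i) for i, v in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        mid = (i + j) / 2.0 + 1.0  # average 1-based rank of the tied run
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = mid
        i = j + 1
    n1, n2 = len(x), len(y)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return u1, p

# Hypothetical pathway abundances in two groups of samples.
control = [12.0, 0.0, 8.5, 15.2, 3.1]
treated = [40.3, 22.8, 0.0, 55.0, 31.7]
u_stat, p_value = mann_whitney_u(control, treated)
```

Run once per pathway, this also needs a multiple-testing correction across pathways, and it cannot adjust for covariates.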
Thing is, we have a lot of important covariates, so Mann-Whitney or Kruskal-Wallis won't work :(
Maybe a logistic regression for the presence of a pathway?
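A rough sketch of what that could look like: turn each pathway into present/absent and regress that on group plus a covariate. This is a bare-bones logistic regression fitted by gradient ascent, written out only to show the idea. The function names, the presence threshold (abundance > 0), and the toy data are all assumptions, and a real analysis would use a proper GLM routine that also reports standard errors.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit P(y = 1) = sigmoid(X b) by gradient ascent on the mean
    log-likelihood. X is a list of rows, each starting with 1.0."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            err = yi - sigmoid(sum(bj * xj for bj, xj in zip(b, xi)))
            for j in range(p):
                grad[j] += err * xi[j]
        b = [bj + lr * gj / n for bj, gj in zip(b, grad)]
    return b

# Hypothetical data: one pathway, eight samples.
abundance = [0.0, 0.0, 4.2, 0.0, 7.9, 3.3, 0.0, 12.1]
group = [0, 0, 0, 0, 1, 1, 1, 1]
scaled_age = [-1.0, 0.5, 0.2, -0.3, 0.8, -0.6, 0.1, 1.2]  # centered and scaled covariate

present = [1 if a > 0 else 0 for a in abundance]
X = [[1.0, float(g), c] for g, c in zip(group, scaled_age)]
intercept, group_log_odds, age_log_odds = fit_logistic(X, present)
```

The coefficient on `group` is the log odds ratio of the pathway being present, adjusted for the covariate. Covariates should be on a similar scale for the fixed step size to behave.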
I like that idea, but it would sacrifice the abundance information. I'm going to try linear regression on the log abundances and see what happens. The log distributions don't look "too" bad.
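For reference, here is a minimal sketch of that regression for one pathway: log(abundance + pseudocount) regressed on group and one covariate by ordinary least squares. The pseudocount of 1, the function names, and the toy numbers are assumptions for illustration only. It returns coefficients but no standard errors or p-values.

```python
import math

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [sum(X[k][i] * y[k] for k in range(n))] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def log_abundance_fit(abundance, group, covariate, pseudocount=1.0):
    """Regress log(abundance + pseudocount) on an intercept, group, and covariate."""
    y = [math.log(a + pseudocount) for a in abundance]
    X = [[1.0, float(g), float(c)] for g, c in zip(group, covariate)]
    return ols(X, y)

# Hypothetical data: one pathway, six samples, two groups, age as covariate.
abundance = [12.0, 30.5, 8.2, 95.1, 140.3, 60.7]
group = [0, 0, 0, 1, 1, 1]
age = [34, 51, 29, 40, 62, 33]
intercept, group_effect, age_effect = log_abundance_fit(abundance, group, age)
```

Here `group_effect` is the covariate-adjusted difference in mean log abundance between groups. The choice of pseudocount matters when there are many zeros, so it is worth checking that conclusions do not change with it.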