I have a matrix of species abundance per sample (across 49 samples) from a metagenomic dataset. Could this be used in DESeq2 to test for differential abundance of bins between samples?
DESeq2 can in theory be used to analyse any dataset that compares two or more sets of counts from an experiment, where the counts measure something that also varies biologically between replicates. I don't know whether there are particular biases to account for in metagenomic analysis, but I'm pretty sure people have used DESeq for this.
The abundances were calculated by taking all scaffolds in a bin and looking at their abundance in a sample. I was given this abundance matrix, so I'm not 100% sure, but it was corrected against a normalized depth, which I think is where the non-integers come from. I have just multiplied the whole matrix by 100 to remove the sub-1 numbers and then taken the integer part. However, this does not feel correct. Anyway, after looking into it, I am not sure I can satisfy DESeq's requirement that not every gene contains at least one zero.
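To make the zero problem concrete, here is a rough Python sketch of a median-of-ratios size factor calculation, which is the kind of normalization the zero requirement comes from. It is an illustration only, not DESeq2's actual code: the function name `median_of_ratios_size_factors` and the toy matrices are made up for this example. Rows with a zero have no finite log geometric mean, so they are skipped, and if every row has a zero there is nothing left to normalize against.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Hypothetical helper: counts is a list of rows (one per bin/species),
    each row a list of per-sample integer counts."""
    # Only rows with no zero in any sample have a finite log geometric mean.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("every row contains at least one zero; "
                         "cannot compute log geometric means")

    log_ratios = []
    for row in usable:
        logs = [math.log(c) for c in row]
        log_geo_mean = sum(logs) / len(logs)
        # Log of (count / geometric mean of the row) for each sample.
        log_ratios.append([x - log_geo_mean for x in logs])

    n_samples = len(counts[0])
    # Size factor per sample: median of the ratios over the usable rows.
    return [math.exp(median([r[j] for r in log_ratios]))
            for j in range(n_samples)]


# Made-up toy data: the last row has a zero and is skipped.
toy = [[10, 20, 40],
       [5, 10, 20],
       [0, 3, 6]]
size_factors = median_of_ratios_size_factors(toy)

# Made-up toy data where every row has a zero: the calculation fails.
all_rows_have_zero = [[0, 2, 4],
                      [3, 0, 5]]
try:
    median_of_ratios_size_factors(all_rows_have_zero)
    failed = False
except ValueError:
    failed = True
```

With sparse metagenomic bins it is quite plausible that no row is free of zeros, which is why this check matters more here than in a typical RNA-seq table.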
I'm new to RNA-seq and have yet to fully understand the statistics behind DESeq myself. But I could imagine that the prerequisite assumptions about RNA datasets might not hold true for your situation (e.g. are you certain that more than 50% of species have identical abundance between conditions?).
Also, how did you calculate abundance? You mentioned non-integer data, so is it percentages? The absolute size of the numbers is actually relevant. A table with every entry multiplied by 100 will yield a different result than a table with every entry multiplied by 1,000.
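A small Python sketch of why the scale factor matters, using made-up abundance values and a hypothetical `to_counts` helper that mirrors the multiply-then-truncate step described above. Two things change with the multiplier: which entries collapse to zero, and how precise a count-based model will assume each value is. The relative noise shown is the simple Poisson figure of 1/sqrt(count), used here as a stand-in for the count-dependent part of the variance; it is not DESeq2's full model.

```python
import math

# Made-up normalized abundances for four bins in one sample.
abundances = [0.0045, 0.0125, 0.5, 2.5]


def to_counts(values, scale):
    """Hypothetical helper: multiply by a scale factor, keep the integer part."""
    return [int(v * scale) for v in values]


def relative_poisson_noise(count):
    """Standard deviation divided by mean for a Poisson count (1 / sqrt(count))."""
    return 1.0 / math.sqrt(count)


counts_100 = to_counts(abundances, 100)
counts_1000 = to_counts(abundances, 1000)

# The same underlying abundance (0.5) becomes 50 or 500 depending on the scale,
# and a count model treats the larger number as a more precise measurement.
noise_at_100 = relative_poisson_noise(counts_100[2])
noise_at_1000 = relative_poisson_noise(counts_1000[2])

print(counts_100, counts_1000)
print(round(noise_at_100, 3), round(noise_at_1000, 3))
```

So the choice of multiplier is not neutral: a larger one makes every difference look better supported, and a smaller one throws away low-abundance bins as zeros.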