I am using DESeq2 on metagenomic abundance data that I generated by mapping reads back to the assembly they produced and then computing, per contig:
abundance = 1000000 * trimmed mean depth / ((contig length - 75*2) * total sample depth)

where the trimmed mean is the mean depth computed after discarding the upper and lower 5% of values, to control for extremes (and 75*2 trims one read length, 75 bp, from each contig end).
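To make the formula concrete, here is a minimal Python sketch of that calculation as I have described it; the function names, the grouping of terms, and the 75 bp read length are my own reading of the formula, not anything DESeq2-specific:

```python
import numpy as np

def trimmed_mean(per_base_depth, trim=0.05):
    # Mean coverage after discarding the lowest and highest 5% of values.
    d = np.sort(np.asarray(per_base_depth, dtype=float))
    k = int(len(d) * trim)
    return d[k:len(d) - k].mean() if len(d) > 2 * k else d.mean()

def abundance(tm_depth, contig_length, total_sample_depth, read_len=75):
    # 1e6 * trimmed mean depth / ((contig length - 2 * read length) * total sample depth)
    return 1e6 * tm_depth / ((contig_length - 2 * read_len) * total_sample_depth)
```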
DESeq2 asks for un-normalized counts; as I understand it, it estimates per-sample size factors (median-of-ratios, rather than simple totals) to control for differing sequencing depth across samples. Is that right? If so, should I drop my own normalization by total sample depth? Or would I be better off passing the trimmed mean depth straight into DESeq2?
One more detail: because this is metagenomic data, the abundances end up very small, so I have scaled them all by 1000 before rounding to integers and passing them to DESeq2.
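In other words, the scale-and-round step I am describing is just the following (the abundance values here are made up for illustration):

```python
import numpy as np

# Hypothetical per-contig abundances (rows = contigs, columns = samples).
abund = np.array([[0.012, 0.001],
                  [0.004, 0.097]])

# Scale by 1000, then round to the integer counts DESeq2 expects.
counts = np.rint(abund * 1000).astype(int)
```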
Would it be safe to assume that if one pre-normalizes, DESeq2's own normalization should result in little to no change?
Given all that, would you therefore suggest just inputting the trimmed mean depth, rounded to an integer, as pseudo-count input data?