DESeq2 on metagenomic gene abundance data, should I normalize for total sample depth?
2.8 years ago

I am using DESeq2 on metagenomic abundance data that I have generated by mapping reads back to the assembly they created and then doing the following:

    1e6 * trimmed mean depth / ((contig length - 75*2) * total sample depth)

Trimmed mean here is the mean computed after discarding the upper and lower 5% of values, to control for extreme values.
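To make the formula concrete, here is a base-R sketch with toy numbers (all values are hypothetical), assuming the expression means dividing the per-million trimmed mean depth by both the edge-corrected contig length and the total sample depth:

```r
# Toy per-base coverage along one contig (hypothetical values, with two outliers)
depths <- c(rep(10, 18), 200, 1)
contig_length <- 5000
total_sample_depth <- 2e6          # e.g. total mapped reads in the sample

# mean(x, trim = 0.05) drops the lowest and highest 5% of values before averaging
trimmed_mean_depth <- mean(depths, trim = 0.05)   # outliers 1 and 200 removed -> 10

# normalization as described above: 75*2 corrects for read-length edge effects
abundance <- 1e6 * trimmed_mean_depth /
  ((contig_length - 75 * 2) * total_sample_depth)
```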

DESeq2 asks for un-normalized data because, as I understand it, it estimates its own per-sample size factors to control for differing depths across samples.

Should I therefore remove the normalization by total sample depth?

Or would I be better off just putting the trimmed mean depth straight into DESeq2?

Deseq2 metagenome R

The statistical method will normalize the data itself and assumes that you input original (raw) counts.

Thus, in my opinion, pre-normalizing will likely make results less reliable as it violates the assumptions that the method relies on.
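For reference, the normalization DESeq2 applies internally is the median-of-ratios method. A minimal base-R sketch on a toy count matrix (hypothetical numbers) shows how the per-sample size factors are derived from raw counts:

```r
# Toy raw-count matrix: 3 genes x 3 samples, samples differ in depth by 1x/2x/3x
counts <- matrix(c(10, 20, 30,
                   20, 40, 60,
                    5, 10, 15),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(paste0("gene", 1:3), paste0("sample", 1:3)))

# Median-of-ratios: geometric mean per gene across samples, then per-sample
# median of the count/geometric-mean ratios (computed on the log scale)
log_geomeans <- rowMeans(log(counts))
size_factors <- apply(counts, 2, function(col)
  exp(median(log(col) - log_geomeans)))

size_factors   # reflect the 1x/2x/3x depth differences between samples
```

This is why the method expects raw counts: pre-normalized values would distort the ratios these size factors are estimated from.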


To add: given that it is metagenomic data, the abundances end up being very small, so I have scaled them all by 1000 before rounding the values to integers and passing them to DESeq2.

Would it be safe to assume that, if one pre-normalizes, DESeq2's own normalization should result in little to no change?

Given:

    "Thus, in my opinion, pre-normalizing will likely make results less reliable as it violates the assumptions that the method relies on."

Would you therefore suggest simply inputting the trimmed mean depth, rounded to an integer, as pseudo-count input data?
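On the question of whether pre-normalizing leaves DESeq2's normalization with little to do: a toy base-R sketch (hypothetical numbers, using the same median-of-ratios calculation DESeq2 performs internally) suggests the size factors do collapse to ~1 after pre-dividing each sample by its depth, but the values are then no longer raw counts, so the negative-binomial error model no longer matches the data:

```r
# Toy raw-count matrix: 3 genes x 3 samples with unequal sequencing depths
counts <- matrix(c(10, 20, 30,
                   20, 40, 60,
                    5, 10, 15), nrow = 3, byrow = TRUE)

depth <- colSums(counts)                  # per-sample total depth
pre_norm <- sweep(counts, 2, depth, "/")  # pre-normalized "counts"

# Median-of-ratios size factors, as DESeq2 computes them
sf <- function(m) {
  lg <- rowMeans(log(m))
  apply(m, 2, function(col) exp(median(log(col) - lg)))
}

sf(counts)    # unequal: reflects the depth differences between samples
sf(pre_norm)  # all ~1: nothing left for DESeq2's normalization to correct
```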
