6.4 years ago
valerie
Hello biostars!
I'm trying to produce an OTU table from viral WMS libraries. The objective is to count the reads mapping to clustered contigs and use the normalized counts as the abundance value of each cluster. To do so, I will have to normalize by contig length, and I was wondering whether there are any tools that could help with this part.
This article uses RPKM (https://doi.org/10.3390/v16040590) and this one uses TPM (https://doi.org/10.3390/d15070813), but I am not sure how to normalize WGS data correctly. There are plenty of answers here on how to do it for RNA-seq, but for DNA-sequenced samples I did not see any unambiguous answer.
From https://doi.org/10.7717/peerj.3817: "Thus, as long as the count matrices were normalized to account for different contig lengths and library sizes, each of the five methods tested here provided reliable estimates of alpha and beta diversity."
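To make the two options concrete, here is a minimal sketch of how I understand RPKM and TPM would be computed for one sample. It assumes I already have raw read counts per contig and the contig lengths in bp; the function names and the example contigs are hypothetical, and the library size is taken as the total number of reads mapped to the contigs, which may not be the right choice for every dataset.

```python
def rpkm(counts, lengths):
    """Reads per kilobase of contig per million mapped reads.

    counts  -- dict: contig name -> raw mapped read count (one sample)
    lengths -- dict: contig name -> contig length in bp
    Assumption: library size = sum of the counts given here.
    """
    total = sum(counts.values())
    if total == 0:
        return {c: 0.0 for c in counts}
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return {c: counts[c] * 1e9 / (lengths[c] * total) for c in counts}


def tpm(counts, lengths):
    """Length-normalize first, then scale so the sample sums to 1e6."""
    rate = {c: counts[c] / lengths[c] for c in counts}
    denom = sum(rate.values())
    if denom == 0:
        return {c: 0.0 for c in counts}
    return {c: rate[c] * 1e6 / denom for c in counts}


# Hypothetical example: two contigs with the same coverage per bp.
counts = {"contig_A": 100, "contig_B": 300}
lengths = {"contig_A": 1000, "contig_B": 3000}

rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
```

In this toy case both contigs get the same value under either method, because contig_B has three times the reads but is also three times as long. The practical difference is that TPM values always sum to one million within a sample, while RPKM sums vary from sample to sample.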