Hi there community!
For some time I have been working on 16S rRNA gene survey data. For this type of analysis one can use rarefaction to bring every sample to the same sequencing depth. Comparing samples of different depths is sometimes likened to searching 1 square meter of Amazon jungle and 1 square kilometer of Mojave desert and then comparing OTUs, taxa, etc. Rarefaction is relatively easy to apply, as it is implemented in many software packages, such as QIIME and mothur.
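To make the idea concrete, here is a minimal sketch of rarefaction on per-sample count tables: reads are drawn without replacement until every sample has the depth of the shallowest one. The function name `rarefy`, the sample names and the OTU counts are all made up for illustration, and this is not the code QIIME or mothur actually run.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample a {taxon: count} table without replacement to a fixed depth."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("depth exceeds the number of reads in the sample")
    # Expand the counts into one label per read, then draw `depth` reads.
    pool = [taxon for taxon, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)
    picked = rng.sample(pool, depth)
    rarefied = {taxon: 0 for taxon in counts}
    for taxon in picked:
        rarefied[taxon] += 1
    return rarefied

# Hypothetical counts: a deeply sequenced sample and a shallow one.
samples = {
    "jungle": {"OTU_1": 500, "OTU_2": 300, "OTU_3": 200},
    "desert": {"OTU_1": 40, "OTU_2": 10, "OTU_3": 0},
}

# Rarefy every sample to the depth of the shallowest sample.
depth = min(sum(c.values()) for c in samples.values())
rarefied = {name: rarefy(c, depth) for name, c in samples.items()}
```

The shallowest sample comes through unchanged, while deeper samples lose reads, which is the usual trade-off of this approach.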
I now have a shotgun dataset: whole-genome sequencing of a microbiome. As a starting point I am using the Microbiome Helper SOP. For taxonomic assignment I use MetaPhlAn2. The MetaPhlAn2 wiki doesn't even mention rarefaction. Since this step might be crucial for comparative analyses, where I have two groups/categories, each containing around 30 samples, I want each sample to be as "standardized" as possible. Are there any approaches to rarefy WGS data? Is there a reason why it has not yet been implemented in, for example, MetaPhlAn2?
I'd be grateful for any insight, comments and suggestions.
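One way the question could be framed for shotgun data is to subsample the raw reads of every sample to the same number before taxonomic profiling. The sketch below does this with reservoir sampling over a plain-text, four-line-per-record FASTQ file. It is only an illustration under those assumptions: the function name `subsample_fastq` and the file paths are hypothetical, it is not part of MetaPhlAn2 or the Microbiome Helper SOP, and whether read-level subsampling is appropriate for marker-based profiling is exactly the open question here.

```python
import random

def subsample_fastq(in_path, out_path, n_reads, seed=0):
    """Keep a uniform random subset of at most n_reads records (reservoir sampling)."""
    rng = random.Random(seed)
    reservoir = []
    with open(in_path) as fh:
        i = 0
        while True:
            # Assumes an uncompressed FASTQ with exactly 4 lines per record.
            record = [fh.readline() for _ in range(4)]
            if not record[0]:
                break
            if i < n_reads:
                reservoir.append(record)
            else:
                # Replace a kept record with probability n_reads / (i + 1).
                j = rng.randint(0, i)
                if j < n_reads:
                    reservoir[j] = record
            i += 1
    with open(out_path, "w") as out:
        for record in reservoir:
            out.writelines(record)
    return len(reservoir)
```

A call such as `subsample_fastq("sample1.fastq", "sample1.sub.fastq", 1000000)` would be repeated for each sample with the same target depth. The kept records are held in memory and are not written in their original file order.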
Hi, did you find any solution to this problem? Any suggestions on how to compute diversity following MetaPhlAn2?
Thanks for this post, robert.kwapich. This is a critical step if you want to compare groups of samples that have been shotgun-metagenome sequenced! My initial instinct was to rarefy based on single-copy bacterial housekeeping genes or the eukaryotic contamination, but I don't want to reinvent the wheel if there is already a method available! Cheers!