Hi, I have a large 16S microbiome dataset of about 600 samples spanning 5 clinical groups and around 6000 OTUs. I have performed an NMDS analysis with vegan, and I would like to test whether the between-sample distances are associated with any of my metadata (library size, gender, etc.) using PERMANOVA (adonis).
I've calculated the pairwise Bray-Curtis distances with vegdist:
dist.dat <- vegdist(otuTable, method = "bray", na.rm = TRUE)
My first question is whether I should remove rare species before calculating the sample-wise distances. Most of the OTUs in my OTU table are very rare, although as I understand it, Bray-Curtis is fairly robust to this and is driven mostly by abundant species. I have included all OTUs so as not to impose an arbitrary abundance threshold. Could someone comment on whether this is acceptable?
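One way to check this empirically is to compute the distances with and without the rare OTUs and see how closely the two sets agree. Below is a minimal sketch on simulated counts. The matrix `otu_sim`, the 10% prevalence cutoff, and all simulation parameters are made up for illustration and are not taken from the dataset described here.

```r
library(vegan)

set.seed(1)
n_samples  <- 30
n_abundant <- 20    # OTUs present in most samples at high counts
n_rare     <- 200   # OTUs present in few samples at low counts

# Simulated counts (hypothetical data, samples in rows, OTUs in columns)
abundant <- matrix(rnbinom(n_samples * n_abundant, mu = 500, size = 1),
                   nrow = n_samples)
rare <- matrix(rbinom(n_samples * n_rare, size = 1, prob = 0.05) *
                 rpois(n_samples * n_rare, lambda = 2),
               nrow = n_samples)
otu_sim <- cbind(abundant, rare)

# Keep OTUs observed in at least 10% of samples (arbitrary illustrative cutoff)
prevalence   <- colSums(otu_sim > 0) / n_samples
otu_filtered <- otu_sim[, prevalence >= 0.10, drop = FALSE]

# Bray-Curtis distances on the full and on the filtered table
d_all      <- vegdist(otu_sim, method = "bray")
d_filtered <- vegdist(otu_filtered, method = "bray")

# Correlation between the two sets of pairwise distances
r <- cor(as.vector(d_all), as.vector(d_filtered))
print(r)
```

If the correlation is close to 1 on the real table as well, filtering rare OTUs should have little effect on the downstream ordination or PERMANOVA. Whether that holds for a given dataset has to be checked on that dataset.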
Next, I used adonis to test whether the distances between samples are associated with any of my metadata, for example library size:
adonis(dist.dat ~ otuDescription$libSize)
Call:
adonis(formula = dist.dat ~ otuDescription$libSize)
Permutation: free
Number of permutations: 999
Terms added sequentially (first to last)
                        Df SumsOfSqs MeanSqs F.Model      R2 Pr(>F)
otuDescription$libSize   1     3.547  3.5472  9.2281 0.01396  0.001 ***
Residuals              652   250.622  0.3844         0.98604
Total                  653   254.169                 1.00000
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
This gives a significant p-value. However, the p-value depends on sample size, so as I understand it PERMANOVA results should be interpreted alongside the R2, which in my case is very low (~0.014). Am I correct in concluding that library size explains only about 1.4% of the variation in the distances and is therefore not an important factor?
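For reference, the R2 and F.Model columns can be reproduced from the sums of squares and degrees of freedom printed in the table above. A short sketch of the arithmetic, using only the rounded values from that output (the variable names are just illustrative):

```r
# Values copied from the adonis table above (rounded as printed)
ss_term  <- 3.547     # SumsOfSqs for otuDescription$libSize
ss_resid <- 250.622   # SumsOfSqs for Residuals
ss_total <- 254.169   # SumsOfSqs for Total
df_term  <- 1
df_resid <- 652

# R2 is the share of the total sum of squares attributed to the term
r2 <- ss_term / ss_total

# Pseudo-F is the term mean square divided by the residual mean square
f_model <- (ss_term / df_term) / (ss_resid / df_resid)

print(round(r2, 4))
print(round(f_model, 2))
```

With about 650 samples, even a term that accounts for roughly 1.4% of the total sum of squares yields a pseudo-F of about 9, which is why the permutation test can come out significant despite the small R2.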
Thanks, Dave