Hi all,
I am currently facing an issue while working with the microbiome package in R and would greatly appreciate your insights.
> b.lgg <- divergence(subset_samples(physeq, Description == "Stool_controls"),
+ apply(abundances(subset_samples(physeq, Description == "Stool_controls")), 1, median))
> b.pla <- divergence(subset_samples(physeq, Description == "Stool_samples"),
+ apply(abundances(subset_samples(physeq, Description == "Stool_samples")), 1, median))
Error in validObject(.Object) : invalid class “phyloseq” object:
Component sample names do not match.
Try sample_names()
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/Athens
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] jeevanuDB_0.0.01 RColorBrewer_1.1-3 lubridate_1.9.3
[4] forcats_1.0.0 stringr_1.5.1 dplyr_1.1.4
[7] purrr_1.0.2 readr_2.1.4 tidyr_1.3.0
[10] tibble_3.2.1 tidyverse_2.0.0 knitr_1.45
[13] microbiome_1.24.0 ggplot2_3.4.4 devtools_2.4.5
[16] usethis_2.2.2 BiocManager_1.30.22 phyloseq_1.46.0
loaded via a namespace (and not attached):
[1] bitops_1.0-7 remotes_2.4.2.1
[3] permute_0.9-7 rlang_1.1.2
[5] magrittr_2.0.3 ade4_1.7-22
[7] compiler_4.3.1 mgcv_1.9-0
[9] vctrs_0.6.5 reshape2_1.4.4
[11] profvis_0.3.8 pkgconfig_2.0.3
[13] crayon_1.5.2 fastmap_1.1.1
[15] XVector_0.42.0 ellipsis_0.3.2
[17] caTools_1.18.2 utf8_1.2.4
[19] promises_1.2.1 tzdb_0.4.0
[21] sessioninfo_1.2.2 xfun_0.41
[23] zlibbioc_1.48.0 cachem_1.0.8
[25] GenomeInfoDb_1.38.1 jsonlite_1.8.8
[27] biomformat_1.30.0 later_1.3.2
[29] rhdf5filters_1.14.1 Rhdf5lib_1.24.1
[31] parallel_4.3.1 cluster_2.1.6
[33] R6_2.5.1 stringi_1.8.3
[35] pkgload_1.3.3 Rcpp_1.0.11
[37] iterators_1.0.14 IRanges_2.36.0
[39] timechange_0.2.0 httpuv_1.6.13
[41] Matrix_1.6-4 splines_4.3.1
[43] igraph_1.6.0 tidyselect_1.2.0
[45] rstudioapi_0.15.0 vegan_2.6-4
[47] gplots_3.1.3 codetools_0.2-19
[49] miniUI_0.1.1.1 pkgbuild_1.4.3
[51] lattice_0.22-5 plyr_1.8.9
[53] Biobase_2.62.0 shiny_1.8.0
[55] withr_2.5.2 Rtsne_0.17
[57] survival_3.5-7 urlchecker_1.0.1
[59] Biostrings_2.70.1 pillar_1.9.0
[61] KernSmooth_2.23-22 foreach_1.5.2
[63] stats4_4.3.1 generics_0.1.3
[65] RCurl_1.98-1.13 hms_1.1.3
[67] S4Vectors_0.40.2 munsell_0.5.0
[69] scales_1.3.0 gtools_3.9.5
[71] xtable_1.8-4 glue_1.6.2
[73] tools_4.3.1 data.table_1.14.10
[75] fs_1.6.3 rhdf5_2.46.1
[77] grid_4.3.1 ape_5.7-1
[79] colorspace_2.1-0 nlme_3.1-164
[81] GenomeInfoDbData_1.2.11 cli_3.6.2
[83] fansi_1.0.6 gtable_0.3.4
[85] digest_0.6.33 BiocGenerics_0.48.1
[87] htmlwidgets_1.6.4 memoise_2.0.1
[89] htmltools_0.5.7 multtest_2.58.0
[91] lifecycle_1.0.4 mime_0.12
[93] MASS_7.3-60
The code shown above attempts to calculate divergence for subsets of my phyloseq object based on sample descriptions ("Stool_controls" and "Stool_samples").
I have verified that the sample names in the sample data match those of the phyloseq object using:
> rownames(sample_data(physeq))
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11"
[12] "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22"
[23] "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
[34] "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44"
[45] "45" "46" "47" "48" "49" "C1" "C2" "C4" "C5" "C6" "C7"
[56] "C8" "C9" "C10" "C11" "C12" "C13" "C14" "C15" "C16" "C17" "C18"
[67] "C19" "C20" "C21" "C22" "C23" "C24" "C25" "C26" "C27" "C28" "C29"
[78] "C30" "C31" "C32" "C33" "C34" "C35" "C36" "C37" "C38" "C39" "C40"
[89] "C41" "C42" "C43" "C44" "C45" "C46" "C47"
> sample_names(physeq)
[1] "1" "2" "3" "4" "5" "6" "7" "8" "9" "10" "11"
[12] "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22"
[23] "23" "24" "25" "26" "27" "28" "29" "30" "31" "32" "33"
[34] "34" "35" "36" "37" "38" "39" "40" "41" "42" "43" "44"
[45] "45" "46" "47" "48" "49" "C1" "C2" "C4" "C5" "C6" "C7"
[56] "C8" "C9" "C10" "C11" "C12" "C13" "C14" "C15" "C16" "C17" "C18"
[67] "C19" "C20" "C21" "C22" "C23" "C24" "C25" "C26" "C27" "C28" "C29"
[78] "C30" "C31" "C32" "C33" "C34" "C35" "C36" "C37" "C38" "C39" "C40"
[89] "C41" "C42" "C43" "C44" "C45" "C46" "C47"
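The two calls above compare names on the full object only. The same comparison on a single subset, component by component, could be sketched as follows. This is a minimal example on a made-up toy object, not my real data: `toy`, `otu`, `meta`, `ids` and the `Batch` column are hypothetical names introduced for illustration, and `physeq` would take the place of `toy`. It uses only `subset_samples()`, `otu_table()`, `sample_data()` and `sample_names()` from phyloseq.

```r
library(phyloseq)

# Toy stand-in for `physeq` (hypothetical data, two samples per group)
ids <- c("1", "2", "C1", "C2")
otu <- otu_table(
  matrix(1:12, nrow = 3, dimnames = list(paste0("OTU", 1:3), ids)),
  taxa_are_rows = TRUE
)
meta <- sample_data(data.frame(
  Description = rep(c("Stool_samples", "Stool_controls"), each = 2),
  Batch = "a",
  row.names = ids
))
toy <- phyloseq(otu, meta)

# Store the subset once and inspect each component's sample names
ps_ctrl <- subset_samples(toy, Description == "Stool_controls")
otu_ids  <- sample_names(otu_table(ps_ctrl))
meta_ids <- sample_names(sample_data(ps_ctrl))

# TRUE only if both components list the same samples in the same order
names_match <- identical(otu_ids, meta_ids)

# Names present in one component but not the other (empty if they agree)
only_in_otu  <- setdiff(otu_ids, meta_ids)
only_in_meta <- setdiff(meta_ids, otu_ids)
```

If `names_match` is `FALSE` on the real subset, the two `setdiff()` results should show which sample names differ between the components.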
If anyone has encountered a similar issue or has insights into why this might be happening, I would greatly appreciate your help.
If there are alternative approaches to calculate divergence for specific sample subsets in phyloseq, I am open to suggestions.
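One fallback I could imagine, in case it helps frame suggestions, is to subset with `prune_samples()` and compute the dissimilarity to the median profile by hand. The sketch below assumes Bray-Curtis dissimilarity to the per-taxon median is what is wanted; it is not claimed to reproduce what `divergence()` does internally. The toy object and the names `toy`, `counts`, `meta`, `keep`, `abund`, `ref` and `bray_to_ref` are hypothetical, with `physeq` taking the place of `toy`.

```r
library(phyloseq)

# Toy stand-in for `physeq` (hypothetical counts and sample names)
ids <- c("1", "2", "C1", "C2")
counts <- matrix(c(10, 0, 5,  8, 2, 5,  1, 9, 5,  3, 7, 5),
                 nrow = 3, dimnames = list(paste0("OTU", 1:3), ids))
meta <- data.frame(
  Description = rep(c("Stool_samples", "Stool_controls"), each = 2),
  Batch = "a",
  row.names = ids
)
toy <- phyloseq(otu_table(counts, taxa_are_rows = TRUE), sample_data(meta))

# Subset with a logical vector instead of subset_samples()
keep <- sample_data(toy)$Description == "Stool_controls"
ps_ctrl <- prune_samples(keep, toy)

# Taxa x samples matrix (assumes taxa are rows) and the per-taxon median as reference
abund <- as(otu_table(ps_ctrl), "matrix")
ref <- apply(abund, 1, median)

# Bray-Curtis dissimilarity of each sample to the reference profile
bray_to_ref <- apply(abund, 2, function(x) sum(abs(x - ref)) / sum(x + ref))
```

Here `bray_to_ref` is a named numeric vector with one value per sample in the subset. On an object whose OTU table has samples as rows, the matrix would need transposing first.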
Thank you in advance for your time and assistance.