It's no surprise to a bioinformatician that the number of high-dimensional datasets has been increasing steadily for years now.
At present, there are hundreds of rich datasets that have many associated assays and could usefully be stored in a MultiAssayExperiment() object in R/Bioconductor.
The structure of an MAE has several requirements that could be exploited to semi- or fully automate building one from flat files containing clinical data, metadata, and experiment data. For instance, the primary IDs used at the top level of an MAE (i.e., the row names of the output of colData(MAE)) could be used to identify the best column to use as the row.names() of each experiment in the ExperimentList.
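To make the idea concrete, here is a rough sketch of what I have in mind, not a tested tool. The helper guess_id_column() and the toy tables are hypothetical and made up for illustration. It assumes the flat experiment file has samples in rows, and it simply picks the column whose values overlap most with the primary IDs. The final step relies on MultiAssayExperiment() building the sampleMap itself when the experiment column names match the row names of colData.

```r
# Hypothetical helper (not part of any package): return the name of the
# column in a flat table whose values best overlap the primary IDs.
guess_id_column <- function(tbl, primary_ids) {
  overlap <- vapply(
    tbl,
    function(col) mean(primary_ids %in% as.character(col)),
    numeric(1)
  )
  names(overlap)[which.max(overlap)]
}

# Toy clinical table; its row names play the role of rownames(colData(MAE)).
clinical <- data.frame(age = c(61, 47, 55),
                       row.names = c("P1", "P2", "P3"))

# Toy flat experiment table with samples in rows (made-up values).
rna_flat <- data.frame(
  batch   = c("a", "b", "a"),
  patient = c("P3", "P1", "P2"),
  geneA   = c(1.2, 0.4, 2.2),
  geneB   = c(5.1, 4.8, 6.0),
  stringsAsFactors = FALSE
)

id_col <- guess_id_column(rna_flat, rownames(clinical))

# Reshape to features in rows and samples in columns, named by the guessed IDs.
rna <- t(as.matrix(rna_flat[, c("geneA", "geneB")]))
colnames(rna) <- rna_flat[[id_col]]

# Only attempt the MAE itself if the package is installed.
if (requireNamespace("MultiAssayExperiment", quietly = TRUE)) {
  mae <- MultiAssayExperiment::MultiAssayExperiment(
    experiments = list(rna = rna),
    colData = clinical
  )
}
```

A real version would need to handle ties, partial matches, and IDs that differ only by prefix or case, which is exactly the part I'm hoping someone has already solved.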
So, it occurs to me that people out there may already have relatively robust algorithms for creating MAEs this way. Is anyone aware of one?
Thank you!