Open-source web server for multi-omics data storage and descriptive visualization
2.0 years ago
Zhilong Jia ★ 2.2k

Is there an open-source web server (or framework) available for managing multi-omics data and sample metadata, with descriptive visualisation?

Such a web server would store raw multi-omics data (genomics, transcriptomics, proteomics, metabolomics, microbiome, epigenome) together with key omics files or the paths to them, such as VCF files and expression matrices. Descriptive visualisation of the matrix data would be a plus. Thank you.

webserver multi-omics open-source
2.0 years ago

Gen3 is probably the closest to what you were thinking of. Self-hosting is possible, but you will probably need full-time engineer(s) in your organization to set up and maintain the system.

Once your data volume reaches the point where the benefits of such a system outweigh the effort of setting one up, customizing and maintaining it unfortunately warrants full-time engineers. A LinkedIn engineer has nicely elaborated on the challenges of building and maintaining such a system. Many large companies have worked on internal tooling to make datasets discoverable across the whole organization by gathering all metadata into a central data catalogue, and basically all of them ended up building custom systems to meet their demands. Thus, there is no shortage of open-source systems you could customize to manage your metadata, but none will work out of the box, and all still require substantial work on your side.

There are also some other efforts to build data platforms with a biology/genomics focus, but as far as I know the Elixir Data Catalogue, the European Genomic Data Infrastructure (GDI) and the German Human Genome-Phenome Archive are all still works in progress.

For raw data storage, you could also take a look at Hail, but maybe a simple object store with a Mantis index is already sufficient for your needs. If you need to serve bioinformatics file formats over a network, various implementations of the htsget protocol (e.g. in Rust) are available. For data versioning, Restic, or its reimplementation Rustic, could be a relatively straightforward solution that also works without much overhead at the level of a single workgroup.
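To give a feel for what talking to an htsget server involves, here is a minimal Python sketch of the client side, not a full client. It only builds the ticket request URL and extracts the data block URLs from a ticket response. The function names and the host `htsget.example.org` are made up for illustration; the endpoint layout (`/reads/<id>`, `/variants/<id>`), the query parameters and the ticket JSON shape follow my reading of the GA4GH htsget specification, so check them against the server you actually deploy.

```python
import json
from urllib.parse import quote, urlencode


def build_htsget_url(base_url, record_id, kind="reads", fmt=None,
                     reference_name=None, start=None, end=None):
    """Build the URL for an htsget ticket request (hypothetical helper).

    start is 0-based inclusive and end is exclusive, as in the htsget spec.
    """
    if kind not in ("reads", "variants"):
        raise ValueError("kind must be 'reads' or 'variants'")
    if (start is not None or end is not None) and reference_name is None:
        # Servers are expected to reject a range without a reference name.
        raise ValueError("start/end require reference_name")

    params = {}
    if fmt is not None:
        params["format"] = fmt            # e.g. BAM, CRAM, VCF, BCF
    if reference_name is not None:
        params["referenceName"] = reference_name
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end

    url = f"{base_url.rstrip('/')}/{kind}/{quote(record_id)}"
    return f"{url}?{urlencode(params)}" if params else url


def ticket_urls(ticket_json):
    """Return (url, headers) pairs from an htsget ticket response body.

    The client fetches each URL in order, sending the given headers,
    and concatenates the payloads to obtain the requested file slice.
    """
    ticket = json.loads(ticket_json)["htsget"]
    return [(block["url"], block.get("headers", {})) for block in ticket["urls"]]


if __name__ == "__main__":
    print(build_htsget_url("https://htsget.example.org", "sample1",
                           fmt="BAM", reference_name="chr1",
                           start=0, end=1000))
```

The point of the two-step design is that the server never has to stream the file itself: the ticket can point straight at byte ranges in your object store, which fits the "simple object store plus index" setup mentioned above.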
