Question

search datasets in IGB

1

8.6 years ago

Arnaud Ceol ▴ 860

Hi,

In IGB, it is possible to directly access a lot of data, including ENCODE datasets (from the Data Access tab). Is it possible to search for a dataset by its name (instead of browsing the whole tree)?

thanks,

Arnaud

igb • 2.1k views

updated 8.6 years ago by Mason Meyer ▴ 110 • written 8.6 years ago by Arnaud Ceol ▴ 860

score 3 · Accepted Answer · 2016-04-06

3

8.6 years ago

Mason Meyer ▴ 110

Hello Arnaud,

My name is Mason Meyer, support specialist for the IGB project. Thank you for taking the time to post this excellent question.

Currently, it is not possible to search the tree, but this is actually a feature planned for IGB 9.0.0, our next major release. In the meantime, we would love to hear any additional suggestions you may have for this feature, or even other features you would like to see make it into IGB 9.0.0.

Thanks again Arnaud!

ADD COMMENT • link 8.6 years ago by Mason Meyer ▴ 110

0

The basic search would be by name. It could return the datasets whose names match the search term, as well as those inside a sub-directory whose name matches the term. For instance, I've loaded the Human genome (December 2013) in IGB and enabled the UCSC DAS server. If I search for "lincRNAs", I would like to see all the lincRNAsCT* tracks from UCSC.
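To make the matching rule concrete, here is a minimal sketch in Python. It is not IGB code: the `Node` class, the `search` function, and the example tree (including the `wgEncodeExample` name) are hypothetical and only illustrate "match the dataset name, or the name of any folder above it".

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical tree node: a folder if it has children, otherwise a dataset."""
    name: str
    children: list[Node] = field(default_factory=list)


def search(node: Node, term: str, path: tuple = (), inherited: bool = False) -> list:
    """Return the paths of datasets whose own name, or the name of any
    enclosing folder, contains `term` (case-insensitive)."""
    here = path + (node.name,)
    matched = inherited or term.lower() in node.name.lower()
    if not node.children:
        # Leaf node: treated as a dataset in this sketch.
        return ["/".join(here)] if matched else []
    hits = []
    for child in node.children:
        # Once a folder matches, everything below it is reported.
        hits.extend(search(child, term, here, matched))
    return hits


# Example tree; the names are illustrative only.
tree = Node("UCSC", [
    Node("lincRNAs", [Node("lincRNAsCTAdipose"), Node("lincRNAsCTBrain")]),
    Node("ENCODE", [Node("wgEncodeExample")]),
])

for hit in search(tree, "lincRNAs"):
    print(hit)
```

With this rule, a term that matches a folder name returns every dataset under that folder, so the result list stays useful even when individual track names are cryptic.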

If you ever decide to go even further, it would be great to add searchable metadata to the QuickLoad; then we could search for samples by cell type, tissue, etc.

8.6 years ago by Arnaud Ceol ▴ 860

0

For accessing the ENCODE datasets, do you think we should take the data from the UCSC system or is there a better source?

UCSC exposes its database for querying via interactive MySQL sessions and also allows programmatic access to it. In an email exchange with them, they said they were planning to discontinue the DAS service. So we were thinking about writing an IGB "Santa Cruz Genome" App that would offer a user-friendly interface to the UCSC Genome Database (a bit like the Table Browser, I guess) so that users can browse and search the data and load datasets into IGB. The Savant genome browser attempted to do this many years ago, but that project is defunct (so far as I know), and I don't know how far they got with it.
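As a rough illustration of what a name search against the public UCSC MySQL server could look like, the sketch below only builds the `mysql` client command; it does not run it. The host `genome-mysql.soe.ucsc.edu` and the user `genome` are UCSC's published public-access settings, while the function name, the `hg38` database choice, and the `lincRNAsCT` prefix are assumptions made for the example.

```python
import re

# UCSC's public MySQL server and read-only account.
UCSC_HOST = "genome-mysql.soe.ucsc.edu"
UCSC_USER = "genome"


def table_search_command(database: str, prefix: str) -> list:
    """Build (but do not run) a mysql client command that lists the tables
    in `database` whose names start with `prefix`. Hypothetical helper."""
    for value in (database, prefix):
        # Keep the sketch safe: only plain alphanumeric names are accepted,
        # so nothing needs escaping inside the SQL string.
        if not re.fullmatch(r"[A-Za-z0-9]+", value):
            raise ValueError(f"unsupported name: {value!r}")
    query = f"SHOW TABLES LIKE '{prefix}%'"
    return [
        "mysql",
        f"--user={UCSC_USER}",
        f"--host={UCSC_HOST}",
        "--no-auto-rehash",
        f"--database={database}",
        f"--execute={query}",
    ]


# Assumed example: search an assumed assembly database for an assumed prefix.
command = table_search_command("hg38", "lincRNAsCT")
print(" ".join(command))
```

Passing the resulting list to `subprocess.run` would execute the query, assuming the `mysql` client is installed and the server is reachable. An App inside IGB would presumably use a JDBC connection instead.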

Another option might be to use the Ensembl REST APIs instead. I think that might be the better choice because it could be a more stable platform to build on.

Your comments and ideas are very welcome!

8.6 years ago by Ann ★ 2.4k

0

I'm not sure what the best solution for accessing ENCODE data is. I wonder whether a cloud solution may become a good choice: AWS already hosts the modENCODE data, and Google Cloud Platform also has quite a lot of data: http://googlegenomics.readthedocs.io/en/latest/use_cases/discover_public_data/index.html

8.5 years ago by Arnaud Ceol ▴ 860