The vast amount of RNA-seq data deposited in the Gene Expression Omnibus (GEO) and the Sequence Read Archive (SRA) is still a grossly underutilized resource for biomedical research. To remove technical roadblocks to reusing these data, we have developed a web application, GREIN (GEO RNA-seq Experiments Interactive Navigator), which provides simple, user-friendly interfaces for manipulating and analyzing GEO RNA-seq data. GREIN is powered by a back-end computational pipeline for uniform processing of RNA-seq data and by the large number (>180,00) of already processed samples. We use an R-based automated pipeline, GREP2 (GEO RNA-seq Experiments Processing Pipeline), to process available RNA-seq samples of human, mouse, and rat from GEO. The front-end user interfaces provide a wealth of analytics options, including subsetting and downloading processed data, interactive visualization, statistical power analyses, construction of differential gene expression signatures and their comprehensive functional characterization, connectivity analysis with LINCS L1000 data, and submission for processing of data sets that have not yet been processed. The combination of the massive amount of back-end data and front-end analytics options driven by user-friendly interfaces makes GREIN a unique open-source resource for reusing GEO RNA-seq data. GREIN is freely accessible at: https://shiny.ilincs.org/grein.
The conceptual outline of GREP2 and GREIN:
The pre-print version of the paper can be found here.
Looks very useful - thanks for sharing!
I have two questions:
1) Are transcript-level estimates available?
2) Why is GSE26024, among others, not in there?
Hello, Thanks for your feedback and queries.
1. Transcript level estimates are available. If you select any dataset and go to the counts table tab, both gene and transcript level estimates are downloadable from there.
2. The SRA accession ID for the RNA-seq samples of GSE26024 is not available in GEO and as such we couldn't process this dataset.
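For readers who want to check whether a GEO record links to SRA before looking for it in GREIN, here is a minimal Python sketch. It is not part of GREP2 or GREIN. It assumes you have already downloaded the record in GEO's SOFT text format and that SRA links appear as `!Sample_relation = SRA: ...term=SRX...` lines; the function name `sra_accessions` and the example records (including the placeholder accessions) are made up for illustration.

```python
import re

# Assumption: SRA links in a GEO SOFT record look like
#   !Sample_relation = SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX...
# The accession is taken from the "term=" query parameter.
_SRA_RELATION = re.compile(
    r"^!(?:Series|Sample)_relation\s*=\s*SRA:\s*\S*?term=(\w+)",
    re.MULTILINE,
)

def sra_accessions(soft_text):
    """Return the SRA accessions referenced in a GEO SOFT record (hypothetical helper)."""
    return _SRA_RELATION.findall(soft_text)

# Illustrative records with placeholder accessions, not real GEO entries.
record_with_sra = "\n".join([
    "^SAMPLE = GSM0000001",
    "!Sample_title = example sample",
    "!Sample_relation = BioSample: https://www.ncbi.nlm.nih.gov/biosample/SAMN00000001",
    "!Sample_relation = SRA: https://www.ncbi.nlm.nih.gov/sra?term=SRX0000001",
])
record_without_sra = "\n".join([
    "^SAMPLE = GSM0000002",
    "!Sample_title = example sample without raw reads",
])

print(sra_accessions(record_with_sra))
print(sra_accessions(record_without_sra))
```

An empty result means the record carries no SRA link in this format, which corresponds to the situation described above where the raw reads cannot be retrieved for processing.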
That's really cool! What about transcript level abundances? Those would often be nicer to have as they are corrected for all the biases...
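To illustrate the difference between estimated counts and length-normalized abundances, here is a minimal Python sketch of the standard transcripts-per-million (TPM) calculation. It is not taken from GREIN or GREP2. It assumes you have per-transcript estimated counts and effective lengths from a quantification tool; the function name `tpm` and the numbers are made up for illustration.

```python
def tpm(counts, effective_lengths):
    """Convert estimated counts to TPM using effective transcript lengths.

    Each count is divided by its effective length, and the resulting
    rates are rescaled so that they sum to one million.
    """
    rates = [c / length for c, length in zip(counts, effective_lengths)]
    total = sum(rates)
    if total == 0:
        # No reads assigned to any transcript: return all zeros.
        return [0.0 for _ in rates]
    return [r / total * 1e6 for r in rates]

# Illustrative values only (hypothetical transcripts).
example_counts = [100, 200, 300]
example_lengths = [1000, 2000, 1000]
example_tpm = tpm(example_counts, example_lengths)
print(example_tpm)
```

In this example the first two transcripts end up with the same TPM even though the second has twice the count, because it is also twice as long.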