Unless I am mistaken, there does not appear to be a stable repository for archiving RNA-seq data, akin to how GEO and ArrayExpress archive microarray data. We are preparing to publish some mRNA-seq results and I would like some input on the best ways to both present and archive the raw and processed data. The current thinking (aside from GSEA in the paper) is to include the Cufflinks-processed data as a supplementary table. The main questions I have are:
- How to make the raw short-reads available?
- Is it worthwhile to make the aligned (bam) files available and if so, how?
- Other than software, software versions, non-default parameters and hardware information, what technical data should be provided in the manuscript?
- If I am going to archive the short reads or alignment files, what is the best way to attach the relevant metadata about the samples?
All true, with one caveat. GEO says: "Alignment files (e.g. BAM, SAM) should not be supplied as processed data files." http://www.ncbi.nlm.nih.gov/geo/info/seq.html
I think they consider it too much data. Any sequence files submitted to GEO end up in SRA with a link.
Ah, I missed that. Thanks. Still, I'm positive I have used alignment files from GEO before (may have been Eland output rather than BAM/SAM) so maybe they used to allow it but feel it's too much now (which is understandable).
I would also add quantified data (isoform/gene level). It makes it a lot easier for the rest of us to quickly test hypotheses on your data.
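On the question of attaching sample metadata to the raw files, here is a minimal sketch of one way to build a tab-delimited manifest that pairs each sample with its raw file and an MD5 checksum. The column names (`sample`, `condition`, `raw_file`, `md5`) and the function names are my own assumptions for illustration, not a template required by GEO or SRA, so check the repository's current submission guidelines for the fields they actually expect.

```python
import csv
import hashlib
from pathlib import Path


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(samples, out_path):
    """Write one tab-delimited row per sample: descriptive fields plus a checksum.

    `samples` is a list of dicts with the hypothetical keys
    'sample', 'condition' and 'raw_file' (a path to the reads file).
    """
    fields = ["sample", "condition", "raw_file", "md5"]
    with open(out_path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fields, delimiter="\t")
        writer.writeheader()
        for row in samples:
            record = {
                "sample": row["sample"],
                "condition": row["condition"],
                # Store only the file name so the manifest is portable.
                "raw_file": Path(row["raw_file"]).name,
                "md5": md5sum(row["raw_file"]),
            }
            writer.writerow(record)
```

Calling `write_manifest(samples, "manifest.tsv")` with one dict per library would give a single table that can travel with the reads and be extended with whatever extra columns the repository asks for.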