I found that GEO holds "processed sequence data files", while SRA holds "raw sequence data files". Processed in what way?
I am interested in bacterial RNA-seq data. Is GEO the right resource for me?
Depends on your needs. GEO contains data relevant to a particular experiment, and in most cases I believe it represents processed data (i.e. data that has been run through one or more analysis steps, such as trimming, alignment to a reference, R/BioC, etc.). This example RNA-Seq record from GEO is illustrative:
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25180
If you check under the BioSamples you'll find a Cufflinks file, a SAM file, a BEDGRAPH file, etc. These are all bound to a specific assembly version and whatever comes along with it (annotation, etc.).
SRA, however, contains the raw sequence data for an experiment, which is what you want if you plan to download and re-analyze the data on your own. They are generally all tied together via a common BioProject ID.
EDIT: In that last sentence, by 'they' I mean any data relevant to the BioProject (BioSamples, SRA, assemblies, etc).
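If you want to look up such records programmatically, here is a minimal sketch in Python that builds an NCBI E-utilities `esearch` URL for a given accession. The prefix table and the helper names (`guess_db`, `esearch_url`) are my own illustration, not anything official, and the table covers only a few common prefixes. It only constructs the URL; fetching and parsing the response is left to you.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities esearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Illustrative subset of accession prefixes and the Entrez database
# each one is assumed to belong to (not an exhaustive list).
PREFIX_TO_DB = {
    "GSE": "gds",           # GEO series
    "GSM": "gds",           # GEO sample
    "SRP": "sra",           # SRA study
    "SRX": "sra",           # SRA experiment
    "SRR": "sra",           # SRA run
    "PRJNA": "bioproject",  # BioProject
    "SAMN": "biosample",    # BioSample
}


def guess_db(accession):
    """Guess the Entrez database from the accession prefix."""
    acc = accession.upper()
    # Try longer prefixes first so "PRJNA" is matched as a whole.
    for prefix in sorted(PREFIX_TO_DB, key=len, reverse=True):
        if acc.startswith(prefix):
            return PREFIX_TO_DB[prefix]
    raise ValueError("unrecognized accession prefix: " + accession)


def esearch_url(accession):
    """Build an esearch URL that looks the accession up in its database."""
    params = {"db": guess_db(accession), "term": accession, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(esearch_url("GSE25180"))
```

Opening the printed URL in a browser (or fetching it with any HTTP client) should return the matching record IDs for the GEO series mentioned above, which you can then follow to the related BioProject and SRA entries.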