Question

What is the difference between the GEO and SRA databases at NCBI?


10.6 years ago

matija.sosic ▴ 110

I found that GEO holds "processed sequence data files", while SRA holds "raw sequence data files". In what way is the data processed?

I am interested in bacterial RNA-seq data. Is GEO the right resource for me?

GEO SRA ncbi • 17k views

updated 10.6 years ago by Chris Fields ★ 2.2k • written 10.6 years ago by matija.sosic ▴ 110

Ram · Accepted Answer · 2014-04-19

It depends on your needs. GEO contains data relevant to a particular experiment, and in most cases I believe this is processed data (i.e. it has been run through one or more analysis steps, such as trimming, alignment to a reference, or analysis in R/Bioconductor). This example RNA-Seq series from GEO is illustrative:

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE25180

If you check under the BioSamples, you'll find a Cufflinks file, a SAM file, a bedGraph file, etc. These are all bound to a specific assembly version and whatever comes along with it (annotation, etc.).
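
If you'd rather script this than click through the web page, here is a minimal sketch that builds the GEO FTP location where series-level supplementary (processed) files are normally kept. The function name `geo_series_suppl_url` is made up for illustration, and the sketch assumes the usual GEO layout in which the last three digits of the accession are replaced by `nnn` in the parent directory name. It does not check that the series actually has files there; for some series the processed files are attached to the individual samples instead.

```python
import re

# Root of the GEO series tree on the NCBI FTP/HTTPS server.
GEO_FTP_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def geo_series_suppl_url(accession: str) -> str:
    """Return the supplementary-file directory URL for a GSE accession.

    Hypothetical helper: only builds the URL, performs no network access.
    """
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = match.group(1)
    # Parent directory: drop the last three digits and append 'nnn',
    # e.g. GSE25180 -> GSE25nnn, GSE123 -> GSEnnn.
    stub = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_FTP_ROOT}/{stub}/{accession}/suppl/"


print(geo_series_suppl_url("GSE25180"))
```

For the example series above this prints a path under `GSE25nnn/GSE25180/suppl/`, which you can then browse or fetch with whatever download tool you prefer.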

SRA, however, contains the raw sequence data for an experiment, which is what you want if you plan to download and re-analyze the data on your own. They are generally all tied together via a common BioProject ID.

EDIT: In that last sentence, by 'they' I mean any data relevant to the BioProject (BioSamples, SRA, assemblies, etc).
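
To illustrate the BioProject link, here is a rough sketch that builds an NCBI E-utilities `esearch` URL to look up SRA records for a BioProject accession. The helper name `sra_search_url` and the accession `PRJNA000000` are placeholders, not taken from the example above, and the sketch assumes that searching the `sra` database with the bare accession as the term is enough to find the linked records. It only constructs the URL; fetching and parsing the response is left out.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def sra_search_url(bioproject_accession: str, retmax: int = 20) -> str:
    """Build an esearch URL for SRA records matching a BioProject accession.

    Hypothetical helper: the accession is passed as a plain search term,
    which is an assumption about how you want to query.
    """
    params = {
        "db": "sra",                  # search the Sequence Read Archive
        "term": bioproject_accession, # e.g. a PRJNA... accession
        "retmax": retmax,             # maximum number of IDs to return
        "retmode": "json",            # ask for a JSON response
    }
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Placeholder accession; substitute the BioProject ID of your experiment.
print(sra_search_url("PRJNA000000"))
```

The IDs that come back can then be used to find the run accessions for download with the SRA tools of your choice.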