Question

What's the difference between WGS and RefSeq databases?

2


8.5 years ago

gbdias ▴ 160

I read the RefSeq documentation in the NCBI Handbook, but the distinction is still not clear to me. I'm aware that WGS represents all assembled contigs from a sequencing project, and RefSeq supposedly has some curation...

Does that mean WGS is more complete than RefSeq (even if it includes a bunch of unannotated features)?

ncbi refseq wgs • 2.2k views


score 2 · Accepted Answer · 2016-05-31

2


8.5 years ago

Denise CS ★ 5.2k

I don't think those things are really comparable, as they mean different things. Annotation is only possible once the sequences are available. RefSeq and others provide the annotation of these sequences (e.g. the Ensembl gene set), whether or not they are assembled (yet). The genomic sequence comes from Whole Genome Sequencing (WGS) experiments, and we carry out the annotation of genes, transcripts, genetic variants, regulatory regions, etc.
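As a practical aside, one way to tell which kind of record you are looking at is the accession format. Below is a minimal sketch, assuming the usual NCBI conventions: RefSeq accessions start with two letters and an underscore (NC_, NW_, NM_, NZ_, ...), while WGS contig accessions are a 4- or 6-letter project prefix, a 2-digit assembly version, and then the contig digits. The function name `classify_accession` and the regular expressions are my own simplification, not an NCBI API, and the example accessions only illustrate the formats.

```python
import re

# Assumption: RefSeq accessions are two letters + underscore + identifier,
# optionally followed by ".version" (e.g. NC_000001.11).
REFSEQ_RE = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")

# Assumption: WGS contig accessions are either
#   4 letters + 2-digit assembly version + 6 or more digits, or
#   6 letters + 2-digit assembly version + 7 or more digits,
# optionally followed by ".version" (e.g. AAAA01000001.1).
WGS_RE = re.compile(
    r"^(?:[A-Z]{4}\d{2}\d{6,}|[A-Z]{6}\d{2}\d{7,})(\.\d+)?$"
)


def classify_accession(acc: str) -> str:
    """Return 'RefSeq', 'WGS', or 'other' based on the accession format only."""
    acc = acc.strip().upper()
    # Check RefSeq first: RefSeq copies of WGS contigs look like NZ_<WGS accession>.
    if REFSEQ_RE.match(acc):
        return "RefSeq"
    if WGS_RE.match(acc):
        return "WGS"
    return "other"


if __name__ == "__main__":
    for acc in ["NC_000001.11", "NZ_AAAA01000001.1", "AAAA01000001.1", "U49845.1"]:
        print(acc, classify_accession(acc))
```

This only inspects the identifier string; it says nothing about how much annotation a given record actually carries.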


0


Thank you for the explanation. What if I wanted to find all ERVs in a primate genome, for example? Knowing that most of these sequences are not annotated, WGS is the way to go, right? I mean, RefSeq would not include non-annotated, non-protein-coding sequences even if they are assembled in WGS, would it?

8.5 years ago by gbdias ▴ 160