I am trying to download either full genomes or WGS (whole-genome shotgun) assembled sequences, depending on what is available, for several Drosophila species.
For most species, I was able to find an entry in the NCBI Genome database (e.g., http://www.ncbi.nlm.nih.gov/genome/genomes/3489) that linked to a WGS download page offering zipped FASTA files (http://www.ncbi.nlm.nih.gov/Traces/wgs/?val=AFFE02, under the downloads tab). These archives were all around 50 MB each.
However, several species were not available in the Genome database, and I could only find them in the SRA database. After I downloaded the .sra files and converted them to FASTQ format, the results were very large (three were around 10 GB and one was 26 GB), which seemed strange compared with the 50 MB WGS archives.
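To make the comparison concrete rather than relying on file sizes alone, here is a rough sketch of how the total number of bases in each file could be counted. It assumes the FASTQ files use the plain four-lines-per-record layout and that the assembly is ordinary FASTA; the file names in the usage comment are hypothetical placeholders, not the actual files I downloaded.

```python
import gzip
from typing import Iterable, TextIO


def open_maybe_gzip(path: str) -> TextIO:
    """Open a plain or gzip-compressed text file for reading."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def count_fasta_bases(lines: Iterable[str]) -> int:
    """Sum sequence lengths in FASTA text, skipping '>' header lines."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


def count_fastq_bases(lines: Iterable[str]) -> int:
    """Sum read lengths in FASTQ text.

    Assumes exactly four lines per record, so the sequence is always
    the second line (index 1) of each group of four.
    """
    return sum(len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1)


def fold_ratio(read_bases: int, assembly_bases: int) -> float:
    """How many times more bases the reads contain than the assembly."""
    return read_bases / assembly_bases


# Usage sketch (hypothetical file names):
# with open_maybe_gzip("assembly.fasta.gz") as fa, open_maybe_gzip("reads.fastq") as fq:
#     print(fold_ratio(count_fastq_bases(fq), count_fasta_bases(fa)))
```

Note that a FASTQ file also stores one quality character per base, so its size on disk is a bit more than twice its base count even before any other difference between the two kinds of file.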
Why are the .sra and FASTQ files so much larger than the zipped WGS files?
Thanks!