Where can I find the raw sequencer output of two strains of S. pneumoniae (FASTQ format) along with their corresponding reference genomes (FASTA format)?
10.4 years ago
John Smith ▴ 320

I am looking for any two strains of S. pneumoniae for which both the raw sequencer output (FASTQ format) and a reference genome (FASTA format) are available, since I plan to align the raw reads to the reference genomes using Bowtie 2. I know that I could generate artificial FASTQ files with a read simulator, but I need to work with data that came out of a real sequencer.
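To make the intended workflow concrete, here is a minimal sketch of the two Bowtie 2 steps (building an index from the reference FASTA, then aligning the FASTQ reads). It only assembles the command lines and does not run anything. The file names and the helper name `bowtie2_commands` are hypothetical placeholders; the flags used (`-x`, `-1`, `-2`, `-U`, `-S`) are standard Bowtie 2 options.

```python
import shlex


def bowtie2_commands(reference_fasta, index_prefix, sam_out, fastq_1, fastq_2=None):
    """Return (build_cmd, align_cmd) as argument lists for Bowtie 2.

    All paths are placeholders supplied by the caller; nothing is executed here.
    """
    # Step 1: build the index from the reference genome (FASTA).
    build_cmd = ["bowtie2-build", reference_fasta, index_prefix]

    # Step 2: align the reads (FASTQ) against that index, writing SAM output.
    align_cmd = ["bowtie2", "-x", index_prefix]
    if fastq_2 is not None:
        # Paired-end reads: one file per mate.
        align_cmd += ["-1", fastq_1, "-2", fastq_2]
    else:
        # Single-end (unpaired) reads.
        align_cmd += ["-U", fastq_1]
    align_cmd += ["-S", sam_out]
    return build_cmd, align_cmd


# Hypothetical file names, for illustration only.
build_cmd, align_cmd = bowtie2_commands(
    "strain_ref.fasta", "strain_idx", "aligned.sam",
    "reads_1.fastq", "reads_2.fastq",
)
print(shlex.join(build_cmd))
print(shlex.join(align_cmd))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once Bowtie 2 is installed and the files exist.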

I tried looking in NCBI, Ensembl, and ArrayExpress, and I did find reference genomes for some strains, such as R6. However, finding the raw data straight out of a sequencer (FASTQ format) seems to be a hard task.

In summary, where can I find two strains of S. pneumoniae for which both the aforementioned raw data and the reference genomes are available? I assume there must be popular strains for which this is easier to find than for others.

genome sequencing fastq RNA-Seq fasta • 2.6k views
10.4 years ago
Dan D 7.4k

You can get FASTQ and FASTA here.

Most strains will provide the raw data associated with the study. For example: http://www.ebi.ac.uk/ena/data/view/ERP000241
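If you would rather not click through the browser, the FASTQ links for a study can also be listed programmatically. The sketch below assumes the ENA Portal API `filereport` endpoint with `result=read_run` and the `run_accession` and `fastq_ftp` fields; check the current ENA documentation before relying on it. It builds the request URL for the study above and parses a tab-separated report into download URLs. The sample report text is made up for illustration and is not real ENA output, and the helper names are hypothetical.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint of the ENA Portal API file report service.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build a file report URL listing the FASTQ files of every run in a study."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def fastq_urls(tsv_text):
    """Map each run accession to its list of FASTQ URLs.

    Assumes the fastq_ftp column holds scheme-less paths separated by
    semicolons (two entries for paired-end runs) and may be empty.
    """
    runs = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        field = row.get("fastq_ftp") or ""
        runs[row["run_accession"]] = ["ftp://" + p for p in field.split(";") if p]
    return runs


# Made-up report text, only to show the expected shape of the input.
SAMPLE_TSV = (
    "run_accession\tfastq_ftp\n"
    "RUN_A\tftp.example.org/RUN_A_1.fastq.gz;ftp.example.org/RUN_A_2.fastq.gz\n"
    "RUN_B\t\n"
)

print(filereport_url("ERP000241"))
print(fastq_urls(SAMPLE_TSV))
```

Fetching the URL printed on the first line (for example with `urllib.request.urlopen`) and passing the response text to `fastq_urls` should give one entry per run in the study.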


I am having trouble understanding the different categories on the left of the search results (I am still new to bioinformatics). Should I look for the reference genome under "Assembly"? And should I look for the raw sequencer output under "Read" or under "Study"? Moreover, what is the difference between "Experiment" and "Run", and between "Study" and "Study (Sequence)"? Finally, what is the difference between "Update" and "Release", which appear next to some categories?
