Where Can I Find FASTQ Data (NGS Raw Data) And Published Results?
4
10
14.1 years ago
Orca ▴ 140

I would like to reproduce some published results with my own analysis pipeline in order to validate it, so I need publications whose corresponding datasets can be downloaded. If anyone has other ideas, let me know!

Orc@

next-gen sequencing data analysis • 17k views
0

What kind of analysis are you trying to perform?

0

Same question here: I'd like to find a set of FASTQ files related to a given article to show my students how to process this kind of data.

0

Thanks for your answers, but I'm looking for an article (published results) in which the raw data are available.

0

I would like to analyze indels and SNPs in the human genome.

0

I have a question. Since it is closely related, I chose to post it here.

Q: When you validate your analysis pipeline with published or publicly available data, do you have any right to publish results based on your analysis of somebody else's data? What are the norms for using publicly available NGS data?

6
14.1 years ago

You can get them from the NCBI Sequence Read Archive (SRA). To obtain FASTQ format, see the relevant section in the SRA handbook; you will probably need the SRA Toolkit for the conversion.
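
For anyone who wants to script the conversion, here is a minimal sketch in Python. It assumes a recent SRA Toolkit that ships `fasterq-dump` (older releases use `fastq-dump` instead) and that the tool is on your PATH. The function names, the `fastq` output directory, and the thread count are illustrative choices of mine, not anything prescribed by the handbook.

```python
import shutil
import subprocess
from pathlib import Path


def fasterq_dump_command(run_accession, outdir="fastq", threads=4):
    """Build the fasterq-dump command line for one SRA run accession."""
    return [
        "fasterq-dump",
        run_accession,
        "--split-files",           # paired reads go to separate _1/_2 files
        "--outdir", str(outdir),   # where the FASTQ files are written
        "--threads", str(threads),
    ]


def download_fastq(run_accession, outdir="fastq", threads=4):
    """Run fasterq-dump if it is installed; fail with a clear message otherwise."""
    if shutil.which("fasterq-dump") is None:
        raise RuntimeError("fasterq-dump not found on PATH; install the SRA Toolkit first")
    Path(outdir).mkdir(parents=True, exist_ok=True)
    # check=True raises CalledProcessError if the tool exits with an error
    subprocess.run(fasterq_dump_command(run_accession, outdir, threads), check=True)
```

You would call it as `download_fastq("SRR000001")`, replacing the accession with the run listed in the paper you want to reproduce.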

6
14.1 years ago
User 59 13k

You could also look into the European Nucleotide Archive at the EBI.
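
If you would rather fetch ready-made FASTQ files, the ENA exposes a "file report" endpoint in its Portal API that lists the FASTQ locations for a run, study, or project accession. The sketch below is only an illustration: the helper names are mine, the field list is one I believe the endpoint accepts, and the `ftp://` prefix assumes the paths in the `fastq_ftp` column are bare host/path strings.

```python
import csv
import io
from urllib.parse import urlencode
from urllib.request import urlopen

ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession):
    """Build the file report URL for a run, study, or project accession."""
    params = {
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp,fastq_md5",
        "format": "tsv",
    }
    return ENA_FILEREPORT + "?" + urlencode(params)


def parse_filereport(tsv_text):
    """Return {run_accession: [FASTQ URLs]} from a file report in TSV format."""
    runs = {}
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        # Paired-end runs list several files separated by semicolons.
        paths = [p for p in row["fastq_ftp"].split(";") if p]
        runs[row["run_accession"]] = ["ftp://" + p for p in paths]
    return runs


def fetch_fastq_urls(accession):
    """Query the ENA (needs network access) and return the parsed report."""
    with urlopen(filereport_url(accession)) as response:
        return parse_filereport(response.read().decode("utf-8"))
```

`fetch_fastq_urls` takes the accession cited in the paper and returns one list of download URLs per run, which you can hand to any download tool.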

5
14.1 years ago

NOTE: Shameless plug for our software....

You could have a look at the SRAdb R/Bioconductor package. We pull down all the metadata from the sequence read archives at EBI, NCBI, and DDBJ and consolidate that into a SQLite file that can be used from R or any other language that has a SQLite interface.
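
Since the metadata ends up in a plain SQLite file, it can be queried without R. Here is a rough sketch using Python's built-in `sqlite3` module. The table name `sra` and the columns `run_accession`, `study_accession`, `library_strategy`, and `taxon_id` are assumptions on my part; list the tables first and check the real schema of your copy of the file before relying on them.

```python
import sqlite3


def list_tables(connection):
    """Return the names of all tables in the SQLite database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


def find_runs(connection, library_strategy, taxon_id):
    """Find runs by library strategy and organism.

    Assumes a table `sra` with the columns used below (hypothetical schema).
    """
    query = """
        SELECT run_accession, study_accession
        FROM sra
        WHERE library_strategy = ? AND taxon_id = ?
        ORDER BY run_accession
    """
    # Parameters are bound with placeholders, never pasted into the SQL string.
    return connection.execute(query, (library_strategy, taxon_id)).fetchall()
```

With a connection opened via `sqlite3.connect(...)` on the downloaded file, `find_runs(connection, "WGS", 9606)` would list human whole-genome runs, provided the schema matches the assumptions above.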

0

A great idea! I will use it as soon as possible. Thanks.

0

Is there a SQLite interface for Perl?

5
13.5 years ago

If you are looking for non-human sequences, you can use the European Nucleotide Archive (ENA) at EBI. If the papers you are looking at use human data, you need to go to the European Genome-phenome Archive (EGA) at EBI or the database of Genotypes and Phenotypes (dbGaP) at NCBI. Be aware that, despite being public, almost all human data from research studies are covered by consent agreements, so you need to apply for access before you can download or use the data. Unless you need to replicate a specific study, it would be easier to use data from the 1000 Genomes Project or similar projects, which you can download directly from the 1kg web site.

As a side note: for downloading these BIG data sets, it is better to use Aspera than FTP when that option is provided (see 1kg data access).

