Question

How to download Oxford Nanopore sequencing fast5 files from SRA

0

5.0 years ago

va90 ▴ 50

Hi,

I am trying to download direct RNA-seq data produced by Oxford Nanopore sequencing from SRA (SRP174366). I know how to use sratoolkit (prefetch or fastq-dump) to download fastq files, but I do not know how to download the Nanopore fast5 files from SRA.

Any help would be much appreciated,

Thanks, Vahid.

RNA-Seq • 7.2k views

ADD COMMENT • link updated 5.0 years ago by GenoMax 147k • written 5.0 years ago by va90 ▴ 50

score 2 · Answer 1 · 2019-12-01

2

5.0 years ago

GenoMax 147k

If you go to the SRA record of an individual accession from the project you posted above, you can click on the run ID and then open the Data Access tab on the page that opens. The data in its original fast5 format is listed in the relevant section at the bottom of that tab, with links to AWS and GCP. You can use the AWS links with wget or curl to download the data to your local server. If you can use Google Compute, the link to the data in Google storage is also available.
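
The wget/curl step described above can be sketched roughly as follows. This is a minimal sketch, not an official SRA workflow: the function names dest_for_url and fetch_original_file are made up for illustration, and the example.org URL is a placeholder for whatever link is actually copied from the Data Access tab.

```shell
#!/bin/sh
# Sketch only: the URL must be copied by hand from the run's Data Access tab.
# Function names here are hypothetical helpers, not part of any toolkit.

# Print the local path a URL would be saved to (any query string is stripped).
dest_for_url() (
    url=$1
    outdir=${2:-.}
    name=$(basename "${url%%[?]*}")
    printf '%s/%s\n' "$outdir" "$name"
)

# Download one file with curl, falling back to wget if curl is not installed.
fetch_original_file() (
    url=$1
    outdir=${2:-.}
    dest=$(dest_for_url "$url" "$outdir")
    mkdir -p "$outdir" || exit 1
    if command -v curl >/dev/null 2>&1; then
        curl --fail --location --output "$dest" "$url"
    else
        wget --output-document="$dest" "$url"
    fi
)

# Usage (placeholder URL, not a real link):
# fetch_original_file "https://example.org/SRRXXXXXXX/reads.fast5.tar.gz" fast5_raw
```

The original-format files can be large, so it may be worth checking free disk space before starting the download.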

ADD COMMENT • link 5.0 years ago by GenoMax 147k

0

Is there a way to do this from the command line (if not using AWS)? I used SRA Toolkit to grab fastq files for Illumina sequences, but now I need a few Nanopore sequences as well. I'd like to keep doing everything from the command line, but it seems I can't do it with SRA Toolkit. Wondering if there is another way... (EDIT: Never mind. I found out from a colleague that NCBI normally prefers fastq over fast5 because of the size and format of fast5 files. So the fast5 files are unavailable, and I am assuming that the only step applied was conversion from fast5 to fastq, with no cleaning or trimming.)

ADD REPLY • link 4.1 years ago by sovrappensiero ▴ 100

0

How did you end up getting the Nanopore sequencing data off of SRA in fastq format? fastq-dump from SRA Toolkit is giving me a segmentation fault.

ADD REPLY • link 3.9 years ago by mmacd ▴ 20

0

Which accession number are you looking at? If you look at the Data Access tab of that accession, you may be able to download the fastq files directly.

ADD REPLY • link 3.9 years ago by GenoMax 147k

0

I am looking at SRR2037194 and SRR1980727. As far as I can see, the Data Access tab does not give access to fastq files (I got a binary file when I downloaded via Data Access, which I assume is in SRA format). I was also able to use prefetch to get the SRA file, but when I use fastq-dump, it still gives me a segmentation fault.

ADD REPLY • link 3.9 years ago by mmacd ▴ 20

0

fastq-dump -X 25 -F SRR2037194 worked for me. If you can't make it work using sratoolkit, you could get the fastq files from ENA/EBI. Links below.

SRR2037194 (LINK)
SRR1980727 (LINK)
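
For context, in the fastq-dump command above -X 25 limits the dump to the first 25 spots and -F keeps the original read names, so it works as a quick test rather than a full download. If the ENA/EBI route is preferred from the command line, one possible sketch is to ask the ENA Portal API file report for the fastq locations of a run. The function names below are hypothetical helpers, and prefixing ftp:// to the reported paths is an assumption based on the column being named fastq_ftp.

```shell
#!/bin/sh
# Sketch only: helper names are made up for illustration.

# Build the ENA Portal API "filereport" URL for one run accession.
ena_filereport_url() {
    printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=fastq_ftp&format=tsv\n' "$1"
}

# Read the TSV report on stdin and print one download URL per line.
# The fastq_ftp column is located by its header name; when a run has
# several files, they are separated by ';' inside that column.
# Assumption: the reported paths carry no scheme, so ftp:// is prefixed.
fastq_urls_from_report() {
    awk -F'\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "fastq_ftp") col = i; next }
        col && $col != "" {
            n = split($col, parts, ";")
            for (j = 1; j <= n; j++) print "ftp://" parts[j]
        }
    '
}

# Usage (needs network access and curl):
# curl --fail --silent --show-error "$(ena_filereport_url SRR2037194)" \
#     | fastq_urls_from_report \
#     | while read -r u; do curl --fail --remote-name "$u"; done
```

Looking the column up by header name rather than by position keeps the parser working if extra fields are requested later.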

ADD REPLY • link 3.9 years ago by GenoMax 147k

0

The fastq-dump command still doesn't work for me. But the ENA/EBI links will work great, thank you so much for your help!

ADD REPLY • link 3.9 years ago by mmacd ▴ 20