Download multiple SRA samples and convert into fasta format at once
3.4 years ago
twangxxx • 0

Hello,

I am wondering if there is an efficient way to download multiple SRA samples and convert them to fasta format into one folder at once.

Since I am interested in a BioProject and there are ~200 SRA sample reads associated with it, I'd like to download all of them, then convert .sra files into fasta format to run BLAST.

Here is my process:

First, from the NCBI website, I exported a SraAccList.txt file listing all ~200 SRA accessions. It looks like this:

SRR1
SRR2
SRR3
SRR4
...

I tried to use prefetch from the SRA Toolkit to download all these SRA files. Here is the command I used:

prefetch --option-file SraAccList.txt

However, prefetch put each sample in its own subfolder of the present working directory, whereas I'd like all SRA files in one folder to parse afterward.
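One possible way to gather the per-accession subfolders into a single folder is sketched below. It assumes prefetch left files laid out as SRR1/SRR1.sra; the function name collect_sra and the folder name sra_all are hypothetical, not part of the SRA Toolkit.

```shell
#!/usr/bin/env bash
# Sketch: move the .sra files that prefetch left in per-accession
# subfolders into one flat directory.
# collect_sra is a hypothetical helper, not an SRA Toolkit command.

collect_sra() {
    local src_dir="$1"    # directory where prefetch was run
    local dest_dir="$2"   # flat folder that should hold every .sra file
    mkdir -p "$dest_dir"
    # Look exactly one level down (e.g. SRR1/SRR1.sra), skip files that
    # already sit in the destination, and move the rest.
    find "$src_dir" -mindepth 2 -maxdepth 2 -type f -name '*.sra' \
        ! -path "$dest_dir/*" \
        -exec mv {} "$dest_dir"/ \;
}

# Example call (assumed layout; run from the prefetch directory):
#   collect_sra . ./sra_all
```

The emptied SRR* subfolders are left in place, so they can be inspected before being removed by hand.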

Then I switched to the fastq-dump tool.

I joined all accession entries in SraAccList.txt into one long space-separated string, which looks like:

 SRR1 SRR2 SRR3 SRR4 ...

Then, I ran:

fastq-dump SRR1 SRR2 SRR3 SRR4 ...

All the resulting fastq files end up in the working directory, as I want (I haven't converted them to fasta yet), but running fastq-dump SRR1 SRR2 SRR3 SRR4 ... this way does not feel efficient, especially when there are many SRA runs to download.
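For the pending fastq-to-fasta step, a minimal sketch with awk could look like the following. It assumes plain four-line FASTQ records (sequence and quality each on a single line); the function names fastq_to_fasta and convert_all are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: convert every .fastq file in a folder to .fasta with awk.
# Assumes each record is exactly four lines: @header, sequence, +, quality.

fastq_to_fasta() {
    # Line 1 of each record becomes the ">" header; line 2 is the sequence.
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }
         NR % 4 == 2 { print }' "$1"
}

convert_all() {
    local dir="$1" fq
    for fq in "$dir"/*.fastq; do
        [ -e "$fq" ] || continue            # no .fastq files present
        fastq_to_fasta "$fq" > "${fq%.fastq}.fasta"
    done
}

# Example call on the current directory:
#   convert_all .
```

Each SRRn.fastq gets a sibling SRRn.fasta in the same folder, and the original fastq files are left untouched.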

I hope someone can suggest an easy way to download all the SRA files in fasta format into one folder at once.
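A rough sketch of one way to do this without pasting the accessions by hand: feed SraAccList.txt to fastq-dump through xargs, using fastq-dump's --fasta and --outdir options so that fasta files land in one folder. The function name dump_fasta and the DUMP_CMD override (useful for a dry run with echo) are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: run fastq-dump once per accession listed in a file and write
# FASTA output into a single folder.
# dump_fasta and DUMP_CMD are hypothetical names for this example.

dump_fasta() {
    local acc_list="$1"                    # e.g. SraAccList.txt, one accession per line
    local out_dir="$2"                     # single folder for all output files
    local tool="${DUMP_CMD:-fastq-dump}"   # set DUMP_CMD=echo for a dry run
    mkdir -p "$out_dir"
    # Drop blank lines, then call the tool once per accession.
    # --fasta 0 writes FASTA with no line wrapping.
    grep -v '^[[:space:]]*$' "$acc_list" |
        xargs -n 1 "$tool" --fasta 0 --outdir "$out_dir"
}

# Example calls:
#   DUMP_CMD=echo dump_fasta SraAccList.txt fasta_out   # print the commands only
#   dump_fasta SraAccList.txt fasta_out                 # real run
```

Running one accession per call means a failed run does not abort the rest of the list; adding xargs -P would run several downloads in parallel if bandwidth allows.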

fasta SRA fastq-dump prefetch • 1.8k views

Not a direct answer to your question, but rather a workaround: I strongly recommend downloading reads from ENA rather than SRA. I find it easier and faster, and you can download fastq files directly over FTP. If you need to download lots of data, I also recommend a nice tool, now called Kingfisher download, which significantly increases download speed.
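As a hedged sketch of the ENA route: the ENA Portal API offers a filereport endpoint that returns a TSV table for a project, and the fastq_ftp column of that table can be turned into a list of download URLs. The helper names below are hypothetical, the accession in the usage comment is a placeholder, and the sketch assumes the fastq_ftp column holds semicolon-separated paths without the ftp:// scheme.

```shell
#!/usr/bin/env bash
# Sketch: build an ENA filereport URL for a project and extract FASTQ
# download URLs from the returned TSV.
# ena_report_url and fastq_urls_from_report are hypothetical helpers.

ena_report_url() {
    # $1: project (or run) accession
    printf 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=%s&result=read_run&fields=run_accession,fastq_ftp&format=tsv\n' "$1"
}

fastq_urls_from_report() {
    # Reads the TSV report on stdin, locates the fastq_ftp column by its
    # header name, and prints one ftp:// URL per file.
    awk -F '\t' '
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "fastq_ftp") col = i; next }
        col && $col != "" {
            n = split($col, parts, ";")
            for (j = 1; j <= n; j++) if (parts[j] != "") print "ftp://" parts[j]
        }'
}

# Example use (PROJECT_ACCESSION is a placeholder; needs network access):
#   curl -s "$(ena_report_url PROJECT_ACCESSION)" | fastq_urls_from_report > urls.txt
#   wget -i urls.txt -P fastq_all/
```

Writing the URLs to a file first makes it easy to check the list, or to hand it to a different downloader, before fetching anything.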


Also not an answer to your post, but a word of caution: do not be tempted to use BLAST to analyse your read data. Though BLAST is a great tool, it is absolutely not suited to NGS read analyses.


Hi lieven.sterck, thanks for the heads-up! I just realized I was confusing sequence reads with genome assemblies; the difference is clearer to me now.
