Hello,
I am wondering if there is an efficient way to download multiple SRA samples and convert them to fasta format into one folder at once.
Since I am interested in a BioProject and there are ~200 SRA sample reads associated with it, I'd like to download all of them, then convert .sra files into fasta format to run BLAST.
Here is my process:
First, from the NCBI website, I exported a SraAccList.txt file listing all ~200 SRA accessions, which looks like:
SRR1
SRR2
SRR3
SRR4
...
I tried to use prefetch from SRA Toolkit to download all these SRA files.
Here is the command line I used:
prefetch --option-file SraAccList.txt
But each sample ended up in its own subfolder of the present working directory, whereas I'd like all SRA samples in one folder to parse afterward.
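For reference, the per-accession subfolders can be flattened after the download with a small shell function. This is only a sketch: flatten_sra is a made-up helper name, and it assumes prefetch left each file as <accession>/<accession>.sra under the download directory, with the target folder located outside that directory.

```shell
# Move every <src>/<accession>/<accession>.sra into a single target folder.
# $1: directory holding the per-accession subfolders
# $2: target folder (assumed to live outside $1)
flatten_sra() {
  mkdir -p "$2" || return 1
  for f in "$1"/*/*.sra; do
    [ -e "$f" ] || continue   # the glob matched nothing
    mv "$f" "$2"/
  done
}
```

Example call, with illustrative folder names: flatten_sra ./downloads ./all_sra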
Then, I switched to the fastq-dump tool.
I joined all accession entries in SraAccList.txt into one long space-separated string, which looks like:
SRR1 SRR2 SRR3 SRR4 ...
Then, I ran:
fastq-dump SRR1 SRR2 SRR3 SRR4 ...
All the resulting fastq files are in the pwd, as I want (I haven't converted the fastq files to fasta yet), but running fastq-dump SRR1 SRR2 SRR3 SRR4 ... this way doesn't feel efficient, especially when there are so many SRA runs to download.
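For the fastq-to-fasta step on files that are already downloaded, a short awk filter is usually enough. The sketch below assumes strict four-line FASTQ records (header, sequence, "+", qualities) with no wrapped sequence lines; fastq_to_fasta is just an illustrative function name.

```shell
# Read FASTQ on stdin, write FASTA on stdout.
# Assumes exactly four lines per record.
fastq_to_fasta() {
  awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # header: replace leading "@" with ">"
       NR % 4 == 2 { print }                     # sequence line, copied unchanged'
}
```

Example call, with an illustrative file name: fastq_to_fasta < SRR1.fastq > SRR1.fasta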
I hope someone can help me with an easy way to download SRA files in fasta format into one folder at once.
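One possible approach is to loop over the accession list and let fastq-dump write FASTA straight into a single folder, using its --fasta and --outdir options. The sketch below is not tuned for speed. The function name dump_all_fasta and the FASTQ_DUMP variable are made up for illustration; setting FASTQ_DUMP=echo gives a dry run that only prints the arguments.

```shell
# Dump every accession in a list file (one per line) as FASTA into one folder.
# $1: accession list, e.g. SraAccList.txt
# $2: output directory
dump_all_fasta() {
  mkdir -p "$2" || return 1
  # tr strips carriage returns in case the list was saved with Windows line endings
  tr -d '\r' < "$1" | while IFS= read -r acc || [ -n "$acc" ]; do
    [ -z "$acc" ] && continue   # skip blank lines
    # --fasta 0: FASTA output without line wrapping; --outdir: target folder
    "${FASTQ_DUMP:-fastq-dump}" --fasta 0 --outdir "$2" "$acc" \
      || echo "failed: $acc" >&2
  done
}
```

Example call, with an illustrative output folder: dump_all_fasta SraAccList.txt fasta_out. Accessions that fail are reported on stderr, so they can be retried afterwards.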
Not a direct answer to your question, but rather a workaround: I strongly recommend downloading reads from ENA rather than SRA. I find it easier and faster, and you can download fastq files directly over FTP. If you need to download lots of data, I also recommend a nice tool, now called Kingfisher, which significantly increases download speed.
Also not an answer to your post, but a word of caution: do not be tempted to use BLAST to analyse your read data. Though BLAST is a great tool, it is absolutely not suited for NGS read analyses.
Hi lieven.sterck, thanks for the heads-up! I just realized I was confusing sequence reads with genome assemblies, now I am more clear about them.