Hi All,
I'm trying to assemble several dozen prokaryotic genomes using SPAdes. My inputs are paired-end Illumina reads (2x125). I've learned how to use the software but am unfamiliar with programming. When it comes to bioinformatics, I only know basic Unix commands and how to navigate and manipulate files and directories on my university's Linux server.
The SPAdes command I use for a single genome assembly is:
spades.py --careful -1 my_forward.fastq.gz -2 my_reverse.fastq.gz -o /my/output/directory
It seems time-consuming to run each genome assembly one by one. Is there a way to run the entire set of separate genome assemblies in one go, to save time and trouble? Do I need to know Python scripting? I would appreciate your input, thank you!
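No Python is needed for this; a shell loop over the read files is enough. Below is a minimal sketch, not a definitive solution. It assumes (hypothetically) that the files are named `<sample>_R1.fastq.gz` and `<sample>_R2.fastq.gz` and all sit in one directory; the function name `assemble_all` and the `SPADES` variable are made up for illustration. Only the `spades.py` options from the command above are used.

```shell
#!/usr/bin/env bash
# Sketch: run one SPAdes assembly per sample found in a reads directory.
# Assumed (hypothetical) naming: <sample>_R1.fastq.gz / <sample>_R2.fastq.gz.

assemble_all() {
    local reads_dir="$1" out_root="$2"
    # SPADES is a hypothetical override, e.g. SPADES=echo for a dry run.
    local spades="${SPADES:-spades.py}"
    local fwd rev sample

    for fwd in "$reads_dir"/*_R1.fastq.gz; do
        # If nothing matches, the glob stays literal; skip it.
        [ -e "$fwd" ] || continue
        # Derive the reverse file and the sample name from the forward file.
        rev="${fwd%_R1.fastq.gz}_R2.fastq.gz"
        sample="$(basename "$fwd" _R1.fastq.gz)"
        if [ ! -e "$rev" ]; then
            echo "skipping $sample: missing $rev" >&2
            continue
        fi
        # One output directory per sample, same options as the single run.
        "$spades" --careful -1 "$fwd" -2 "$rev" -o "$out_root/$sample"
    done
}
```

After saving this to a file and sourcing it, `assemble_all /path/to/reads /my/output/directory` would run the assemblies one after another. Running `SPADES=echo assemble_all /path/to/reads /my/output/directory` first only prints the arguments that would be passed, which is a cheap way to check that the file pairing is right before committing server time.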
Hello,
I am trying to combine multiple read files into one big assembly with SPAdes, so that I get just one scaffold file.
The responses above are helpful for multiple assemblies, but I am just aiming for one.
I appreciate any suggestions that I can get, thanks!
Merge the paired read files into two files, one for the forward reads and one for the reverse reads.
For a large number of PE read files:
Merge them:
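A minimal sketch of such a merge, assuming the files follow the `*_1.fq.gz` / `*_2.fq.gz` naming implied by the notes below. The function name `merge_reads` and the output prefix are hypothetical. The shell expands each glob in sorted order, so forward and reverse reads stay in the same order as long as every sample has both files.

```shell
#!/usr/bin/env bash
# Sketch: concatenate all forward reads into one file and all reverse
# reads into another. Assumed naming: <sample>_1.fq.gz / <sample>_2.fq.gz.

merge_reads() {
    local reads_dir="$1" out_prefix="$2"
    # Decompress every forward file in glob order, recompress as one file.
    gzip -d -c "$reads_dir"/*_1.fq.gz | gzip -c > "${out_prefix}_1.fq.gz"
    # Same for the reverse files; the sorted glob keeps the pairs in step.
    gzip -d -c "$reads_dir"/*_2.fq.gz | gzip -c > "${out_prefix}_2.fq.gz"
}
```

For example, `merge_reads /path/to/reads /path/to/merged/all` would write `all_1.fq.gz` and `all_2.fq.gz`, which can then be passed to `spades.py` with `-1` and `-2` as in a single assembly. Write the output outside the reads directory, otherwise a rerun would pick up the merged files through the same glob.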
PS: replace gzip with pigz if you have it installed; it is much faster than gzip.
PS2: gzip -d -c is equivalent to zcat.
PS3: if your .gz files are already decompressed, just run cat *_1.fq > merged_1.fq
@shenwei356, thanks a lot for your help! :)
You should probably ask this as a separate question, not as an answer in another thread.
@jrj.healey, noted! This is my first time posting, thanks. :)