Hi all,
I downloaded multiple paired-end reads from the SRA (NCBI) and I want to assemble two paired-end read files with each other. All the files look like this:
SRRXXX_1.fastq
SRRXXX_2.fastq
SRRYYY_1.fastq
SRRYYY_2.fastq
SRRWWW_1.fastq
SRRWWW_2.fastq
First, I am using MEGAHIT for this. Is that appropriate for what I want to do?
Then, I tried this:
for file in *.fastq;
do
f=$(basename $file)
megahit -1 "$file"_1.fastq -2 "$file"_2.fastq -o "$file"_assembled.fasta/;
done;
But it didn't work because the file names aren't correct, which makes sense: the loop variable already holds the full file name, so _1.fastq or _2.fastq gets appended to it (e.g. SRRXXX_1.fastq_1.fastq). Unfortunately, I don't know how to proceed differently ...
How could I assemble both paired-end read files at the same time and do this for all files?
Thank you for your help, it will be greatly appreciated!
I don't know exactly what you mean by that. If you are looking to merge the R1/R2 reads because they overlap (e.g. the insert is shorter than the combined length of the two reads), then you need a program like bbmerge.sh from the BBMap suite, or FLASH. If you actually want to assemble the sequences into contigs, then megahit would be appropriate.
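As for the loop itself, here is a minimal sketch of one way to do it, assuming megahit is on your PATH and every sample follows the <accession>_1.fastq / <accession>_2.fastq naming shown in the question. The function name assemble_pairs and the _assembly suffix for the output directories are just illustrative choices. The idea is to loop over the _1 files only, strip the suffix to recover the accession, and build both file names from that:

```shell
#!/usr/bin/env bash
# Sketch: run MEGAHIT once per read pair found in a directory.
# Assumptions: megahit is installed, and mates are named
# <accession>_1.fastq and <accession>_2.fastq.

assemble_pairs() {
    local dir=${1:-.}
    local r1 r2 sample
    # Loop over the forward reads only, so each pair is visited once.
    for r1 in "$dir"/*_1.fastq; do
        [ -e "$r1" ] || continue          # no match: the glob stays literal, skip it
        sample=${r1%_1.fastq}             # strip the suffix, e.g. ./SRRXXX
        r2=${sample}_2.fastq
        if [ ! -f "$r2" ]; then
            echo "Skipping $r1: mate file $r2 not found" >&2
            continue
        fi
        # -o is an output directory; MEGAHIT creates it and refuses to
        # run if it already exists.
        megahit -1 "$r1" -2 "$r2" -o "${sample}_assembly"
    done
}

assemble_pairs .
```

Each run should leave its contigs in final.contigs.fa inside the corresponding output directory (SRRXXX_assembly, SRRYYY_assembly, and so on).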