How to assemble multiple paired-end files?
4.4 years ago
A_heath ▴ 170

Hi all,

I downloaded multiple paired-end reads from the SRA (NCBI) and I want to assemble two paired-end read files with each other. All the files look like this:

SRRXXX_1.fastq

SRRXXX_2.fastq

SRRYYY_1.fastq

SRRYYY_2.fastq

SRRWWW_1.fastq

SRRWWW_2.fastq

First, I am using MEGAHIT for this. Is that appropriate for what I want to do?

Then, I tried this:

for file in *.fastq;
do
f=$(basename $file)
megahit -1 "$file"_1.fastq -2 "$file"_2.fastq -o "$file"_assembled.fasta/;
done;

But it didn't work because the file names it builds are wrong, which makes sense: _1.fastq and _2.fastq get appended to file names that already end in _1.fastq or _2.fastq. Unfortunately, I don't know how to proceed differently ...

How could I assemble both paired-end read files at the same time and do this for all files?

Thank you for your help, it will be greatly appreciated!

paired-end Assembly • 2.0k views

I want to assemble two paired-end read files with each other.

I don't know exactly what you mean by that. If you want to merge R1/R2 reads because they overlap (e.g. the insert is shorter than the combined length of the two reads), then you need a program like bbmerge.sh from the BBMap suite, or FLASH. If you want to actually assemble the sequences into contigs, then megahit would be appropriate.
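
To make the two interpretations concrete, here is a minimal sketch that prints (rather than runs) the command for each one, for a single sample. The prefix SRRXXX comes from the question. The output names (SRRXXX_merged.fastq, SRRXXX_assembled and so on) and the helper name show_commands are made up for illustration, and the bbmerge.sh parameters shown (in1, in2, out, outu1, outu2) are worth checking against the version you have installed.

```shell
# Print the command for each interpretation, given one sample prefix.
# show_commands is a hypothetical helper; output names are illustrative assumptions.
show_commands() {
    local p="$1"
    # Interpretation 1: merge overlapping R1/R2 reads into single longer reads
    echo "bbmerge.sh in1=${p}_1.fastq in2=${p}_2.fastq out=${p}_merged.fastq outu1=${p}_unmerged_1.fastq outu2=${p}_unmerged_2.fastq"
    # Interpretation 2: assemble the read pairs into contigs
    echo "megahit -1 ${p}_1.fastq -2 ${p}_2.fastq -o ${p}_assembled"
}

show_commands SRRXXX
```

Printing first is just a cheap way to check the file names before starting a long job; once they look right, the printed line can be run as-is.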

4.4 years ago
Assa Yeroslaviz ★ 1.9k

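This should work:

for file in *_1.fastq
do
    # Strip everything from the first underscore on: SRRXXX_1.fastq -> SRRXXX
    f=$(echo "$file" | sed -E 's/_.*//')
    # Name the output directory after the sample prefix rather than the R1 file name
    megahit -1 "$f"_1.fastq -2 "$f"_2.fastq -o "$f"_assembled/
done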
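
A slightly more defensive variant of the loop above, as a sketch only: it uses shell parameter expansion instead of sed to get the prefix, and skips any sample whose _2.fastq mate is missing. The function name assemble_each_pair, the ASSEMBLER variable and the _assembled directory suffix are assumptions introduced here for illustration, not part of megahit.

```shell
# Assemble each *_1.fastq / *_2.fastq pair in the current directory separately.
# ASSEMBLER is a hypothetical override so the loop can be dry-run with "echo".
assemble_each_pair() {
    local assembler="${ASSEMBLER:-megahit}"
    local r1 r2 prefix
    for r1 in *_1.fastq; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        prefix="${r1%_1.fastq}"             # SRRXXX_1.fastq -> SRRXXX
        r2="${prefix}_2.fastq"
        if [ ! -e "$r2" ]; then
            echo "skipping $prefix: $r2 not found" >&2
            continue
        fi
        # One output directory per sample (name is an assumption)
        "$assembler" -1 "$r1" -2 "$r2" -o "${prefix}_assembled"
    done
}
```

Running `ASSEMBLER=echo assemble_each_pair` in the directory with the FASTQ files prints the arguments that would be passed for each pair without starting any assembly; a plain `assemble_each_pair` then runs megahit for real.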

Yes!! Perfect, it works great! Thank you so much for your help Assa Yeroslaviz.
