I have the merged paired-end reads and the remaining unmerged paired-end reads in two different files. How can I give these as input to SPAdes for assembly?
It is unclear whether your reads are interlaced or joined. If you have joined the reads together because they overlap, then these new merged reads can be specified as single-end reads along with the other paired-end reads:
spades.py -1 read1.fq -2 read2.fq -s merged.fq -o spades_test
However, if you mean that you have one set of reads in the interlaced format and another set in the paired-end format, pass the interlaced file with --12:
spades.py -1 read1.fq -2 read2.fq --12 merged.fq -o spades_test
You can read the SPAdes definition of interlaced here: http://spades.bioinf.spbau.ru/release3.10.1/manual.html#sec3.2
Thank you, but I am still confused. Actually, I am working with the tool BBSplit. I want to split my reads according to the reference genome to which they map. After I run BBSplit on paired-end reads, the tool puts the paired-end reads that map to a particular reference genome together in one file, and the ones that did not map are given as separate output. For example:
Command:
bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%.fq outu1=clean1.fq outu2=clean2.fq
Output: out_ecoli.fq, out_salmonella.fq, clean1.fq and clean2.fq, where out_ecoli and out_salmonella are the paired-end reads that mapped to the reference genomes of ecoli and salmonella, and clean1 and clean2 are the forward and reverse reads that did not map to any reference genome.
The BBSplit manual says that BBSplit is a tool that bins reads by mapping to multiple references simultaneously, using BBMap. The reads go to the bin of the reference they map to best. There are also disambiguation options, such that reads that map to multiple references can be binned with all of them, none of them, one of them, or put in a special "ambiguous" file for each of them. Paired reads will always be kept together.
So are out_ecoli and out_salmonella paired-end reads that are interlaced, or are they joined because they overlap?
manjumoorthy95: You obtained interleaved reads in the out_ecoli and out_salmonella files because of the way you specified your output in the bbsplit.sh command. You can easily de-interleave the reads by doing:
reformat.sh in=out_ecoli.fq out1=ecoli_R1.fq out2=ecoli_R2.fq
You could also run your original bbsplit.sh command like this to get R1/R2 reads as separate files:
bbsplit.sh in1=reads1.fq in2=reads2.fq ref=ecoli.fa,salmonella.fa basename=out_%_#.fq outu1=clean1.fq outu2=clean2.fq
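If you are unsure whether a given FASTQ file is interleaved or holds merged/single reads, a quick sanity check is to compare the names of consecutive records. The following is only a rough sketch and not part of BBTools or SPAdes: the function name is_interleaved is hypothetical, and it assumes plain 4-line FASTQ records whose mates share the same read name apart from an optional trailing /1 or /2 or a space-separated comment.

```shell
# Hypothetical helper: guess whether a FASTQ file is interleaved (R1, R2, R1, R2, ...).
# Assumptions: uncompressed 4-line FASTQ records; mates share the same name once a
# trailing /1 or /2 and anything after the first whitespace are ignored.
is_interleaved() {
    awk '
        NR % 4 == 1 {
            name = $1                 # header up to the first whitespace
            sub(/\/[12]$/, "", name)  # drop a trailing /1 or /2
            n++
            if (n % 2 == 1) { first = name }          # first mate of a candidate pair
            else if (name != first) { mismatch = 1 }  # second mate must have the same name
        }
        END {
            if (n > 0 && n % 2 == 0 && !mismatch) print "interleaved"
            else print "not interleaved"
        }
    ' "$1"
}
```

For example, is_interleaved out_ecoli.fq prints "interleaved" only if every odd-numbered record is followed by a record with the same base name. It is a heuristic, so treat the answer as a hint rather than a guarantee.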
Have you checked the SPAdes manual? This information is in the input section. I think there is an option like --merged, but I don't remember exactly. As I said, you should check the manual; you will probably find it there, because SPAdes accepts many kinds of input data.
OK, thank you, sir. I will check.
So after giving "--merged name_of_merged_file", can I give the rest of my separate paired reads, which are not merged, as "-1 read1.fq -2 read2.fq"?