Hi,
I have assembled and binned a small metagenome. Now I want to extract all reads that map to the contigs of each bin, reassemble them, and see whether the assemblies improve significantly.
For this I have mapped the original reads back to the bins using bowtie2.
I know that bowtie can output all mapped reads in FASTQ format, which would be perfect for my needs.
I have a mixture of paired and unpaired reads after adapter trimming and quality clipping, and I want to keep the paired-end information where possible. The problem is that when running bowtie with paired-end settings, I only have the option to output the concordantly aligning pairs (argument --al-conc). This does not include pairs whose mates align to separate contigs. Since such non-concordantly aligning pairs may connect different contigs into possible scaffolds, I would like to include them as well. The option --un-conc, however, outputs all other pairs: those that align non-concordantly and those that don't align at all.
When running bowtie with unpaired reads, it is quite straightforward to output all aligned single reads (just add the --al argument), but then I would have to rather painfully reconstruct the read pairs and the orphaned reads from that output.
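The only workaround I can think of is to skip the FASTQ output options and filter the SAM output by flag instead. Below is a rough Python sketch of that idea, not a tested pipeline. It uses only the FLAG bits defined in the SAM specification and keeps every pair in which at least one mate aligned, plus every aligned unpaired read. The function names (classify, to_fastq, split_sam) and the /1 and /2 name suffixes are my own choices for this sketch, and I am assuming both mates of a pair are present in the SAM file so they can be matched by name afterwards.

```python
# SAM FLAG bits, as defined in the SAM specification
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
REVERSE = 0x10
FIRST = 0x40
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def classify(flag):
    """Return 'pair', 'single', or None (discard) for a SAM FLAG value."""
    if flag & (SECONDARY | SUPPLEMENTARY):
        return None  # keep each read once, from its primary record
    if flag & PAIRED:
        # keep the pair if this read or its mate aligned anywhere,
        # regardless of whether the alignment is concordant
        both_unmapped = (flag & UNMAPPED) and (flag & MATE_UNMAPPED)
        return None if both_unmapped else "pair"
    return None if flag & UNMAPPED else "single"


def to_fastq(fields):
    """Turn the tab-split fields of one SAM record into a FASTQ entry."""
    name, flag, seq, qual = fields[0], int(fields[1]), fields[9], fields[10]
    if flag & REVERSE:
        # SAM stores reverse-strand reads reverse-complemented; undo that
        seq = seq.translate(_COMPLEMENT)[::-1]
        qual = qual[::-1]
    if flag & PAIRED:
        # hypothetical naming convention for this sketch
        name += "/1" if flag & FIRST else "/2"
    return f"@{name}\n{seq}\n+\n{qual}\n"


def split_sam(lines):
    """Collect FASTQ entries for kept paired reads and kept single reads."""
    pairs, singles = [], []
    for line in lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        kind = classify(int(fields[1]))
        if kind == "pair":
            pairs.append(to_fastq(fields))
        elif kind == "single":
            singles.append(to_fastq(fields))
    return pairs, singles
```

Something like `pairs, singles = split_sam(open("bin1.sam"))` (file name made up) would then give the two sets of reads, with the mates in `pairs` still to be split into separate /1 and /2 files.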
Is there any elegant way to solve this problem?