Hi,
I'm trying to work with some Illumina shotgun metagenomic reads (2x150 bp). I've tried merging the forward and reverse reads with BBmerge and PEAR, but both tools merge only about 30% of the reads at most.
Would I be right in assuming that this is due to the shotgun shearing producing some larger inserts where the forward and reverse reads never actually overlap?
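To make that reasoning concrete, here is a minimal sketch of the arithmetic, assuming 150 bp reads and adapter-free fragments. The function names and the `min_overlap` threshold are illustrative only and are not taken from BBmerge or PEAR.

```python
def expected_overlap(insert_size: int, read_length: int = 150) -> int:
    """Bases covered by both reads of a pair; a negative value is the gap between them."""
    # Fragments shorter than one read are covered entirely by both reads,
    # so the overlap cannot exceed the fragment length itself.
    return min(insert_size, 2 * read_length - insert_size)


def can_merge(insert_size: int, read_length: int = 150, min_overlap: int = 10) -> bool:
    # min_overlap is a hypothetical threshold chosen for illustration.
    return expected_overlap(insert_size, read_length) >= min_overlap


for insert_size in (200, 290, 350, 500):
    print(insert_size, expected_overlap(insert_size), can_merge(insert_size))
```

Under these assumptions, any fragment longer than roughly 2 x 150 bp minus the minimum overlap cannot be merged, no matter which tool is used.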
If this is the case, would there be any benefit to merging the reads before DIAMOND analysis, or would just processing Read 1 and Read 2 separately be preferred?
In a protocol for DIAMOND and MEGAN analysis here, they suggest merging paired-end reads using fastq-join (which I assume would give similar results to BBmerge and PEAR) and then concatenating the merged and unmerged reads so that all of the data is retained.
- What would be the benefit of merging the reads at all if they are just getting combined with the unmerged reads anyway before analysis (other than having a single input file for DIAMOND)?
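For concreteness, the concatenation step described in that protocol could look roughly like the sketch below. It assumes uncompressed FASTQ files, and the function name and the file names in the usage comment are placeholders rather than anything the protocol specifies.

```python
import shutil
from pathlib import Path


def concatenate_fastq(inputs, output):
    """Append each input FASTQ file to `output`, in the order given."""
    with open(output, "wb") as out:
        for path in inputs:
            src_path = Path(path)
            if src_path.stat().st_size == 0:
                continue  # nothing to copy from an empty file
            with open(src_path, "rb") as src:
                shutil.copyfileobj(src, out)
                # Make sure the next file starts on a fresh line.
                src.seek(-1, 2)
                if src.read(1) != b"\n":
                    out.write(b"\n")


# Hypothetical usage (file names are placeholders):
# concatenate_fastq(
#     ["merged.fastq", "unmerged_R1.fastq", "unmerged_R2.fastq"],
#     "all_reads.fastq",
# )
```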
Thanks,
Ian
Hi ian.petersen
Why don't you assemble the reads into contigs, without worrying too much about merging, and then run the DIAMOND/MEGAN pipeline? With longer sequences you would greatly improve the taxonomic and functional classification.
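A rough sketch of what that could look like is below. It assumes MEGAHIT as the assembler, which is just one possible choice since none is named above. The file names, output directory and database name are placeholders, and the flags should be checked against the versions you have installed. The code only builds and prints the command lines; it does not run them.

```python
import shlex


def assembly_then_diamond(r1, r2, assembly_dir, diamond_db, daa_out):
    """Build (but do not execute) the two command lines as argument lists."""
    # Assemble the paired reads directly; no merging step is involved.
    megahit_cmd = ["megahit", "-1", r1, "-2", r2, "-o", assembly_dir]
    # MEGAHIT writes its contigs to final.contigs.fa inside the output directory.
    contigs = f"{assembly_dir}/final.contigs.fa"
    # Output format 100 is DIAMOND's DAA format, which MEGAN can import.
    diamond_cmd = [
        "diamond", "blastx",
        "-d", diamond_db,
        "-q", contigs,
        "-o", daa_out,
        "-f", "100",
    ]
    return [megahit_cmd, diamond_cmd]


# Placeholder inputs, for illustration only.
for cmd in assembly_then_diamond(
    "reads_R1.fastq", "reads_R2.fastq", "megahit_out", "nr", "contigs.daa"
):
    print(shlex.join(cmd))
```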