I had shotgun sequencing performed on DNA extracted from tissue biopsies, with the intention of using the data for metagenomic profiling. Each sample is paired-end and was split over 4 lanes to maximise output on a NextSeq 500 2x150bp sequencing run.
I ran FastQC on the FASTQ files and they all seem to be good quality; however, I am having issues merging the paired-end FASTQ files. I initially combined all of the R1 files and all of the R2 files per sample using cat. I then tried using BBMerge to merge the combined R1 and R2 files, as I have done previously, but only 10-15% of the reads merge, with the vast majority reported as ambiguous.
I used this same workflow on a previous project, which was sequenced on a HiSeq using 2x126bp chemistry, and I was able to merge all of the samples with a success rate of roughly 75% of the reads.
Am I doing something wrong? Should I expect differences based on the sequencing chemistry (2x150bp vs 2x126bp), or is the fact that the latest run was split over 4 lanes an issue?
Edit: This is the insert size info from BBMerge:
Insert range: 35 - 289
90th percentile: 278
75th percentile: 264
50th percentile: 236
25th percentile: 190
10th percentile: 135
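A rough sketch of the arithmetic behind these numbers, assuming adapters are ignored and every read is full length: two reads of length L sequenced from an insert of size I share about 2L - I bases, or the whole insert when it is shorter than one read. The function name `expected_overlap` is made up for illustration. Also note that, as far as I understand, BBMerge builds this distribution only from the pairs it managed to merge, so inserts of 300bp or more cannot show up here for 2x150bp reads.

```python
def expected_overlap(read_len: int, insert_size: int) -> int:
    """Bases shared by R1 and R2 for a given insert size, ignoring adapters."""
    # Inserts shorter than one read are covered entirely by both reads.
    # Otherwise the reads share 2 * read_len - insert_size bases, or none at all.
    return max(0, min(insert_size, 2 * read_len - insert_size))


# Percentiles copied from the BBMerge output above.
percentiles = {"10th": 135, "25th": 190, "50th": 236, "75th": 264, "90th": 278}

for name, insert in percentiles.items():
    # Compare the overlap available with 150bp reads and with 126bp reads.
    print(name, insert, expected_overlap(150, insert), expected_overlap(126, insert))
```

At the median insert of 236bp, 150bp reads would overlap by 64 bases, whereas 126bp reads would overlap by only 16. So for the same insert sizes, the longer chemistry should make merging easier rather than harder. A low merge rate is more likely if many fragments in this library are close to or longer than 300bp, which these merged-only percentiles would not reveal.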
Can you merge the read pairs from each lane separately and then cat the merged files? If the reads in your concatenated R1 and R2 files are not in the same order, the pairs will not merge properly.
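One way to test the ordering idea directly is to compare the read IDs in the concatenated R1 and R2 files record by record. This is a minimal sketch, assuming standard 4-line FASTQ records and Illumina-style headers where the read ID is the first whitespace-separated field. The helper names `read_ids` and `first_mismatch` are hypothetical.

```python
from itertools import zip_longest


def read_ids(lines):
    """Yield the read ID from each FASTQ record (the header is every 4th line)."""
    for i, line in enumerate(lines):
        if i % 4 == 0 and line.strip():
            name = line.split()[0][1:]  # first field, without the leading '@'
            # Older headers mark the mate with a /1 or /2 suffix; drop it.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name


def first_mismatch(r1_lines, r2_lines):
    """Return the 0-based record index where the IDs differ, or None if in sync."""
    pairs = zip_longest(read_ids(r1_lines), read_ids(r2_lines))
    for index, (id1, id2) in enumerate(pairs):
        # A missing mate (files of unequal length) also counts as a mismatch.
        if id1 != id2:
            return index
    return None
```

For gzipped files, the two arguments can be handles opened with `gzip.open(path, "rt")`. A result of `None` means the two files list the same reads in the same order, in which case lane order during cat is probably not the cause of the low merge rate.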
I did try that, but I got the same merge rate (around 15%).