Hi,

I have seen a few cases where bcl2fastq (v2.20) produces duplicate FASTQ entries: the read ID, raw sequence, and quality scores are all identical. This causes problems for downstream tools like Picard MarkDuplicates (e.g. Exception in thread "main" htsjdk.samtools.SAMException: Value was put into PairInfoMap more than once).

I'm hoping to get some community input on how to troubleshoot this. In my case, the issue went away when I removed the --no-lane-splitting option from my command; I then checked the FASTQs to verify that one of the two identical reads was gone (making sure to check across all L00X files; see the snippet at the end of this post).

So far, I've also heard suggestions to verify I'm on the latest bcl2fastq version and to re-run bcl2fastq (seqanswers link). Does anyone have other methods, or want to share their experience? Thanks in advance.
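For reference, the duplicate-ID check I ran was roughly the following (the sample file name is a placeholder; run it once per lane/read file):

```bash
# Print any read IDs that occur more than once in a gzipped FASTQ.
# FASTQ records are 4 lines each, so headers sit on lines 1, 5, 9, ...
zcat Sample_S1_L001_R1_001.fastq.gz \
  | awk 'NR % 4 == 1 {print $1}' \
  | sort \
  | uniq -d \
  | head
```

If this prints nothing, that file has no exact duplicate read IDs.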
Interesting, thank you! Are you referring to the --loading-threads, --processing-threads, and --writing-threads options? And do you mind going into more detail about what you mean by balancing?
Correct. What numbers are you currently using? Do you have access to a high-performance storage system, or are you writing to normal SSDs/spinning disks? Try using more --writing-threads if you don't have a high-performance storage system, and keep the overall number of threads (all three added together) low.

My current settings are:

--loading-threads 12
--processing-threads 24
--writing-threads: not set, but I think that means it defaults to 4, according to the bcl2fastq Processing Options

We have high-performance storage and a computing cluster. I will play around with these options and try to keep their numbers low.
You would definitely want writing threads to be larger than the other two by 2 to 4.
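For example (the run/output paths are placeholders, and the exact counts are just one combination that fits the advice above):

```bash
# bcl2fastq v2.20, with writing threads higher than loading and
# processing by 4, while keeping the overall total (16) low
bcl2fastq \
  --runfolder-dir /path/to/runfolder \
  --output-dir /path/to/output \
  --loading-threads 4 \
  --processing-threads 4 \
  --writing-threads 8
```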
Thank you - this worked! Why did you recommend increasing the number of write threads?
To provide an adequate writing/processing buffer, so that the output reliably gets written to disk.
If this answered your question, then consider marking the original answer as accepted (the green check mark) to provide closure to this thread.
Thank you - I will, but I would like to keep the thread open for another few days, just in case others have other suggestions.