We ran multiple samples on an Illumina NovaSeq machine. After converting the files to FASTQ format using bcl2fastq, we can see that we have some trouble with index hopping.
The image attached here shows the structure of the indices, how they are supposed to be and how we can see them after the conversion.
The color coding at the top of the image shows the four samples in question (names are in the first column to the left). The right column explains how the barcodes were supposed to be paired together. The Top Unknown Barcodes at the bottom part of the image shows how they were found by the conversion tool.
Interestingly, the two samples 6-2 and 8-2 show the highest number of reads in the complete data set (contains 30 samples) with around 20M reads, while the two samples 1-1 and 3-1 are both at the bottom of the list with the lowest number of assigned reads.
My question is whether these two results are connected. As far as I understand, if the two barcodes are not identified, the read is automatically classified as Unknown. But is it possible that reads from, e.g., sample 3-1 were assigned to sample 6-2 by mistake, or that reads from sample 1-1 were saved under sample 8-2?
To me it seems to be too much of a coincidence to see the two samples with the highest and lowest number of reads being all connected in the barcode swapping event.
Any advice would be appreciated.
cross-posted here, but got no response
What is the % of these index hopped reads compared to demultiplexed data? Was this run borderline overloaded?
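To put a rough number on that first question, one option is to take the per-barcode-pair read counts from the demultiplexing report and treat every unexpected pair made of a known i7 and a known i5 as a hopping candidate. The following is only a minimal Python sketch: the function names are hypothetical, and all barcode sequences and counts are invented placeholders, not values from this run.

```python
from collections import Counter


def classify_pair(i7, i5, expected_pairs):
    """Label one observed index pair as 'assigned', 'hopped', or 'other'."""
    known_i7 = {pair[0] for pair in expected_pairs}
    known_i5 = {pair[1] for pair in expected_pairs}
    if (i7, i5) in expected_pairs:
        return "assigned"  # matches a pair in the sample sheet
    if i7 in known_i7 and i5 in known_i5:
        return "hopped"    # both indexes are known, but the combination is not
    return "other"         # at least one index is not in the sample sheet


def hopping_percentage(pair_counts, expected_pairs):
    """Return (hopped reads as % of assigned reads, totals per class)."""
    totals = Counter()
    for (i7, i5), n_reads in pair_counts.items():
        totals[classify_pair(i7, i5, expected_pairs)] += n_reads
    assigned = totals["assigned"]
    pct = 100.0 * totals["hopped"] / assigned if assigned else float("nan")
    return pct, totals


# Invented placeholder data: two samples and a few unexpected pairs.
expected = {("AAAAAAAA", "CCCCCCCC"), ("GGGGGGGG", "TTTTTTTT")}
counts = {
    ("AAAAAAAA", "CCCCCCCC"): 900,
    ("GGGGGGGG", "TTTTTTTT"): 1100,
    ("AAAAAAAA", "TTTTTTTT"): 30,   # i7 of sample 1 with i5 of sample 2
    ("GGGGGGGG", "CCCCCCCC"): 10,   # i7 of sample 2 with i5 of sample 1
    ("NNNNNNNN", "CCCCCCCC"): 5,    # unreadable i7, not a hopping candidate
}

pct, totals = hopping_percentage(counts, expected)
print(f"hopped reads: {pct:.1f}% of assigned reads")
```

The index sequences must be written in the same orientation as in the report for the comparison to work.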
If you are able to, I suggest that you demultiplex the data using Illumina bcl-convert instead of bcl2fastq. It produces an explicit report for index hopping.
I will try bcl-convert if I can just find out how to install it on Ubuntu :-)
I managed to install the tool. When running it I get the following warning:
and after the run has finished, the file Index_Hopping_Counts.csv is empty. Any idea what it is or how to change it, if possible?
Looks like your i7 indexes are not diverse enough. If you allowed 1 mismatch in the indexes during demultiplexing, try using only perfect matches with the setting below in your sample sheet.
Yes, I tried this as well. The warning still shows up and the file is still empty.
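One way to check whether the i7 indexes really are too similar is to compare them pairwise. If m mismatches are allowed, a read can sit within m mismatches of two different indexes whenever those indexes differ in at most 2*m positions. Here is a minimal Python sketch of that check; the helper names are hypothetical and the three sequences are invented placeholders, so substitute the i7 column of your own sample sheet.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length indexes differ."""
    if len(a) != len(b):
        raise ValueError("indexes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def ambiguous_pairs(indexes, mismatches):
    """Index pairs that could both match one read at the given tolerance."""
    return [
        (a, b, hamming(a, b))
        for a, b in combinations(indexes, 2)
        if hamming(a, b) <= 2 * mismatches
    ]


# Invented placeholder i7 sequences, not taken from this run.
i7_indexes = ["ATCACGAT", "ATCACGTT", "GGCTACAG"]

print(ambiguous_pairs(i7_indexes, mismatches=1))
print(ambiguous_pairs(i7_indexes, mismatches=0))
```

With these placeholders, the first two indexes differ in a single position, so they are flagged when one mismatch is allowed and not when only perfect matches are accepted.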