Hello,
I am currently facing a problem regarding demultiplexing my dual indexed reads run on Hiseq 2000. Basically, the Lane summary looks like this:
So, each lane, having around 230 million reads. Sequencing went fine. I have put 3 samples per lane, to have around 70 million reads per sample. But I get almost half of that. I noticed, the undetermined reads are quite high. For example,
Above is the statistics of undetermined reads. They are covering almost 30,40 sometimes 60% of the flow cell. Basically, we did not allow any mistmatch while running the bcl2fastq to generate fastq files. But in those undetermined reads, I have many cases where the i7 index is fine, but the i5 index has one or two mismatches. An example is below:
I used illumina CD indices for making the libraries(a.k.a. HT index, i7: D701,702,703...i5: D501,502,503 etc).Basically, in this case, I do not care about the i5 indices as they are the same for all 3 samples per lane. For example, Sample 1: D701-D502, Sample 2: D702-D502, Sample 3: D703-D502. This is how I ran lane 1. Similar also to Lane 2, 3 and so on... Therefore, my question is the following:
Is it possible to run bcl2fastq that will demultiplex bcl file based on only index 1(i7), although the sample were prepared with dual indexing? Or is there any better way to do demultiplexing in this situation to get more reads?
I would really appreciate if anyone can help.
This is an excellent example of why one should never use the same index for all samples (either in first or second location).