Hi everyone,
Another lab ran a single-end sequencing run on a NextSeq for us, but now they can't properly demultiplex it. I'm trying to see if I can figure it out.
I ran bcl2fastq (newest version) on the files, but all reads are dumped to Undetermined_S0_L001_R1_001.fastq.gz
I've got a SampleSheet.csv file that should contain the relevant barcodes, and it is found by bcl2fastq, but clearly isn't working out. In this sample sheet, there's a single field for "AdapterRead1" and then for each sample there's an "Index" and "Index2" field. What I'm trying to figure out is what my reads SHOULD look like. For instance, should they be in the format:
AdapterRead1 + Index(x) + actual sequence
When I manually grep through the samples, I find a significant minority of them do contain the "AdapterRead1" sequence (sometimes at the beginning of the read...sometimes not). Of those, there are some that are also in the format I describe, where one of the Indexes follows AdapterRead1....but most reads don't have AdapterRead1 at all.
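The manual grep described above could be made a bit more systematic with a short script. This is only a rough sketch: `adapter_positions` is a hypothetical helper name, and it assumes a plain four-line-per-record FASTQ (optionally gzipped) and exact matching of the adapter string.

```python
import gzip
from collections import Counter


def adapter_positions(fastq_path, adapter, max_reads=100_000):
    """Tally the position at which `adapter` first occurs in each read.

    A key of -1 means the adapter was not found in that read.
    Assumes a standard four-line-per-record FASTQ file.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    positions = Counter()
    with opener(fastq_path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no // 4 >= max_reads:
                break
            if line_no % 4 == 1:  # second line of each record is the sequence
                positions[line.strip().find(adapter)] += 1
    return positions


# Example call (the adapter string is whatever AdapterRead1 is in your sheet):
# counts = adapter_positions("Undetermined_S0_L001_R1_001.fastq.gz", "<AdapterRead1>")
# print(counts.most_common(10))
```

If most reads land under -1, the adapter is largely absent. A spread of positive positions would suggest the adapter is showing up inside the read rather than at a fixed offset.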
When I look in the DemuxSummaryF1L1.txt file, I just see this at the bottom:
### Most Popular Unknown Index Sequences
### Columns: Index_Sequence Hit_Count
unknown 539084000
Since I'm not that familiar with what the sequences should look like, or how this software should behave, I am just not sure how/where to start troubleshooting.
There should be a Stats.json file in the Stats folder after demultiplexing. MultiQC parses the Stats.json file and displays the top 20 or so indexes found in the undetermined.fastq files. Hopefully, that will give you some clue.
I'm not sure what I'm looking for...but it doesn't look good?
All the other stats in the multiqc file just show that everything is undetermined
That does not make sense. If the run was set up correctly, then even if you had the wrong indexes listed in the SampleSheet, the indexes that the sequencer sees should show up in this file.
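If it helps, the unknown-index counts can also be read straight from Stats.json without going through MultiQC. The sketch below is not authoritative: it assumes the file has a top-level "UnknownBarcodes" list whose entries carry "Lane" and "Barcodes" keys, which is how bcl2fastq 2.x output is commonly laid out. The function names are made up for illustration.

```python
import json


def load_stats(path):
    """Load a Stats.json file into a dictionary."""
    with open(path) as handle:
        return json.load(handle)


def top_unknown_barcodes(stats, n=20):
    """Return {lane: [(barcode, count), ...]} with the n most frequent entries.

    Assumes stats["UnknownBarcodes"] is a list of per-lane dictionaries with
    "Lane" and "Barcodes" keys (an assumption about the file layout).
    """
    result = {}
    for entry in stats.get("UnknownBarcodes", []):
        barcodes = entry.get("Barcodes", {})
        ranked = sorted(barcodes.items(), key=lambda kv: kv[1], reverse=True)
        result[entry.get("Lane")] = ranked[:n]
    return result


# Example call (the path depends on your bcl2fastq output directory):
# for lane, ranked in top_unknown_barcodes(load_stats("Stats/Stats.json")).items():
#     print(lane, ranked[:5])
```

An empty result, or a single catch-all entry, would be consistent with the index reads not having been used at all during conversion.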
Can you show us what the RunInfo.xml file contains for the index setup? An example is below.
Since your run has two indexes, you need to provide a samplesheet that contains both indexes. There is an example here.
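For anyone checking the same thing, the read and index layout in RunInfo.xml can be summarized with a few lines of Python. This is a sketch only. The XML in `EXAMPLE_RUNINFO` is a made-up, trimmed-down illustration of a single-end run with two index reads, not the file from this thread. It relies on the `Read` elements carrying `Number`, `NumCycles` and `IsIndexedRead` attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down RunInfo.xml for a single-end run with two indexes.
EXAMPLE_RUNINFO = """<?xml version="1.0"?>
<RunInfo>
  <Run Id="EXAMPLE_RUN" Number="1">
    <Reads>
      <Read Number="1" NumCycles="75" IsIndexedRead="N" />
      <Read Number="2" NumCycles="8" IsIndexedRead="Y" />
      <Read Number="3" NumCycles="8" IsIndexedRead="Y" />
    </Reads>
  </Run>
</RunInfo>
"""


def describe_reads(runinfo_xml):
    """Return a sorted list of (read number, cycles, is_index) tuples."""
    root = ET.fromstring(runinfo_xml)
    reads = []
    for read in root.iter("Read"):
        reads.append((
            int(read.get("Number")),
            int(read.get("NumCycles")),
            read.get("IsIndexedRead") == "Y",
        ))
    return sorted(reads)


for number, cycles, is_index in describe_reads(EXAMPLE_RUNINFO):
    kind = "index" if is_index else "sequence"
    print(f"Read {number}: {cycles} cycles ({kind})")
```

Two reads flagged as indexed means the sample sheet needs both an Index and an Index2 column, with lengths that fit the cycle counts.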
I think I do have two?
Here's a piece of the sample sheet:
Can you show a read from the Undetermined read file?
Normally the index should show up in the fastq header (LINK). As you can see, in your case it does not. Something is not making sense here. Your RunInfo.xml file is showing that the run was indeed set up with indexes, but the data here says otherwise.
I don't think the SampleSheet you show above is working. What version of bcl2fastq are you using? Have you looked at the log file for the bcl2fastq run?
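As a quick way to check whether index sequences made it into the FASTQ headers, something like the following could be run over a few header lines. It is a sketch. It assumes the usual Illumina header comment layout of `<read>:<is filtered>:<control number>:<index>`, and it treats anything in the last field that is not made of A/C/G/T/N (optionally two parts joined by `+`) as "no index". The header strings in the comments are invented examples.

```python
import re

# Single index (ACGT...) or dual index written as INDEX1+INDEX2.
INDEX_PATTERN = re.compile(r"[ACGTN]+(\+[ACGTN]+)?")


def index_from_header(header):
    """Return the index sequence from a FASTQ header line, or None if absent."""
    parts = header.strip().split(" ", 1)
    if len(parts) < 2:
        return None  # no comment section after the read name
    fields = parts[1].split(":")
    # Expected comment layout: <read>:<is filtered>:<control number>:<index>
    if len(fields) < 4:
        return None
    index = fields[3]
    return index if INDEX_PATTERN.fullmatch(index) else None


# Invented examples:
# index_from_header("@NB0:1:FC:1:11101:1000:1000 1:N:0:ACGTACGT+TTGCAAGG")
#   returns "ACGTACGT+TTGCAAGG"
# index_from_header("@NB0:1:FC:1:11101:1000:1000 1:N:0:0")
#   returns None (last field is not an index sequence)
```

If every header in the Undetermined file comes back as None, that would point toward the conversion having been run without the index reads, which is worth comparing against the bcl2fastq log.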