Question

De-multiplexing Single Index BCL & Fastq reads?

0

7.2 years ago

Jon17 ▴ 20

I have a mix of single-index and dual-index Illumina NextSeq reads. bcl2fastq demultiplexed all of the dual-index reads without trouble, but I can't figure out how to demultiplex the single-index reads. Suggestions / help wanted please! I'd like to learn how to demultiplex single-index reads from BCL files, FASTQ files, or both, so any solution will be appreciated!

Undetermined_S0_L001_R1_001.fastq.gz
Undetermined_S0_L001_R2_001.fastq.gz
Undetermined_S0_L002_R1_001.fastq.gz
Undetermined_S0_L002_R2_001.fastq.gz
Undetermined_S0_L003_R1_001.fastq.gz
Undetermined_S0_L003_R2_001.fastq.gz
Undetermined_S0_L004_R1_001.fastq.gz
Undetermined_S0_L004_R2_001.fastq.gz

This is my SampleSheet.csv. Everything from sample 250 onward worked great. Samples 61-249, not so much.

Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,I5_Index_ID,index2,Sample_Project,Description
61,,,,,CGATGTAT,,NNNNNNNN,,
62,,,,,TGACCAAT,,NNNNNNNN,,
63,,,,,ACAGTGAT,,NNNNNNNN,,
115,,,,,GTGGCCAT,,NNNNNNNN,,
116,,,,,GTTTCGAT,,NNNNNNNN,,
117,,,,,CGTACGAT,,NNNNNNNN,,
166,,,,,ATCACGAT,,NNNNNNNN,,
167,,,,,TTAGGCAT,,NNNNNNNN,,
168,,,,,ACTTGAAT,,NNNNNNNN,,
220,,,,,TAGCTTAT,,NNNNNNNN,,
221,,,,,GGCTACAT,,NNNNNNNN,,
222,,,,,CTTGTAAT,,NNNNNNNN,,
247,,,,,AGTCAAAT,,NNNNNNNN,,
248,,,,,AGTTCCAT,,NNNNNNNN,,
249,,,,,ATGTCAAT,,NNNNNNNN,,
250,,,,,ATTACTCG,,AGGCTATA,,
253,,,,,ATTACTCG,,GCCTCTAT,,
254,,,,,TCCGGAGA,,GCCTCTAT,,
255,,,,,CGCTCATT,,GCCTCTAT,,
251,,,,,TCCGGAGA,,AGGCTATA,,
274,,,,,CGGCTATG,,GCCTCTAT,,
275,,,,,TCCGCGAA,,GCCTCTAT,,
276,,,,,TCTCGCGC,,GCCTCTAT,,
277,,,,,AGCGATAG,,GCCTCTAT,,
278,,,,,ATTACTCG,,AGGATAGG,,

bcl2fastq illumina demultiplex single-index • 3.3k views

ADD COMMENT • link updated 7.2 years ago by h.mon 35k • written 7.2 years ago by Jon17 ▴ 20

score 0 · Answer 1 · 2017-08-30

0

7.2 years ago

GenoMax 147k

Look into the --use-bases-mask option for bcl2fastq.

To get only the samples that carry a single index, use --use-bases-mask Y*,I6,n6,Y*. You will have to edit the RunInfo.xml file in the flowcell folder (make a backup first) and change the line <Read Number="3" NumCycles="6" IsIndexedRead="Y" /> to <Read Number="3" NumCycles="6" IsIndexedRead="N" />. Edit SampleSheet.csv to remove all 2D samples. When you run bcl2fastq, do not forget to specify a new location for the output directory.
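
For reference, a minimal sketch of how that invocation might be assembled, written in Python so the argument list can be inspected before anything runs. The paths and the sample sheet file name are placeholders, not taken from this thread. The mask is the one quoted above and has to agree with the index cycle counts in your own RunInfo.xml. The flags used (--runfolder-dir, --output-dir, --sample-sheet, --use-bases-mask) are standard bcl2fastq2 options.

```python
import shlex


def build_bcl2fastq_cmd(run_dir, out_dir, sample_sheet, mask="Y*,I6,n6,Y*"):
    """Return the argv list for a single-index bcl2fastq run.

    The default mask is the one from the answer: read 1, a 6-cycle i7 index,
    6 masked (ignored) i5 cycles, read 2. Adjust it to your run.
    """
    return [
        "bcl2fastq",
        "--runfolder-dir", run_dir,
        "--output-dir", out_dir,          # a NEW directory, not the dual-index output
        "--sample-sheet", sample_sheet,   # sheet with the dual-index rows removed
        "--use-bases-mask", mask,
    ]


# Placeholder paths (assumptions for illustration only).
cmd = build_bcl2fastq_cmd(
    "/path/to/run_folder",
    "/path/to/output_single_index",
    "SampleSheet_single.csv",
)

# shlex.join quotes the mask so the shell does not expand the '*' characters.
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run(cmd, check=True) once the paths are real.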

ADD COMMENT • link 7.2 years ago by GenoMax 147k

0

Actually, there is no need to edit RunInfo.xml. You can do everything with --use-bases-mask alone.

Also, do not use Ns as an index.
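
A rough sketch of one way to act on this: split the sample sheet rows into a single-index set (with the all-N index2 blanked) and a dual-index set, so each can be run with its own --use-bases-mask. The function names are made up for illustration, and the code assumes it is given only the header line and the sample rows, as posted in the question, not a full sheet with [Header] and [Settings] sections.

```python
import csv
import io


def split_by_index_type(data_section):
    """Return (fieldnames, single_index_rows, dual_index_rows)."""
    reader = csv.DictReader(io.StringIO(data_section))
    single, dual = [], []
    for row in reader:
        index2 = row["index2"].strip().upper()
        # Treat an empty or all-N second index as "single index".
        if index2 == "" or set(index2) == {"N"}:
            row["index2"] = ""  # drop the Ns, keep the column
            single.append(row)
        else:
            dual.append(row)
    return reader.fieldnames, single, dual


def to_csv(fieldnames, rows):
    """Serialize rows back to CSV text with the original column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Two rows copied from the sample sheet in the question.
example = (
    "Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,"
    "I5_Index_ID,index2,Sample_Project,Description\n"
    "61,,,,,CGATGTAT,,NNNNNNNN,,\n"
    "250,,,,,ATTACTCG,,AGGCTATA,,\n"
)

fields, single_rows, dual_rows = split_by_index_type(example)
print(to_csv(fields, single_rows))
print(to_csv(fields, dual_rows))
```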

ADD REPLY • link 7.2 years ago by igor 13k

0

Thanks igor, so it should look like this?

61,,,,,CGATGTAT,,,,

ADD REPLY • link 7.2 years ago by Jon17 ▴ 20

0

Yes, as long as you leave the right number of commas so that each row matches the number of fields in the header.
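
If counting commas by eye gets tedious, a small check along these lines could flag rows whose field count differs from the header. This is only a sketch: the function name is hypothetical, and it expects just the header line plus the sample rows rather than a complete sample sheet.

```python
import csv
import io


def rows_with_wrong_field_count(data_section):
    """Return (line_number, field_count) for rows that do not match the header."""
    rows = list(csv.reader(io.StringIO(data_section)))
    header = rows[0]
    bad = []
    # Line 1 is the header, so the first sample row is line 2.
    for line_no, row in enumerate(rows[1:], start=2):
        if row and len(row) != len(header):  # skip blank lines
            bad.append((line_no, len(row)))
    return bad


example = (
    "Sample_ID,Sample_Name,Sample_Plate,Sample_Well,I7_Index_ID,index,"
    "I5_Index_ID,index2,Sample_Project,Description\n"
    "61,,,,,CGATGTAT,,,,\n"   # 10 fields, matches the header
    "62,,,,,TGACCAAT,,,\n"    # one trailing comma short
)

print(rows_with_wrong_field_count(example))
```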

ADD REPLY • link 7.2 years ago by GenoMax 147k

0

2D = dual index? Assumptions are the mother of all screwups....

ADD REPLY • link 7.2 years ago by Jon17 ▴ 20

0

That is correct. The 2D part for sure :)

ADD REPLY • link 7.2 years ago by GenoMax 147k