Question

FastQ files: Double demultiplexing

3.1 years ago

compuTE ▴ 140

Hello,

Is there a way to demultiplex Illumina TruSeq RNA-seq data more than once? For example, to demultiplex first by sample and then by another index (this second index being a dual index)?

Of course, it's easy to demultiplex the samples using bcl2fastq. But in this case, I end up with a fastq file per sample containing reads with dual indexes that I want to split into different fastq files.

This is paired-end data.

I tried fastq-multx on the fastq files output by bcl2fastq, but most of my reads failed to be assigned to an index. It could very well be that these reads really do lack the indexes, but I would like to verify this somehow. I would appreciate any software recommendations.
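One tool-independent way to sanity-check a low assignment rate is to tally what the reads actually start with. The following is a minimal Python sketch, assuming the index is in-line at the very start of the read and is 6 bases long; the function names, the `k=6` default and the example reads are made up for illustration and are not part of fastq-multx or bcl2fastq.

```python
import gzip
from collections import Counter
from itertools import islice


def leading_kmer_counts(fastq_lines, k=6, max_reads=100_000):
    """Count the first k bases of each read in an iterable of FASTQ lines."""
    # In a 4-line FASTQ record the sequence is the 2nd line (index 1, 5, 9, ...).
    seq_lines = islice(fastq_lines, 1, None, 4)
    counts = Counter()
    for seq in islice(seq_lines, max_reads):
        counts[seq.strip()[:k].upper()] += 1
    return counts


def leading_kmer_counts_from_file(path, k=6, max_reads=100_000):
    """Same tally, reading a gzip-compressed FASTQ file."""
    with gzip.open(path, "rt") as handle:
        return leading_kmer_counts(handle, k=k, max_reads=max_reads)


# Tiny in-memory example (made-up reads) so the sketch runs without any files.
example = [
    "@read1", "ACAACCTTTTGGGG", "+", "IIIIIIIIIIIIII",
    "@read2", "ACAACCAAAACCCC", "+", "IIIIIIIIIIIIII",
    "@read3", "TTTTTTAAAACCCC", "+", "IIIIIIIIIIIIII",
]
for kmer, n in leading_kmer_counts(example).most_common(10):
    print(kmer, n)
```

On real data, something like `leading_kmer_counts_from_file("S1_R1.fastq.gz").most_common(20)` would list the most frequent read starts. If the expected indexes dominate that list, the low assignment rate is more likely a matter of tool settings than of the data.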

Thank you!

demultiplex bcl2fastq fastq index illumina • 1.4k views

updated 3.1 years ago by GenoMax 150k • written 3.1 years ago by compuTE ▴ 140

You will need to explain where this second index is. Is it a standard Illumina second index? Probably not, since in that case bcl2fastq would have handled it. Is it inside (in-line) your reads? If the latter, where is it located?

You may want to try sabre (LINK).

3.1 years ago by GenoMax 150k

Hi, thank you so much for your suggestion! The indexes are added at the 5' and 3' ends of the sequences and are reverse complements of each other. For example, an index is placed as

ACAACC[-----an RNA molecule here-----]GGTTGT

So I think ACAACC should be at the start in either R1 or R2 and GGTTGT in the other one.
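The relationship between the two flanking sequences can be confirmed with a few lines of Python: GGTTGT is the reverse complement of ACAACC. This is only a small sketch, and the helper name `reverse_complement` is chosen here for illustration.

```python
# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


print(reverse_complement("ACAACC"))  # prints GGTTGT
```

This only confirms how the two sequences relate to each other. Which of them appears at the start of R1 or R2 is best checked in the reads themselves.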

I tried sabre as you suggested and it seems to have worked: 98% of the reads matched one of my indexes. I used the following command:

sabre pe -m1 -c -f S1_R1.fastq.gz -r S1_R2.fastq.gz -b index_sheet_single.tab -u S1_R1_unknown.fastq.gz -w S1_R2_unknown.fastq.gz

And for index_sheet_single.tab I have something like this (from the example):

ACAACC  group_1_R1.fastq    group_1_R2.fastq

I think (and please correct me if I'm wrong, because I didn't fully understand how this works) that -c in sabre looks for the reverse complement of the index in the other read mate.
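Independently of how sabre treats -c internally, the assumption about the mate can be checked on the reads directly. Below is a minimal Python sketch that, for pairs whose R1 starts with a given index, tallies whether R2 starts with the same index, with its reverse complement, or with neither. It uses exact matching only (no mismatches allowed) and does not reproduce sabre's logic; the function names, category labels and example sequences are all made up for illustration.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def classify_pairs(r1_seqs, r2_seqs, barcode):
    """For read pairs, tally what R2 starts with when R1 starts with `barcode`."""
    barcode = barcode.upper()
    rc = reverse_complement(barcode)
    tally = Counter()
    for r1, r2 in zip(r1_seqs, r2_seqs):
        r1, r2 = r1.upper(), r2.upper()
        if not r1.startswith(barcode):
            tally["r1_no_barcode"] += 1
        elif r2.startswith(barcode):
            tally["r2_same_barcode"] += 1
        elif r2.startswith(rc):
            tally["r2_reverse_complement"] += 1
        else:
            tally["r2_neither"] += 1
    return tally


# Made-up sequences, one per category, so the sketch runs without any files.
r1 = ["ACAACCTTTT", "ACAACCGGGG", "ACAACCAAAA", "TTTTTTTTTT"]
r2 = ["ACAACCAAAA", "GGTTGTCCCC", "CCCCCCCCCC", "ACAACCAAAA"]
print(dict(classify_pairs(r1, r2, "ACAACC")))
```

Feeding it the sequence lines from the R1 and R2 files of one sample (in the same read order) would show which pattern dominates in the actual data.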