Question

Demultiplex fastq files

0

2.2 years ago

Валентин • 0

Hello!

I have a problem: we did a paired-end (PE) run that was not demultiplexed automatically. I can see barcode sequences at the end of the reverse reads, so demultiplexing should be possible. Are there any tools for this? I could write my own script, but it would be very inefficient and slow (my FASTQ files are 70+ GB). Thanks in advance for any help!

fastq NGS demultiplex • 2.3k views

updated 2.2 years ago by swbarnes2 14k • written 2.2 years ago by Валентин • 0

1

Ideally, ask the sequencing provider: they do this routinely, and it should take them little to no effort. If that is not possible, please show an example of the data so that we can see where the barcodes are. Are they in the header? Is it a single "unassigned" fastq file? There have been quite a few posts here on this topic, so please also use the search function, for example the post "Demultiplexing fastq.gz files".

2.2 years ago by ATpoint 85k

0

Thank you for the reply. It seems I have found a partial solution with the FASTX-Toolkit barcode splitter, but it only handles the reverse reads, which contain the barcodes. Is there an easy way to extract, from the big forward FASTQ, the forward reads that are paired with the properly demultiplexed reverse reads?
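A read-ID filter would be one way to do this. The following is only a minimal Python sketch, not a tested pipeline. It assumes unwrapped four-line FASTQ records, that both mates share a read ID apart from an optional `/1` or `/2` suffix or a comment after whitespace, and that the read IDs of one demultiplexed sample fit in memory. The function name `extract_mates` and all file names are hypothetical.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_id(header):
    """Return the ID shared by both mates.

    Drops the leading '@', any comment after whitespace,
    and a trailing /1 or /2 (older Illumina naming).
    """
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def extract_mates(demuxed_r2_path, all_r1_path, out_r1_path):
    """Write the forward reads whose mates are in demuxed_r2_path.

    Returns the number of forward reads written.
    """
    # Collect the IDs of the reverse reads assigned to this sample
    # (every fourth line, starting with the first, is a header).
    with open_fastq(demuxed_r2_path) as r2:
        wanted = {read_id(line) for line in islice(r2, 0, None, 4) if line.strip()}

    kept = 0
    # Stream the big forward file once, keeping only matching records.
    with open_fastq(all_r1_path) as r1, open(out_r1_path, "w") as out:
        while True:
            record = list(islice(r1, 4))
            if not record or not record[0].strip():
                break
            if read_id(record[0]) in wanted:
                out.writelines(record)
                kept += 1
    return kept
```

It would be called once per sample, for example `extract_mates("sample1_R2.fastq", "all_R1.fastq.gz", "sample1_R1.fastq")`. The output follows the order of the forward file, so it stays in sync with the reverse file only if the demultiplexing step preserved the input order.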

2.2 years ago by Валентин • 0

0

What application is this for? It's possible that the people who designed the protocol also made a demultiplexer.

2.2 years ago by swbarnes2 14k

score 0 · Answer 1 · 2022-10-03

0

2.2 years ago

Istvan Albert 102k

Look at tools like fastp, cutadapt, and similar ones.

These are more performant and probably offer more features; some may provide exactly what you need.

2.2 years ago by Istvan Albert 102k

0

Are these able to demultiplex data based on internal barcodes that are part of reads?

2.2 years ago by GenoMax 147k

0

It depends on the exact use case. For example, cutadapt has demultiplexing, though I haven't needed to demultiplex data myself for quite a while now:

https://cutadapt.readthedocs.io/en/stable/guide.html#demultiplexing

fastp has something called filter by index, but I only recall that approximately. There are many other tools out there.

2.2 years ago by Istvan Albert 102k

score 0 · Answer 2 · 2022-10-03

0

2.2 years ago

GenoMax 147k

I can see barcode sequences at the end of reverse reads,

That means you are using barcodes that are an integral part of the sequence reads. If that is the case, then you need something like sabre: https://github.com/najoshi/sabre
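To illustrate what demultiplexing on in-read barcodes involves for paired files, here is a rough Python sketch. It is not sabre's implementation and will be much slower than a compiled tool on 70+ GB of data. It assumes unwrapped four-line FASTQ records, that both files list mates in the same order, and that the barcode sits at the very end (3' end) of the reverse read and must match exactly. The function name `demultiplex_pairs`, the sample names, and the output prefix are hypothetical.

```python
import gzip
from itertools import islice


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file as text."""
    return gzip.open(path, "rt") if str(path).endswith(".gz") else open(path, "rt")


def read_records(handle):
    """Yield FASTQ records as lists of four lines."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield record


def demultiplex_pairs(r1_path, r2_path, barcodes, out_prefix):
    """Split read pairs by the barcode at the 3' end of the reverse read.

    barcodes maps sample name -> barcode sequence (exact match only).
    Pairs without a known barcode go to the 'unassigned' files.
    Barcodes are NOT trimmed from the reads in this sketch.
    Returns a dict of sample name -> number of pairs written.
    """
    by_seq = {seq.upper(): name for name, seq in barcodes.items()}
    # Try longer barcodes first in case lengths differ.
    lengths = sorted({len(seq) for seq in by_seq}, reverse=True)
    handles = {}
    counts = {}
    try:
        with open_fastq(r1_path) as f1, open_fastq(r2_path) as f2:
            # Both files are streamed in lockstep, one pair at a time.
            for rec1, rec2 in zip(read_records(f1), read_records(f2)):
                seq2 = rec2[1].strip().upper()
                sample = "unassigned"
                for n in lengths:
                    hit = by_seq.get(seq2[-n:])
                    if hit is not None:
                        sample = hit
                        break
                if sample not in handles:
                    handles[sample] = (
                        open(f"{out_prefix}{sample}_R1.fastq", "w"),
                        open(f"{out_prefix}{sample}_R2.fastq", "w"),
                    )
                out1, out2 = handles[sample]
                out1.writelines(rec1)
                out2.writelines(rec2)
                counts[sample] = counts.get(sample, 0) + 1
    finally:
        for out1, out2 in handles.values():
            out1.close()
            out2.close()
    return counts
```

Because each pair is written to the same sample in the same pass, the forward and reverse outputs stay in sync without a separate ID lookup. Real data usually also needs mismatch tolerance and barcode trimming, which dedicated tools handle.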

2.2 years ago by GenoMax 147k