Barcode (index) splitting
10.1 years ago
mbk0asis ▴ 700

Hello!

I have FASTQ files from a HiSeq 2500 in which every read is tagged with our own custom index in addition to the TruSeq index. The reads look roughly like this:

TruSeq adapter w/ index     NNNN    our index   NNNN    read

After splitting the reads by TruSeq index, I want to split them again by our custom indexes.

Is there a tool that can separate reads by a given index sequence at a given position?

(e.g., if the five bases starting at the 5th base are 'AGTGC', then the read belongs to index #1)
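To make the request concrete, here is a minimal Python sketch of a fixed-position split. It is an illustration only, not an existing tool: the function names, the second index `TCAGT`, and the demo reads are hypothetical, and only `AGTGC` at the 5th base comes from the example above. It requires an exact match and ignores sequencing errors.

```python
from collections import defaultdict

# Hypothetical index table: index sequence -> label.
# Only 'AGTGC' is taken from the post; 'TCAGT' is made up for the demo.
INDEXES = {"AGTGC": "index_1", "TCAGT": "index_2"}
START = 4  # 0-based offset of the 5th base


def assign_index(seq, indexes=INDEXES, start=START):
    """Return the label whose index sequence sits at `start`, else 'unassigned'."""
    for index_seq, label in indexes.items():
        if seq[start:start + len(index_seq)] == index_seq:
            return label
    return "unassigned"


def split_fastq(lines, indexes=INDEXES, start=START):
    """Group 4-line FASTQ records by the index found at a fixed position."""
    bins = defaultdict(list)
    usable = len(lines) - len(lines) % 4  # ignore a trailing partial record
    for i in range(0, usable, 4):
        record = lines[i:i + 4]
        bins[assign_index(record[1], indexes, start)].append(record)
    return dict(bins)


# Made-up demo reads: 4 leading bases, a 5-base index, then the insert.
demo = [
    "@read1", "ACGTAGTGCTTTTGGGG", "+", "IIIIIIIIIIIIIIIII",
    "@read2", "ACGTTCAGTTTTTGGGG", "+", "IIIIIIIIIIIIIIIII",
    "@read3", "ACGTCCCCCTTTTGGGG", "+", "IIIIIIIIIIIIIIIII",
]
groups = split_fastq(demo)
print({label: [rec[0] for rec in recs] for label, recs in groups.items()})
```

In practice the lines would be read from a FASTQ file and each group written to its own output file; the in-memory lists here just keep the sketch short.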

I hope my explanation is clear. Thank you!

NGS index Illumina barcode split • 6.0k views

Have you seen this answer?

Demultiplex Illumina With Barcodes On Identifier Line

Is that what you meant?

10.1 years ago

Hello!

You can try the Checkout routine from our MiGEC package, which was designed specifically for this kind of task. It can also write those "NNN.." sequences to the header of each output read, in case they are part of a UMI barcode (see this overview article for details).

P.S. In your case you should provide a barcodes.txt file containing:

S1 NNNNatgtgcATTGatgcNNNatgc
S2 NNNNatgtgcGATTatgcNNNatgc
S3 NNNNatgtgcTGATatgcNNNatgc
...

where ATTG, GATT, TGAT, ... are your barcodes, lower-case characters indicate primer sequence (those regions allow mismatches), and NNNN is the degenerate region. IUPAC ambiguity codes can also be used.
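As a rough illustration of how such a pattern can be read, here is a small Python sketch. It is not MiGEC's implementation: the function names, the anchoring of the pattern at the start of the read, and the limit of one primer mismatch are all assumptions made for the example. It only encodes the rules stated above: upper-case positions must match, lower-case positions tolerate mismatches, and the bases under N are collected as a UMI.

```python
# IUPAC nucleotide codes -> the bases each code stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(pattern, read, max_primer_mismatches=1):
    """Check whether `read` starts with `pattern`.

    Upper-case pattern positions must be compatible with the read base;
    lower-case (primer) positions may disagree up to `max_primer_mismatches`
    times. The mismatch limit is an assumption for this sketch.
    """
    if len(read) < len(pattern):
        return False
    mismatches = 0
    for p, base in zip(pattern, read.upper()):
        if base in IUPAC[p.upper()]:
            continue
        if p.isupper():
            return False  # barcode position disagrees: reject
        mismatches += 1   # primer position disagrees: tolerate
    return mismatches <= max_primer_mismatches


def umi(pattern, read):
    """Collect the read bases that sit under the N positions of the pattern."""
    return "".join(base for p, base in zip(pattern, read) if p == "N")


def assign(read, barcodes, max_primer_mismatches=1):
    """Return (sample, UMI) for the first matching pattern, else (None, '')."""
    for sample, pattern in barcodes.items():
        if matches(pattern, read, max_primer_mismatches):
            return sample, umi(pattern, read)
    return None, ""


BARCODES = {
    "S1": "NNNNatgtgcATTGatgcNNNatgc",
    "S2": "NNNNatgtgcGATTatgcNNNatgc",
    "S3": "NNNNatgtgcTGATatgcNNNatgc",
}

# Made-up read: UMI 'ACGT', primer, barcode GATT, primer, UMI 'TTT', primer, insert.
read = "ACGTATGTGCGATTATGCTTTATGCGGGGGGGG"
print(assign(read, BARCODES))  # -> ('S2', 'ACGTTTT')
```

A real tool would typically also consider base qualities and search for the pattern at more than one offset; the sketch leaves both out.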


Thank you, Mikhail!

I think this is exactly what I was looking for.
