Trimming Barcodes/Index Reads from FASTQ Read Headers before Alignment

8.5 years ago

adityabandla ▴ 30

Is it critical to trim the barcodes (or index reads) present in the read headers before aligning sequences to the NR database using a tool such as DIAMOND?

The dataset consists of dual-indexed paired-end reads generated on a HiSeq.

fastq barcodes hiseq • 2.5k views


No. You would need to convert the FASTQ sequences to FASTA format (in strict FASTA format, anything after the first space in the header is ignored).
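A minimal sketch of that conversion, assuming standard four-line FASTQ records (no wrapped sequences). The function name `fastq_to_fasta` and the example read are hypothetical, made up for illustration. Only the part of the header before the first whitespace is kept, so an index sequence stored after the space is dropped along the way:

```python
import io


def fastq_to_fasta(fastq_handle, fasta_handle):
    """Convert four-line FASTQ records to FASTA, keeping only the read ID.

    Hypothetical helper: assumes each record is exactly four lines
    (header, sequence, '+', quality).
    """
    count = 0
    while True:
        header = fastq_handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        if not header.startswith("@"):
            raise ValueError(f"Malformed FASTQ header: {header!r}")
        seq = fastq_handle.readline().rstrip("\n")
        fastq_handle.readline()  # '+' separator line, not needed
        fastq_handle.readline()  # quality line, not needed for FASTA
        # Keep only the first whitespace-delimited token of the header.
        read_id = header[1:].split()[0]
        fasta_handle.write(f">{read_id}\n{seq}\n")
        count += 1
    return count


# Made-up record in the Illumina style, with the index after the space.
example = (
    "@HWI-ST1:100:C0ABCACXX:1:1101:1234:2000 1:N:0:ATCACG+TTAGGC\n"
    "ACGTACGTAC\n"
    "+\n"
    "IIIIIIIIII\n"
)
out = io.StringIO()
n_records = fastq_to_fasta(io.StringIO(example), out)
print(out.getvalue(), end="")
```

For real data the two handles would be opened files (or `gzip.open(..., "rt")` for compressed input) instead of `io.StringIO` objects.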

The two reads of a pair should give you essentially the same result (unless you have a fusion or something unusual), so searching with only one of them should be adequate.

8.5 years ago by GenoMax 147k

Hi GenoMax,

Thank you for the reply! Much appreciated. DIAMOND seems to accept FASTQ files as input as well (https://github.com/bbuchfink/diamond).

So I am just trimming adapters and adjusting the read headers before running them through DIAMOND.
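For illustration, header adjustment of this kind could look like the sketch below. It assumes Casava 1.8+ style headers, where the part after the space has the layout `<read>:<is filtered>:<control number>:<index sequence>`; the helper name `strip_index` and the example header are hypothetical, and headers in any other layout are returned unchanged:

```python
def strip_index(header):
    """Drop the trailing index field from a Casava 1.8+ style FASTQ header.

    Hypothetical helper: expects '<read id> <read>:<filtered>:<control>:<index>'.
    Headers that do not match this layout are returned as they are
    (minus the trailing newline).
    """
    line = header.rstrip("\n")
    read_id, sep, comment = line.partition(" ")
    fields = comment.split(":")
    if sep and len(fields) == 4:
        # Keep read number, filter flag and control number; drop the index.
        return f"{read_id} {':'.join(fields[:3])}"
    return line


# Made-up dual-indexed header for demonstration.
header = "@HWI-ST1:100:C0ABCACXX:1:1101:1234:2000 1:N:0:ATCACG+TTAGGC\n"
print(strip_index(header))
```

The function would be applied to every fourth line of the FASTQ file, starting with the first, leaving the sequence, separator and quality lines untouched.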

8.5 years ago by adityabandla ▴ 30

Ah, well, in that case you may not need to worry about the tag sequence in the FASTQ header (I assume your data is already demultiplexed).

8.5 years ago by GenoMax 147k
