Question

Pad paired-end FASTQ files for HISAT2 error 6 (ABRT)?

0


3.6 years ago

Petur ▴ 20

Good evening, chaps. I'm analysing 12 paired-end bulk RNA-seq samples coming from the SRA and when it comes to mapping those reads to a reference genome, HISAT2 is complaining about foo_R1.fastq and foo_R2_.fastq having an unequal number of reads. To be precise, the whole error message is:

Error, fewer reads in file specified with -2 than in file specified with -1
terminate called after throwing an instance of 'int'
(ERR): hisat2-align died with signal 6 (ABRT)

I know for a fact that the pipeline is properly working up until this step, so modifying the previous steps is out of the question.

I heard somewhere that you can pad your files so that both contain an equal number of reads, but I can't find the source. How can I do this so that I can successfully align the reads to the aforementioned reference genome?

Thanks in advance.

hisat2 mapping align fastq • 1.4k views

updated 3.6 years ago by ATpoint 88k • written 3.6 years ago by Petur ▴ 20

1


I know for a fact that the pipeline is properly working up until this step, so modifying the previous steps is out of the question.

The error indicates that exactly this is not the case. Did you trim these sequences? The rookie mistake here is to use a trimmer that is not paired-end-aware, so it kicks out a read from one file but not from the other. Alternatively, the files may already have been corrupt when downloaded from NCBI, which is very unlikely but not impossible. Please describe your pipeline with the relevant code.
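A quick way to confirm that the two files have fallen out of sync is to compare their read names. The following is only a minimal sketch, assuming uncompressed FASTQ files with strict four-line records and mates that share the first whitespace-delimited token of the header (optionally followed by `/1` or `/2`). The function names `read_id`, `read_ids` and `compare_mates` are hypothetical and not part of HISAT2 or any other tool.

```python
def read_id(header):
    """Return the read name from a FASTQ header, without a /1 or /2 mate suffix."""
    parts = header[1:].split()
    name = parts[0] if parts else ""
    return name[:-2] if name.endswith(("/1", "/2")) else name


def read_ids(path):
    """Collect read names from an uncompressed FASTQ file (assumes 4-line records)."""
    with open(path, "rt") as handle:
        # Every fourth line, starting with the first, is a header line.
        return [read_id(line) for i, line in enumerate(handle)
                if i % 4 == 0 and line.strip()]


def compare_mates(r1_path, r2_path):
    """Report read counts and the names that appear in only one of the two files."""
    ids1, ids2 = read_ids(r1_path), read_ids(r2_path)
    set1, set2 = set(ids1), set(ids2)
    return {
        "reads_r1": len(ids1),
        "reads_r2": len(ids2),
        "only_in_r1": set1 - set2,
        "only_in_r2": set2 - set1,
    }
```

Calling `compare_mates("foo_R1.fastq", "foo_R2.fastq")` (file names are placeholders) on a broken pair should show differing counts and at least one orphaned read name. The sketch holds all names in memory, so it is meant for a spot check rather than for very large files.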

3.6 years ago by ATpoint 88k

0


The error indicates that exactly this is not the case ;-)

There's no need to be smug about it, especially since you don't know how noisy/badly sequenced the reads are (which they really are).

Yes, I preprocessed the reads with a paired-end-aware trimmer, and already found the answer elsewhere.

For the record, the answer was treating the reads with fastq_pair.

This topic can now be closed.

3.6 years ago by Petur ▴ 20

score 1 · Accepted Answer · 2021-12-19

1


3.6 years ago

Petur ▴ 20

For future reference, the answer was re-pairing the reads with fastq_pair.
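For readers who want to see what re-pairing amounts to, here is a minimal Python sketch of the general idea: keep only the reads whose mate exists in both files, and write them out in the same order. It is not fastq_pair's implementation. It assumes uncompressed FASTQ files with four-line records and mates that share the first token of the header (optionally followed by `/1` or `/2`), and the names `records` and `keep_common_pairs` are hypothetical.

```python
from itertools import islice


def records(path):
    """Yield (read_name, record_text) for each 4-line record of a FASTQ file."""
    with open(path, "rt") as handle:
        while True:
            chunk = list(islice(handle, 4))
            if len(chunk) < 4:
                break  # end of file, or a truncated final record
            parts = chunk[0][1:].split()
            name = parts[0] if parts else ""
            if name.endswith(("/1", "/2")):
                name = name[:-2]  # drop the mate suffix so both mates match
            # Normalise line endings so every record ends with a newline.
            yield name, "".join(line.rstrip("\n") + "\n" for line in chunk)


def keep_common_pairs(r1_in, r2_in, r1_out, r2_out):
    """Write only reads present in both inputs, in R1 order; return the pair count."""
    mates = dict(records(r2_in))  # holds all of R2 in memory
    written = 0
    with open(r1_out, "w") as out1, open(r2_out, "w") as out2:
        for name, record in records(r1_in):
            mate = mates.pop(name, None)
            if mate is not None:
                out1.write(record)
                out2.write(mate)
                written += 1
    return written
```

Unpaired reads are simply dropped here, and one whole file is loaded into memory, so for real datasets a dedicated tool such as fastq_pair is the more sensible choice.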

3.6 years ago by Petur ▴ 20