Question

FASTQ-dump: --split-files: Rejected 5 READS because READLEN < 1

0

2.5 years ago

beginner123 • 0

Hi everyone,

I have to perform a bulk RNA sequencing analysis (I am new to this). I want to run this: fastq-dump --split-files -X 5 SRR14933197 -Z (which is supposed to give me the first 5 spots/reads). The library layout is single-end. I do get output, but I also get the message "Rejected 5 READS because READLEN < 1", and I don't know how to interpret it. Colleagues told me that --split-files is only supposed to be used with paired-end reads; however, my professor used the same line of code on his SRR run, which was also single-end. So I don't understand why I get this rejection. Can anyone help me?

Thanks in advance!

reads single-end • 2.6k views

updated 2.5 years ago by GenoMax 151k • written 2.5 years ago by beginner123 • 0

2

Ditch the terrible SRA-toolkit: enter that SRR ID over at https://sra-explorer.info/ and get a download link for the fastq file directly.

2.5 years ago by ATpoint 87k

0

But it's actually for an assignment, and we are supposed to run it in Jupyter notebooks using that exact line of code: fastq-dump --split-files -X 5 SRR14933197 -Z. :/ Does someone know the answer to my problem? If you need more details, please ask.

2.5 years ago by beginner123 • 0

score 3 · Accepted Answer · 2022-11-09

In this case the dataset is single-end, so dumping it as such (without --split-files) does not generate any errors.

$ fastq-dump -X 5  SRR14933197
Read 5 spots for SRR14933197
Written 5 spots for SRR14933197

If you add the --split-files option, however, you get that spurious message. You can check this by adding the option -M 0 (which should keep all reads irrespective of length): the message goes away, but an additional sequence file is generated (_2, with no sequence). It is this second set of zero-length reads that triggers the message you see.

$ fastq-dump --split-files -X 5  -M 0 SRR14933197
Read 5 spots for SRR14933197
Written 5 spots for SRR14933197

$ more SRR14933197_2.fastq
@SRR14933197.1 1 length=0

+SRR14933197.1 1 length=0

@SRR14933197.2 2 length=0

+SRR14933197.2 2 length=0
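
If you want to confirm this on your own output rather than by eye, a rough sketch like the following counts how many records in a FASTQ file have an empty sequence line. It assumes plain 4-line FASTQ records (as fastq-dump writes them here); the function name count_empty_reads and the temporary demo file are made up for illustration, and in practice you would point the function at SRR14933197_2.fastq instead.

```shell
# Hypothetical demo file that mimics the empty _2 records shown above.
demo=$(mktemp)
printf '@SRR14933197.1 1 length=0\n\n+SRR14933197.1 1 length=0\n\n@SRR14933197.2 2 length=0\n\n+SRR14933197.2 2 length=0\n\n' > "$demo"

# A FASTQ record is 4 lines; line 2 of each record is the sequence.
# Print "<total records> <records with an empty sequence line>".
count_empty_reads() {
  awk 'NR % 4 == 2 { total++; if (length($0) == 0) empty++ }
       END { printf "%d %d\n", total, empty }' "$1"
}

result=$(count_empty_reads "$demo")
echo "total_reads empty_reads: $result"
rm -f "$demo"
```

If both numbers are equal, every read in that file has length 0, which matches the "Rejected 5 READS because READLEN < 1" message for the 5 spots requested with -X 5.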

The same 5 reads appear to be written to the _1.fastq file whichever way you run it, so you should be fine using that file.

This behavior may be due to the fact that this run appears to have a 0 bp second read recorded in SRA, which may be a submission error.
