ONT direct RNA sequencing
8 months ago
Karina • 0

I have ONT direct RNA sequencing data consisting of 958 fastq.gz files per sample, and I need to merge them into a single file. When I tried to combine them with zcat PAW*.fastq.gz > fld.fastq.gz, I got an error saying that the files are not in gzip format. However, the files were basecalled by the ONT machine and have the .gz extension.

How can I resolve this and make sure the files are in the correct gzip format? I also tried to unzip them with gunzip, but it gives the same error.

DRS ONT • 411 views
8 months ago
dthorbur ★ 2.5k

You can check the validity of gzipped files using gzip -v -t file.gz.

Should look something like this:

$ gzip -v -t test_file.gz
test_file.gz:        OK
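
With 958 files per sample, the same test can be run in a loop so that only the failing files are reported. This is a minimal sketch: the helper name list_bad_gzip and the demo file names are made up for illustration, and only gzip -t is relied on.

```shell
# Hypothetical helper: print every *.fastq.gz in a directory that fails `gzip -t`.
list_bad_gzip() {
    for f in "$1"/*.fastq.gz; do
        [ -e "$f" ] || continue                 # skip if the glob matched nothing
        gzip -t "$f" 2>/dev/null || echo "$f"   # report files that fail the integrity test
    done
}

# Demo on throwaway files (names are assumptions, not real ONT output).
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\n!!!!\n' | gzip -c > "$tmpdir/a_real.fastq.gz"   # genuinely gzipped
printf '@r2\nTTGA\n+\n!!!!\n' > "$tmpdir/b_plain.fastq.gz"            # plain text, misleading extension

list_bad_gzip "$tmpdir"
```

On real data, the call would be something like list_bad_gzip . from inside the sample directory.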

If they fail, check whether they are actually uncompressed with head file.gz: if the output is human-readable, the file is not compressed. If it is not readable, the file may have been compressed with a program other than gzip.
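
A less eyeball-dependent version of the head check is to look at the first two bytes of the file: gzip files always start with the magic bytes 1f 8b. The sketch below assumes a POSIX-like shell with head -c and od available; the helper name is_gzip and the demo files are hypothetical.

```shell
# Hypothetical helper: succeed if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Demo on two throwaway files (names are made up for illustration).
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\n!!!!\n' > "$tmpdir/plain.fastq.gz"             # plain text with a .gz name
printf '@r1\nACGT\n+\n!!!!\n' | gzip -c > "$tmpdir/real.fastq.gz"    # genuinely gzipped

for f in "$tmpdir"/*.fastq.gz; do
    if is_gzip "$f"; then
        echo "$(basename "$f"): gzip"
    else
        echo "$(basename "$f"): NOT gzip"
    fi
done
```

A file that fails this check but is not readable text either was probably written by a different compressor.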

7 months ago
noodle ▴ 590

You should use cat, not zcat, to merge .gz files: concatenated gzip files are themselves a valid gzip file, which is part of the magic of .gz. If you use zcat, the data are unzipped on the way out, so fld.fastq.gz has the right extension but is not actually gzipped, and I guess that is why you get the error. To check: if you run head fld.fastq.gz, does it return readable text or gibberish?

cat PAW*.fastq.gz > fld.fastq.gz
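
After merging with cat, it may be worth confirming that the result is valid gzip and that no reads were lost. The following is a rough sketch on two tiny demo files; the PAW_demo_* names and the merged.fastq.gz output name are assumptions for illustration, and the output name is chosen so that it does not match the input glob.

```shell
# Build two small gzipped FASTQ files to stand in for the real per-sample files.
tmpdir=$(mktemp -d)
printf '@r1\nACGT\n+\n!!!!\n' | gzip -c > "$tmpdir/PAW_demo_0.fastq.gz"
printf '@r2\nTTGA\n+\n!!!!\n' | gzip -c > "$tmpdir/PAW_demo_1.fastq.gz"

# Merge by plain concatenation; the result is a multi-member gzip file.
cat "$tmpdir"/PAW_demo_*.fastq.gz > "$tmpdir/merged.fastq.gz"

# The merged file should pass the integrity test.
gzip -t "$tmpdir/merged.fastq.gz" && echo "merged file is valid gzip"

# Compare line counts before and after merging (4 lines per FASTQ record).
parts_lines=$(for f in "$tmpdir"/PAW_demo_*.fastq.gz; do gzip -dc "$f"; done | wc -l)
merged_lines=$(gzip -dc "$tmpdir/merged.fastq.gz" | wc -l)
echo "records in parts:  $((parts_lines / 4))"
echo "records in merged: $((merged_lines / 4))"
```

If the two record counts differ, one of the inputs was probably truncated or not gzipped in the first place.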