I spent quite a bit of effort troubleshooting this, so I am posting it for my own reference and, hopefully, for the benefit of the community.
At some point (starting with MinKNOW, I think), Nanopore sequencing began producing fast5 files compressed with ONT's custom vbz algorithm instead of gzip.
As a result, any software that reads the raw signal from these fast5 files will fail with more or less cryptic errors unless its HDF5 library can find the vbz plugin. For example, I got:
h5dump FAQ95459_3d12db00_0.fast5
h5dump error: unable to print data
Or in Python:
import h5py
fn = 'FAQ95459_3d12db00_0.fast5'
fin = h5py.File(fn, "r")
a_read = list(fin.keys())[0]
list(fin[a_read]['Raw']['Signal'])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/export/home/db291g/.local/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 664, in __iter__
yield self[i]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/export/home/db291g/.local/lib/python3.7/site-packages/h5py/_hl/dataset.py", line 710, in __getitem__
return self._fast_reader.read(args)
File "h5py/_selector.pyx", line 366, in h5py._selector.Reader.read
OSError: Can't read data (can't open directory: /usr/local/hdf5/lib/plugin)
Or using DNAscent:
DNAscent/bin/DNAscent detect --bam aln.bam --reference ref/genome.fasta --index index.dnascent --output out.detect
Loading DNAscent index... ok.
Loading DNAscent index... ok.
Importing reference... ok.
Opening bam file... ok.
Importing reference... ok.
Opening bam file... ok.
Scanning bam file...ok.
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 140573004461824:n 0sec failed: 0
#000: H5Dio.c line 173 in H5Dread(): can't read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 550 in H5D__read(): can't read data
major: Dataset
minor: Read failed
#002: H5Dchunk.c line 1872 in H5D__chunk_read(): unable to read raw data chunk
major: Low-level I/O
minor: Read failed
#003: H5Dchunk.c line 2902 in H5D__chunk_lock(): data pipeline read failed
major: Data filters
minor: Filter operation failed
#004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'vbz' is not registered
major: Data filters
minor: Read failed
#005: H5PL.c line 298 in H5PL_load(): search in paths failed
major: Plugin for dynamically loaded library
minor: Can't get value
#006: H5PL.c line 402 in H5PL__find(): can't open directory
major: Plugin for dynamically loaded library
minor: Can't open directory or file
DNAscent: src/event_handling.cpp:643: void normaliseEvents(read&, bool): Assertion `et.n > 0' failed.
Scanning bam file...ok.
HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 140141318317824:n 0sec failed: 0
#000: H5Dio.c line 173 in H5Dread(): can't read data
major: Dataset
minor: Read failed
#001: H5Dio.c line 550 in H5D__read(): can't read data
major: Dataset
minor: Read failed
#002: H5Dchunk.c line 1872 in H5D__chunk_read(): unable to read raw data chunk
major: Low-level I/O
minor: Read failed
#003: H5Dchunk.c line 2902 in H5D__chunk_lock(): data pipeline read failed
major: Data filters
minor: Filter operation failed
#004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'vbz' is not registered
major: Data filters
minor: Read failed
#005: H5PL.c line 298 in H5PL_load(): search in paths failed
major: Plugin for dynamically loaded library
minor: Can't get value
#006: H5PL.c line 402 in H5PL__find(): can't open directory
major: Plugin for dynamically loaded library
minor: Can't open directory or file
DNAscent: src/event_handling.cpp:643: void normaliseEvents(read&, bool): Assertion `et.n > 0' failed.
Solution:
You need to install Nanopore's vbz plugin to handle vbz compression. Thankfully, it's available on bioconda:
mamba install ont_vbz_hdf_plugin
NB: The first time you install it, you need to deactivate and re-activate the environment so that the variable HDF5_PLUGIN_PATH is exported. Alternatively, export the variable yourself:
export HDF5_PLUGIN_PATH="${CONDA_PREFIX}/hdf5/lib/plugin/"
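To check that the variable really points at the plugin before re-running a failing tool, something like the following sketch may help. It only looks for files whose name contains "vbz" in each directory listed in HDF5_PLUGIN_PATH. The helper name `find_vbz_plugin` and the `*vbz*` file-name pattern are my own assumptions, not something taken from the plugin's documentation.

```python
import os
from pathlib import Path


def find_vbz_plugin(env=None):
    """Return files in the HDF5_PLUGIN_PATH directories whose name contains 'vbz'.

    The '*vbz*' pattern is an assumption about how the plugin library is named.
    """
    env = os.environ if env is None else env
    hits = []
    # HDF5_PLUGIN_PATH may list several directories separated by os.pathsep.
    for directory in env.get("HDF5_PLUGIN_PATH", "").split(os.pathsep):
        if directory and Path(directory).is_dir():
            hits.extend(sorted(Path(directory).glob("*vbz*")))
    return hits


found = find_vbz_plugin()
if found:
    print("Possible vbz plugin:", ", ".join(str(p) for p in found))
else:
    print("No vbz plugin found; HDF5_PLUGIN_PATH is",
          os.environ.get("HDF5_PLUGIN_PATH", "<unset>"))
```

If nothing is found, the variable is probably unset in the current shell or points at a directory from a different environment.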
Alternatively, you can convert the files from vbz to gzip compression with ont_fast5_api, which is also available on bioconda.
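As a rough sketch of that conversion, the snippet below builds a call to the `compress_fast5` command that ships with ont_fast5_api. The directory names `fast5_vbz` and `fast5_gzip` and the helper functions are hypothetical. The options are the ones I believe `compress_fast5` accepts, so check `compress_fast5 --help` for your installed version before relying on them.

```python
import shutil
import subprocess


def build_compress_cmd(input_dir, output_dir, threads=1):
    """Build a compress_fast5 command line that rewrites fast5 files with gzip."""
    return [
        "compress_fast5",
        "--input_path", str(input_dir),
        "--save_path", str(output_dir),
        "--compression", "gzip",
        "--threads", str(threads),
    ]


def convert_to_gzip(input_dir, output_dir, threads=1):
    """Run the conversion, failing early if compress_fast5 is not on PATH."""
    if shutil.which("compress_fast5") is None:
        raise RuntimeError("compress_fast5 not found; is ont_fast5_api installed?")
    subprocess.run(build_compress_cmd(input_dir, output_dir, threads), check=True)


# Hypothetical directory names; print the command so it can be inspected first.
print(" ".join(build_compress_cmd("fast5_vbz", "fast5_gzip", threads=4)))
```

Calling `convert_to_gzip("fast5_vbz", "fast5_gzip", threads=4)` would then write gzip-compressed copies to the output directory, leaving the original files untouched.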
Maybe it was just me, but none of this was obvious to me at all!
Thank you. I spent a long time trying to work out how to run tailfindr, which also uses this VBZ plugin, but couldn't get it working. Now I hope to make it work.