Read subgroup of HDF5 file with (0,0) group name?
4.4 years ago
star ▴ 350

I have an HDF5 file and would like to read one of the datasets stored in it.

The file 'GSE77565_FBD_IC-heatmap-chr-10k.hdf5.gz' was downloaded from GEO, accession GSE77565.

I opened the file as follows:

library(rhdf5)

h5f = H5Fopen("file.hdf5")
h5f 
HDF5 FILE 
        name /
    filename 

        name       otype dclass           dim
0  (0, 0)    H5I_DATASET FLOAT  24926 x 24926
1  (1, 1)    H5I_DATASET FLOAT  24320 x 24320
2  (10, 10)  H5I_DATASET FLOAT  13501 x 13501

But when I try to read one of the datasets by name, I get an error:

h5f$"(1, 1)"
Error: Unable to read dataset.
Not all required filters available.
Missing filters: lzf

I also could not open any of the datasets directly:

h5f_1 = H5Fopen("file.hdf5","(1, 1)")
Error: Unable to read dataset.
Not all required filters available.
 Missing filters: lzf
In addition: Warning message:
In h5checktypeOrOpenLoc(file, readonly = TRUE, native = native) :
  An open HDF5 file handle exists. If the file has changed on disk meanwhile, the function may not work properly. Run 'h5closeAll()' to close all open HDF5 object handles.

h5f&"(1, 1)"
HDF5 DATASET 
    name /(1, 1)
filename 
    type H5T_IEEE_F64LE
    rank 2
    size 24320 x 24320
 maxsize 24320 x 24320
3.8 years ago
Mike Smith ★ 2.1k

Better late than never: rhdf5 can now read HDF5 files whose datasets have been compressed with the LZF filter. I'd also recommend using the h5read() function, although the approaches used in the question should also work fine.
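To check whether an installed rhdf5 is recent enough, something like the following may help. It is only a rough sketch: the 2.35.2 threshold is an assumption taken from the version reported as working in this answer, not a documented minimum, and `min_version` and `installed` are illustrative names.

```r
# Sketch: compare the installed rhdf5 with the version reported working in this answer.
# ASSUMPTION: 2.35.2 is used as the threshold only because it is the version shown
# working here; the exact release that added LZF support is not stated in this thread.
min_version <- "2.35.2"
installed <- packageVersion("rhdf5")

if (installed < min_version) {
  message("rhdf5 ", as.character(installed),
          " may be unable to read LZF-compressed datasets; ",
          "consider updating with BiocManager::install(\"rhdf5\")")
} else {
  message("rhdf5 ", as.character(installed),
          " is at least the version reported working here")
}
```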

With rhdf5 version 2.34.0, we get the same error as in the question:

h5read(file = "GSE77565_FBD_IC-heatmap-chr-40k.hdf5", 
       name = "(0, 0)", 
       index = list(101:105, 101:105))
# Error: Unable to read dataset.
# Not all required filters available.
# Missing filters: lzf

With the latest version of rhdf5 it now works:

h5version()
# This is Bioconductor rhdf5 2.35.2 linking to C-library HDF5 1.10.7 and rhdf5filters 1.3.3

h5read(file = "GSE77565_FBD_IC-heatmap-chr-40k.hdf5", 
       name = "(0, 0)", 
       index = list(101:105, 101:105))

#         [,1]     [,2]     [,3]     [,4]     [,5]
#[1,]   0.0000   0.0000 263.9241 222.3245 187.9788
#[2,]   0.0000   0.0000   0.0000 300.8759 198.2203
#[3,] 263.9241   0.0000   0.0000   0.0000 292.3435
#[4,] 222.3245 300.8759   0.0000   0.0000   0.0000
#[5,] 187.9788 198.2203 292.3435   0.0000   0.0000
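
The same workflow can be tried end to end on a small toy file. This is a minimal sketch, not the GEO data: `toy_file` and `toy_matrix` are hypothetical stand-ins, and the toy dataset is written with rhdf5's default settings rather than LZF, so it only illustrates the calls (closing stale handles as the warning in the question suggests, listing contents, reading a block by index).

```r
library(rhdf5)

# Release any handles left open by earlier H5Fopen() calls,
# as suggested by the warning message in the question.
h5closeAll()

# Hypothetical toy file standing in for the GEO download:
# a 20 x 20 matrix stored under a name of the same form, "(0, 0)".
toy_file <- tempfile(fileext = ".h5")
toy_matrix <- matrix(as.numeric(1:400), nrow = 20, ncol = 20)
h5createFile(toy_file)
h5write(toy_matrix, toy_file, "(0, 0)")

# List what the file contains, then read only a 5 x 5 block
# instead of loading the whole matrix into memory.
contents <- h5ls(toy_file)
block <- h5read(toy_file, name = "(0, 0)", index = list(11:15, 11:15))

# Close anything still open once finished.
h5closeAll()
```

Reading with index is worth keeping for the real file, since each matrix there is roughly 24,000 x 24,000 doubles and reading one in full needs several gigabytes of memory.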