
Error using dynamic compression plugin #1966

Open

planetmarshall opened this issue Sep 18, 2021 · 3 comments

planetmarshall commented Sep 18, 2021

  • Operating System: Ubuntu 20.04
  • Python version: 3.8.10
  • Where Python was acquired: system
  • h5py version: 3.3.0
  • HDF5 version: 1.12.0

Summary

An error is reported from within a custom compression plugin when I attempt to use it to compress a dataset with h5py.

Steps to reproduce:

Build the dynamic plugin and set HDF5_PLUGIN_PATH to its location. A minimal plugin is attached.
plugin_test.tar.gz
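For reference, the general shape of such a plugin is sketched below — a minimal pass-through filter, not the attached code; the filter ID 60000 matches the one used in this report, everything else is illustrative:

#include "H5PLextern.h"   /* declares the plugin entry points */
#include <stddef.h>

#define TEST_FILTER_ID 60000

/* Identity "compression": returns the chunk unchanged in both directions. */
static size_t
test_filter(unsigned int flags, size_t cd_nelmts,
            const unsigned int cd_values[], size_t nbytes,
            size_t *buf_size, void **buf)
{
    (void)flags; (void)cd_nelmts; (void)cd_values;
    (void)buf_size; (void)buf;
    return nbytes;
}

static const H5Z_class2_t test_filter_class = {
    H5Z_CLASS_T_VERS,               /* H5Z_class_t struct version */
    (H5Z_filter_t)TEST_FILTER_ID,   /* filter identifier */
    1, 1,                           /* encoder and decoder present */
    "test filter",                  /* filter name */
    NULL,                           /* can_apply callback (none) */
    NULL,                           /* set_local callback (none) */
    test_filter                     /* the filter function */
};

/* Entry points HDF5 resolves when it dlopen()s the plugin from
 * HDF5_PLUGIN_PATH. */
H5PL_type_t H5PLget_plugin_type(void) { return H5PL_TYPE_FILTER; }
const void *H5PLget_plugin_info(void) { return &test_filter_class; }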

Run the following:

import h5py
import numpy as np

# 60000 is the ID of the custom dynamically loaded filter plugin
with h5py.File("test.h5", "w") as h5:
    h5.create_dataset("test", data=np.zeros((8, 8), dtype=np.uint16), compression=60000)

Expected Results

The test HDF5 file should be created successfully.

Actual Results

The following stack trace is reported:

HDF5-DIAG: Error detected in HDF5 (1.12.0) thread 0:
  #000: ...src/H5Pocpl.c line 1027 in H5Pget_filter_by_id2(): can't find object for ID
    major: Object atom
    minor: Unable to find atom information (already closed?)
  #001: ...src/H5Pint.c line 4015 in H5P_object_verify(): property list is not a member of the class
    major: Property lists
    minor: Unable to register new atom
  #002: ...src/H5Pint.c line 3965 in H5P_isa_class(): not a property list
    major: Invalid arguments to routine
    minor: Inappropriate type
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    h5.create_dataset("test", data=np.zeros((8,8), dtype=np.uint16), compression=60000)
  File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/group.py", line 149, in create_dataset
    dsid = dataset.make_new_dset(group, shape, dtype, data, name, **kwds)
  File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/dataset.py", line 137, in make_new_dset
    dset_id = h5d.create(parent.id, name, tid, sid, dcpl=dcpl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5d.pyx", line 87, in h5py.h5d.create
ValueError: Unable to create dataset (error during user callback)

Workaround

h5repack-shared can be used to compress a dataset with the plugin successfully:

$ h5repack-shared --enable-error-stack -v -f UD=60000,0 test.h5 test_compressed.h5
No all objects to modify layout
All objects to apply filter are...
 User Defined 60000,0
Making new file ...
-----------------------------------------
 Type     Filter (Compression)     Name
-----------------------------------------
 group                       /
 dset     UD   (1.000:1)     /test

Related

I thought this might be related to HDFGroup/hdf5#1009, as the stack trace is similar, but I believe h5py is linked against the shared library. Using plain h5repack also produces the error above, whereas h5repack-shared works as expected (h5repack is statically linked against HDF5, while h5repack-shared links the shared library), so some sort of linking issue is the likely culprit.

planetmarshall (Author) commented Sep 18, 2021

Another workaround

This is fragile, so not really a solution, but it might work in a containerized environment:

  1. Build the plugin by linking it against the HDF5 shared library (libhdf5.so.200).
  2. Create a symbolic link named libhdf5.so.200 pointing to the HDF5 shared library packaged with h5py.
  3. Set LD_LIBRARY_PATH to the directory containing the symbolic link.

The Python code then works as expected.

This suggests there is some difference between the library packaged with h5py and the one I built the plugin against (I used HDF5 1.12.0, but it's possible this is caused by some variation in compiler or build environment). I might try building the plugin under the manylinux2010 environment.
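One way to check which HDF5 a binary actually resolves at run time is a small diagnostic along these lines, compiled and linked the same way as the plugin (a sketch using only the standard HDF5 C API; build details are whatever the plugin uses):

#include "hdf5.h"
#include <stdio.h>

int main(void)
{
    unsigned major, minor, release;

    /* Reports the version of the libhdf5 instance the dynamic
     * linker bound this binary to at run time. */
    if (H5get_libversion(&major, &minor, &release) < 0)
        return 1;
    printf("runtime HDF5: %u.%u.%u\n", major, minor, release);
    return 0;
}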

planetmarshall (Author) commented

So the underlying issue is the same: if the plugin calls back into the HDF5 API (several plugins do, in order to determine chunk size and type information), it must call into the same instance of the HDF5 library that h5py is using. Plugins installed system-wide that call into the API will therefore not work. (I think the documentation should be updated to reflect this; I am happy to make that change once I've finished my work on this plugin.)
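To make the failure mode concrete, here is a sketch of the kind of callback involved (illustrative, not the actual plugin code; H5Pget_filter_by_id2 is the call that fails in the stack trace above):

#include "hdf5.h"

#define TEST_FILTER_ID 60000

/* A set_local callback that calls back into HDF5 to read its own
 * filter parameters and the dataset element size. The dcpl_id and
 * type_id handles were created by the HDF5 instance h5py links
 * against; if the plugin was linked against a *different* libhdf5,
 * that copy has no record of these IDs and H5Pget_filter_by_id2()
 * fails with "can't find object for ID". */
static herr_t
test_set_local(hid_t dcpl_id, hid_t type_id, hid_t space_id)
{
    unsigned int flags;
    size_t cd_nelmts = 8;
    unsigned int cd_values[8];
    size_t elem_size;

    (void)space_id;

    if (H5Pget_filter_by_id2(dcpl_id, TEST_FILTER_ID, &flags,
                             &cd_nelmts, cd_values, 0, NULL, NULL) < 0)
        return -1;

    /* Type information is fetched the same way and fails the same way. */
    elem_size = H5Tget_size(type_id);
    if (elem_size == 0)
        return -1;

    return 0;
}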

I am going to propose a change to the HDF5 plugin mechanism that would remove the need to call back into the HDF5 API, but that could take some time as it's not my day job :)

aragilar (Member) commented

@planetmarshall Nice work debugging this. The latest h5py wheel is built with 1.12.1, so there's likely some inconsistency between 1.12.0 and 1.12.1.

I think the best way to document this (on the h5py side) is to note that if you plan on using custom plugins, you should either:
a) get them included in https://github.com/silx-kit/hdf5plugin
or
b) build h5py from source against a user-managed HDF5.

It is probably also worth adding a note to the docs that, in general, if you're using multiple projects which link to HDF5, building from source rather than using the prebuilt wheels is the best approach.
