segfault or free(): invalid pointer when importing dgl with other libraries due to RTLD_GLOBAL #2255
Comments
bump this
Does it crash immediately when you import DGL after importing said library?
Hi @BarclayII, if I import dgl and then import the private library, it doesn't crash right away, but it crashes on any call to dgl functions or the library's functions. On the other hand, if I import the library first and then dgl, it crashes right away.
This issue was referenced in #2328, but then the line was crossed out. I didn't see an immediate reason why in the issue. If this is a longer-term item, could we introduce an env variable to dynamically change the CDLL load in specific circumstances, perhaps
As I mentioned in the crossed-out text, directly changing I wasn't able to figure out the reason yet, so I had to work around it by ensuring PyTorch/MXNet/TensorFlow C library to be loaded before
Ah, got it. In the meantime, would you take a PR that allows us to alter this setting via an env variable?
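For illustration, here is a minimal sketch of what such an env-variable switch could look like. It is not DGL's actual loading code. The variable name DGL_LOAD_GLOBAL_SYMBOLS and the helpers dlopen_mode and load_library are hypothetical, invented for this example. Only ctypes.CDLL, its mode argument, ctypes.RTLD_GLOBAL and ctypes.RTLD_LOCAL are real standard-library names. The sketch defaults to RTLD_GLOBAL so that behavior is unchanged unless the user opts out.

```python
import ctypes
import os

# Hypothetical variable name, used only for this sketch; not an existing DGL setting.
ENV_VAR = "DGL_LOAD_GLOBAL_SYMBOLS"


def dlopen_mode(environ=None):
    """Pick the ctypes load mode from the (hypothetical) env variable.

    Unset or truthy values keep RTLD_GLOBAL; "0", "false", "no" or "off"
    select RTLD_LOCAL so the library's symbols are not exported globally.
    """
    if environ is None:
        environ = os.environ
    value = environ.get(ENV_VAR, "1").strip().lower()
    if value in ("0", "false", "no", "off"):
        return ctypes.RTLD_LOCAL
    return ctypes.RTLD_GLOBAL


def load_library(path, environ=None):
    """Load the shared library at `path` with the selected mode."""
    return ctypes.CDLL(path, mode=dlopen_mode(environ))
```

With something like this in place, a user hitting the crash could set DGL_LOAD_GLOBAL_SYMBOLS=0 in the environment before importing, without patching the source. Whether RTLD_LOCAL is safe for every backend is a separate question that this sketch does not answer.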
@BarclayII Thanks! I really appreciate that, we will evaluate the nightly builds ASAP.
So far no issues as per our experience. Please reopen the issue if the problem still exists in your case.
🐛 Bug
Importing dgl after importing a C++-based library with a pybind interface leads to a segfault or free(): invalid pointer. The C++ library in question is an internal library that is not publicly available. I found some relevant issues: pytorch/pytorch#3059 on the PyTorch repo and RobotLocomotion/drake#12073. I was able to find a workaround by deleting ctypes.RTLD_GLOBAL here in the dgl source code. PyTorch and TensorFlow seem to have moved away from RTLD_GLOBAL (ref: pytorch/pytorch#28536). Just wondering if something similar can be done in dgl.
To Reproduce
Sorry, the library I'm using that causes this error is not publicly available. It uses the TBB allocator.
Steps to reproduce the behavior:
Expected behavior
import dgl without segfault or free(): invalid pointer Aborted
Environment
Installed via (conda, pip, source): conda
Additional context