Segmentation fault when using 'mixed' #5365
Comments
Hi @usbhub. As for the segfault, we are able to run both versions of the test with GPU on my end (with the fix for [...]). Can you test whether anything GPU-related in DALI works, and whether you can run any CUDA code in your environment? This is the simplest pipeline that does a copy to GPU:
Can you share which Docker image you use and the environment inside it, so we can do a full repro?
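For readers who want to try this kind of check themselves, here is a minimal sketch of a pipeline that copies a small tensor from CPU to GPU. It is not the maintainer's original snippet; the function name `check_dali_gpu_copy`, the pipeline name `copy_pipeline`, and the returned status dictionary are assumptions made for illustration. It uses `fn.external_source` to produce a CPU tensor and `.gpu()` to request the copy.

```python
def check_dali_gpu_copy():
    """Try a CPU-to-GPU copy through DALI and report what happened.

    Hypothetical helper: the name and the returned dict are illustrative only.
    """
    try:
        import numpy as np
        from nvidia.dali import fn, pipeline_def
    except ImportError as exc:
        # DALI (or NumPy) is not installed in this environment.
        return {"status": "not_installed", "detail": str(exc)}

    @pipeline_def(batch_size=1, num_threads=1, device_id=0)
    def copy_pipeline():
        # One small CPU tensor per batch, produced by a Python callback.
        data = fn.external_source(
            source=lambda: [np.arange(4, dtype=np.float32)]
        )
        # .gpu() schedules the host-to-device copy.
        return data.gpu()

    try:
        pipe = copy_pipeline()
        pipe.build()
        (out,) = pipe.run()
        # Bring the result back to the host so it can be inspected.
        values = out.as_cpu().as_array().tolist()
    except Exception as exc:
        return {"status": "failed", "detail": repr(exc)}
    return {"status": "ok", "detail": values}


if __name__ == "__main__":
    print(check_dali_gpu_copy())
```

A Python-level failure (no GPU, driver mismatch) shows up as `"failed"`. A segmentation fault cannot be caught this way and would still terminate the interpreter.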
Thank you for the quick response. That basic pipeline you sent works and prints
The Docker image I'm testing on is custom, though I also tried the base image it was derived from:
Hi, after digging a bit I've found the issue. I was installing the decord library inside the Docker image, and building it requires the NVIDIA Video Codec SDK (build error for reference below). I had downloaded that and put the [...]
This is unrelated to this library, but do you have any thoughts on how to build a library that requires the Video Codec SDK inside Docker without causing issues? After some googling I saw that you can install these packages:
Decord build error:
I've found that you need to explicitly give the 'video' capability to get
Refs:
I think this can be resolved, thank you for the help.
Hi @usbhub, yes, exactly that. Exposing the video capability inside Docker is the way to go. You can also see
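As a rough illustration of the 'video' capability discussed above, the sketch below checks from inside a container whether it was requested and whether the driver's video decode library can be loaded. It assumes the NVIDIA Container Toolkit convention of a comma-separated `NVIDIA_DRIVER_CAPABILITIES` environment variable, and it assumes the decode library is named `libnvcuvid.so.1`; the helper names are hypothetical.

```python
import ctypes
import os

# Assumption: when the variable is unset or empty, the toolkit falls back
# to these capabilities, which do not include "video".
DEFAULT_CAPS = {"compute", "utility"}


def driver_capabilities(env=None):
    """Parse NVIDIA_DRIVER_CAPABILITIES into a set of capability names."""
    env = os.environ if env is None else env
    raw = env.get("NVIDIA_DRIVER_CAPABILITIES", "")
    caps = {c.strip() for c in raw.split(",") if c.strip()}
    return caps or set(DEFAULT_CAPS)


def has_video_capability(env=None):
    """True if the container was started with 'video' (or 'all')."""
    caps = driver_capabilities(env)
    return "all" in caps or "video" in caps


def nvcuvid_loadable(name="libnvcuvid.so.1"):
    """True if the video decode library can be opened by the dynamic loader.

    The default library name is an assumption about the driver layout.
    """
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("video capability requested:", has_video_capability())
    print("libnvcuvid loadable:", nvcuvid_loadable())
```

One way to request the capability is to pass the variable when starting the container, for example `docker run --gpus all -e NVIDIA_DRIVER_CAPABILITIES=compute,utility,video ...`, so that the driver's own library is mounted rather than a copy placed into the image by hand.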
Version
nvidia-dali-cuda110 1.35.0
Describe the bug.
When running a DALI pipeline with device="mixed", or in eager mode with device="gpu", I get a segmentation fault. I'm running inside a Docker container, in a conda env. If I use device="cpu" the code works without issue. I also tried a different conda env with CUDA 12, with the same result.
Side note: the eager "gpu" device appears to be broken, as it checks for a ._mixed_ops property that, I guess, no longer exists:
DALI/dali/python/nvidia/dali/_utils/eager_utils.py
Line 605 in 717d704
I just removed that check and set device = "mixed" for my experiment.
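When comparing a crashing variant (such as device="mixed") with a working one (device="cpu"), it can help to run each in a child process so that a segmentation fault does not take down the parent and can be told apart from an ordinary Python exception. The following is a generic sketch, not taken from the report; `run_isolated` is a hypothetical helper, and the check on the return code assumes a POSIX system, where a process killed by a signal reports the negated signal number.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=120):
    """Run a Python snippet in a child interpreter and classify the outcome.

    -X faulthandler makes the child print a Python traceback to stderr
    when it receives a fatal signal such as SIGSEGV.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return {
        "returncode": proc.returncode,
        # POSIX convention: killed by signal N -> return code -N.
        "segfault": proc.returncode == -signal.SIGSEGV,
        "stderr": proc.stderr,
    }


if __name__ == "__main__":
    # Placeholder snippets; in practice each string would hold one
    # variant of the reproducer (for example, one per device setting).
    for label, snippet in [("ok", "print('fine')"), ("error", "raise SystemExit(3)")]:
        outcome = run_isolated(snippet)
        print(label, outcome["returncode"], outcome["segfault"])
```

The traceback captured in `stderr` shows which Python call was active at the time of the crash, which is useful to attach under the log output section of a report like this one.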
Minimum reproducible example
Relevant log output
Other/Misc.
Check for duplicates