Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hip sharing buffer with DX12 not working #165

Closed
Nielsbishere opened this issue Jun 16, 2023 · 6 comments
Closed

Hip sharing buffer with DX12 not working #165

Nielsbishere opened this issue Jun 16, 2023 · 6 comments

Comments

@Nielsbishere
Copy link

Hi, cpu and cuda support is working fine for me but it seems like HIP is still not working. I'm getting a black output (but the devices do show up and dlls are in correct position). I've tested that oidn2 branch from the example fork linked from a different issue, but that seems to have the same issue (a few commits back). The current commit in that fork seems to crash because it picks on LUID which is not supported in HIP, but the one before that change has the same problem I do. Is that a problem that's fixed in the latest dev version?

As a tip for how to deal with lack of LUID currently: I currently use the UUID to distinguish how many unique GPUs are present in the system. If there's only one gpu then I can simply pick the first one. If there's multiple then I can pick the one that matches the adapter desc name from dx12. If there's multiple with the same name and different uuid, then I can fallback to cpu. I've tested this on different multi gpu setups and seems to work fine; unless two of the same name are installed.

@atafra
Copy link
Collaborator

atafra commented Jun 16, 2023

Hi. From the issue title my understanding is that HIP does not work at all. Is this really the case, that HIP doesn't work under any circumstances? From the description it seems that you're using OIDN with buffer sharing and for that you get a black output. But this does not necessarily mean that HIP does not work at all with the current release. It might also mean that it's just buffer sharing that doesn't work. Could you please check without buffer sharing? By my experience buffer sharing for HIP devices kind of works on Linux (but the UUIDs mismatch so this shouldn't be used in production code!) but I also got a black output on Windows. I think buffer sharing simply doesn't work with HIP on Windows, and this is very likely a driver issue and not a bug in OIDN. HIP is not supported officially on Windows, so I'm not that surprised about this.

You're right that one could work around the lack of LUID, but I really do not recommend it. It has some flaws but most importantly it really hides the fact that something's wrong in the driver. And it is. If LUID is not reported or UUID does not match, it suggests that you really shouldn't try to match the device in some other way because there are issues in the driver. This is why I would highly recommend to avoid such workarounds and always fall back to copying through host memory in such cases. If you would do this in the application, as recommended by the documentation, with current AMD drivers the application would fall back to copying and OIDN would run correctly (but with an overhead).

The bottom line, if this is a driver issue (and I think very likely it is), then it should be fixed in the driver, and meanwhile applications should rely on the recommended fallback mechanism. What you're currently doing is useful as an experiment, as we discussed in a previous issue but there shouldn't any expectation about its correctness because this is an unsupported scenario and thus shouldn't be done in production code.

@Nielsbishere
Copy link
Author

Aha, sounds good. We're on windows and it's indeed just with buffer sharing. I'll hold off on reporting this to our AMD rep as we filled a couple more bugs since the last HIP issue.

@atafra
Copy link
Collaborator

atafra commented Jun 16, 2023

Could you please update the title to make it more specific? Just to avoid any confusion for other users.

@Nielsbishere Nielsbishere changed the title Hip not working with current release Hip sharing buffer with DX12 not working Jun 16, 2023
@atafra
Copy link
Collaborator

atafra commented Nov 1, 2024

DX12 and Vulkan buffer sharing finally works using the latest devel branch and the latest AMD drivers on Windows.

@Nielsbishere
Copy link
Author

Nicee, will test it soon

@Nielsbishere
Copy link
Author

Oh I'm sorry I didn't get back to you but I'm happy to say it seems to be working fine 🎉 if the driver version doesn't support it our backend can now correctly fallback to HIP without LUID support and if the driver does it can share it correctly. Thank you 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants