Raytracing support #427

Open
teaglin opened this issue Dec 26, 2018 · 48 comments

Comments

@teaglin

teaglin commented Dec 26, 2018

Hi, I was curious if there is any support for ray tracing. I am interested in prototyping raytracing applications on macOS with Metal 2 and then scaling them up for production on Nvidia's RTX GPUs.

@cdavis5e
Collaborator

Theoretically, it should be possible to support the VK_NV_ray_tracing extension on Mac using the Metal Performance Shaders framework, which uses a very similar technique to accelerate ray tracing. At this time, though, ray tracing isn't supported.

@billhollings
Contributor

If anyone wants to take a run at implementing this...it would be welcome.

@teaglin
Author

teaglin commented Jan 3, 2019

I might be interested, but I probably won't be free for a few weeks to a month because of other priorities.

@Lichtso
Contributor

Lichtso commented Apr 19, 2020

Why not go straight for the newest (vendor independent) VK_KHR_ray_tracing extension?
https://www.khronos.org/news/press/khronos-group-releases-vulkan-ray-tracing

@billhollings
Contributor

> Why not go straight for the newest (vendor independent) VK_KHR_ray_tracing extension?

This is an older thread. But yes...we should now aim for the KHR version.

@Lichtso
Contributor

Lichtso commented Apr 21, 2020

One of the dependencies of VK_KHR_ray_tracing is VK_KHR_buffer_device_address, which requires the host to know the physical addresses (pointers) inside the device. However, it seems that Metal does not expose that information to the user.

@oscarbg

oscarbg commented Nov 10, 2020

Came here searching for VK_KHR_buffer_device_address references, as it is a recommended extension for the current VKD3D-Proton 2.0.
Let me point out that raytracing on Metal (on iOS 14.0 and macOS 11.0) now seems more similar to VK_KHR_ray_tracing, having added support for raytracing shaders (intersection, etc.) integrated into the Metal Shading Language.
It seems this new support requires, on Mac:
* RDNA GPU or later (no Vega or earlier support)
* Intel Skylake GPU or later
and on iOS: A13 or later.
I'm recalling this from memory, so it might be wrong.

@dictoon

dictoon commented Feb 9, 2021

Any news on this?

@oscarbg

oscarbg commented Feb 10, 2021

@krupitskas so you will work on SPIRV-Cross raytracing support for Metal?
It may also be a good idea to look at the cross-platform raytracing support in https://github.com/ConfettiFX/The-Forge; they support DXR, Metal and Vulkan RT.

@oscarbg

oscarbg commented May 12, 2021

Seeing that AMD RDNA2 6xxx GPUs are now supported in the macOS 11.4 betas, it would be nice if someone could test whether the Metal Performance Shaders raytracing demos get some "huge" speedups on these GPUs:

https://developer.apple.com/documentation/metalperformanceshaders/metal_for_accelerating_ray_tracing
https://developer.apple.com/documentation/metalperformanceshaders/animating_and_denoising_a_raytraced_scene

This may indicate it's being mapped to the new RDNA2 raytracing ISA.

Anyway, WWDC 21 is near and we will hopefully get some update on raytracing on Metal on macOS.

@billhollings
Contributor

NVIDIA Vulkan sample KhronosGroup/Vulkan-Samples#274 both include use of Ray Queries

@kode54

kode54 commented Sep 24, 2021

Metal Performance Shaders Raytracing is also available on Apple Silicon, with the added benefit of being able to run in a single rendering pass.

@zmarlon

zmarlon commented Jul 6, 2022

Since buffer device address is now available in Metal 3, VK_KHR_buffer_device_address has been included in the latest version of MoltenVK. So it should now actually be possible to implement VK_KHR_ray_tracing? Is there any information in this direction?

@cdavis5e
Collaborator

cdavis5e commented Jul 6, 2022

> Since buffer device address is now available in Metal 3, VK_KHR_buffer_device_address has been included in the latest version of MoltenVK. Now you should actually be able to implement VK_KHR_ray_tracing? Is there any information in this direction?

I have plans to work on this over the next six months. (Full disclosure: I'm being paid to do this.)

@natevm

natevm commented Sep 9, 2022

@cdavis5e I'm working on a Vulkan ray tracing library which uses most all features of Vulkan RT (SBT record data, AABB trees, etc). I'd be very interested to build off your work if I can. :)

@fknfilewalker

What is the status on this?

@LeeTeng2001

Also curious about the status

@mannewalis

Any updates on the status of ray tracing support on Mac with MoltenVK?

@natevm

natevm commented Jun 7, 2023

> > Since buffer device address is now available in Metal 3, VK_KHR_buffer_device_address has been included in the latest version of MoltenVK. Now you should actually be able to implement VK_KHR_ray_tracing? Is there any information in this direction?
>
> I have plans to work on this over the next six months. (Full disclosure: I'm being paid to do this.)

Seeing as it’s been nearly a year since this comment was made, it might be worth hunting for another developer to work on this.

@AntarticCoder
Contributor

AntarticCoder commented Jun 15, 2023

To implement raytracing, we need the dependency VK_KHR_acceleration_structure first, and that extension needs VK_KHR_deferred_host_operations. If we were to implement acceleration structures, we could also implement VK_KHR_ray_query, so that could be somewhat easier.

I want to try my hand at implementing it, but I have no idea where to start. Could someone give me some pointers? Thanks

Edit: I meant VK_KHR_deferred_host_operations not VK_KHR_deferred_operations
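
As a rough illustration of the ordering described above, here is a minimal sketch that walks the dependency edges mentioned in this discussion and prints nothing, only returning an implementation order. The graph is deliberately limited to what the thread states and is not the full dependency list from the Vulkan specification; the function names are made up for this example, and the final extension name VK_KHR_ray_tracing_pipeline is used in place of the provisional VK_KHR_ray_tracing.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using DepGraph = std::map<std::string, std::vector<std::string>>;

// Dependency edges as stated in this thread (assumption: not the complete spec list).
inline DepGraph threadDependencies() {
    return {
        {"VK_KHR_ray_tracing_pipeline",   {"VK_KHR_acceleration_structure"}},
        {"VK_KHR_ray_query",              {"VK_KHR_acceleration_structure"}},
        {"VK_KHR_acceleration_structure", {"VK_KHR_deferred_host_operations",
                                           "VK_KHR_buffer_device_address"}},
    };
}

// Depth-first walk: an extension is appended only after all of its dependencies.
inline void visit(const DepGraph& graph, const std::string& ext,
                  std::set<std::string>& done, std::vector<std::string>& order) {
    if (!done.insert(ext).second) return;          // already handled
    auto it = graph.find(ext);
    if (it != graph.end())
        for (const auto& dep : it->second) visit(graph, dep, done, order);
    order.push_back(ext);
}

// Returns the extensions needed for `target`, dependencies first.
inline std::vector<std::string> implementationOrder(const std::string& target) {
    assert(!target.empty());
    const DepGraph graph = threadDependencies();
    std::set<std::string> done;
    std::vector<std::string> order;
    visit(graph, target, done, order);
    return order;
}
```

Calling `implementationOrder("VK_KHR_ray_query")` yields the two prerequisites of acceleration structures first, then VK_KHR_acceleration_structure, then VK_KHR_ray_query itself, which matches the idea of tackling ray queries before the full pipeline.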

@natevm

natevm commented Jun 15, 2023

That would be a great start. VK_KHR_ray_query would also let you test things out without thinking about shader binding table details right away. Ultimately we'd want both queries and the full pipeline for portability reasons, but it's a good idea to break it all down.

I've never actually heard of VK_KHR_deferred_operations before. I see it's mentioned as VK_KHR_deferred_operation in some Intel RT slides, but other than that it's a bit of a dead end... Is this related to VK_KHR_deferred_host_operations? I seem to get more results for that.

https://registry.khronos.org/vulkan/specs/1.3-extensions/man/html/VK_KHR_deferred_host_operations.html
I've talked to Josh Barczak from that man page before on twitter, he's nice and iirc open to DMs. You might reach out to him @JoshuaBarczak

@AntarticCoder
Contributor

AntarticCoder commented Jun 15, 2023

Thanks for the info, I'll try to contact @JoshuaBarczak.

@cdavis5e
Collaborator

@gpx1000 ^^

@devshgraphicsprogramming

> To implement raytracing, we need the dependency VK_KHR_acceleration_structure first, and that extension needs VK_KHR_deferred_host_operations. If we were to implement acceleration structures, we could also implement VK_KHR_ray_query, so that could be somewhat easier.
>
> I want to try a hand at implementing it, but I have no idea where to start, could someone give me some pointers. Thanks
>
> Edit: I meant VK_KHR_deferred_host_operations not VK_KHR_deferred_operations

You'd need to implement KHR_buffer_device_address first, at least the non-shader side of it.

@devshgraphicsprogramming

> > > Since buffer device address is now available in Metal 3, VK_KHR_buffer_device_address has been included in the latest version of MoltenVK. Now you should actually be able to implement VK_KHR_ray_tracing? Is there any information in this direction?
> >
> > I have plans to work on this over the next six months. (Full disclosure: I'm being paid to do this.)
>
> Seeing as it’s been nearly a year since this comment was made, it might be worth hunting for another developer to work on this.

I've been offering to implement this since 2021, but it seems there's no funding available.

@AntarticCoder
Contributor

@devshgraphicsprogramming

I believe VK_KHR_buffer_device_address has already been implemented, according to this line in MoltenVK/Layers/MVKExtensions.def:

MVK_EXTENSION(KHR_buffer_device_address, KHR_BUFFER_DEVICE_ADDRESS, DEVICE, 13.0, 16.0)
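
For readers unfamiliar with this style of `.def` file, here is a minimal sketch of how such an entry can be consumed with the X-macro pattern. It is an assumption about the general technique, not a copy of MoltenVK's actual code: `EXAMPLE_EXTENSION_LIST`, `ExtensionInfo`, `exampleExtensions` and `isListed` are hypothetical names, and reading the last two arguments as minimum macOS and iOS versions is also an assumption.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the .def file: one MVK_EXTENSION(...) entry per line.
// The single entry below is the one quoted in this thread.
#define EXAMPLE_EXTENSION_LIST \
    MVK_EXTENSION(KHR_buffer_device_address, KHR_BUFFER_DEVICE_ADDRESS, DEVICE, 13.0, 16.0)

struct ExtensionInfo {
    std::string name;     // e.g. "VK_KHR_buffer_device_address"
    std::string scope;    // "DEVICE" or "INSTANCE"
    double minMacOS;      // assumed meaning of the fourth argument
    double minIOS;        // assumed meaning of the fifth argument
};

// Expands every list entry into one ExtensionInfo initializer.
inline std::vector<ExtensionInfo> exampleExtensions() {
#define MVK_EXTENSION(var, EXT, type, macos, ios) \
    ExtensionInfo{"VK_" #var, #type, macos, ios},
    std::vector<ExtensionInfo> list = { EXAMPLE_EXTENSION_LIST };
#undef MVK_EXTENSION
    return list;
}

// True if the extension is listed and the OS version meets its minimum.
inline bool isListed(const std::string& name, double macOSVersion) {
    assert(macOSVersion > 0.0);
    for (const auto& e : exampleExtensions())
        if (e.name == name && macOSVersion >= e.minMacOS) return true;
    return false;
}
```

The same list can be expanded several times with different definitions of the macro (for example once for a table of names and once for feature flags), which is the usual reason for keeping the entries in a separate file.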

@natevm

natevm commented Jun 16, 2023

@AntarticCoder is VK_KHR_deferred_host_operations actually required for VK_KHR_acceleration_structure?

I noticed that if I remove VK_KHR_deferred_host_operations, my Vulkan ray tracing application still works. I'm only using the device-side acceleration structure builds though, i.e. vkCmdBuildAccelerationStructuresKHR rather than vkBuildAccelerationStructuresKHR. But that's probably the most common usage.

vkBuildAccelerationStructuresKHR does consume a VkDeferredOperationKHR for the host-side accel build, but perhaps, so long as no host commands are actually deferred, this extension isn't required for VK_KHR_acceleration_structure?

Edit: ah, nvm, I get validation errors if I remove that extension.

@AntarticCoder
Contributor

AntarticCoder commented Jun 16, 2023

@natevm You seem to be right, but would that not be going off the specification? Or are we allowed to deviate from the specification like this?

Some applications may use deferred operations as well, especially if they're building on the host, so that would also break some compatibility, I believe.

@natevm

natevm commented Jun 16, 2023

@AntarticCoder if it violates spec, then it's likely required to be supported.

The extension itself seems simple enough. It would require defining an opaque VkDeferredOperationKHR handle, which I believe is allocated by vkCreateDeferredOperationKHR. Then, if this handle is passed to a deferrable command (e.g. vkBuildAccelerationStructuresKHR), that command might return VK_OPERATION_DEFERRED_KHR rather than the usual VK_SUCCESS or similar.

From there, the command can be executed using vkDeferredOperationJoinKHR. There's a vkGetDeferredOperationResultKHR, a vkDestroyDeferredOperationKHR, and finally a vkGetDeferredOperationMaxConcurrencyKHR, which iiuc could perhaps just return the number of available CPU cores? It seems to depend on the operation being deferred...

Might be worth making a separate issue on this, then indicating that this issue depends on that one.
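
To make the lifecycle above a bit more concrete, here is a toy model of it in plain C++, without any Vulkan headers. Everything prefixed with `Toy`/`toy` is hypothetical and only mimics the shape of the real entry points named in the comments; the chunked work split and the concurrency heuristic (minimum of core count and chunk count) are assumptions made for this sketch, not what a driver or MoltenVK does.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstdint>
#include <functional>
#include <thread>
#include <utility>

// Stand-ins for VK_SUCCESS, VK_OPERATION_DEFERRED_KHR, VK_THREAD_DONE_KHR, VK_NOT_READY.
enum class ToyResult { Success, OperationDeferred, ThreadDone, NotReady };

// Toy stand-in for a VkDeferredOperationKHR: a job split into chunks.
struct ToyDeferredOperation {
    std::function<void(uint32_t)> work;        // processes one chunk
    uint32_t chunkCount = 0;
    std::atomic<uint32_t> nextChunk{0};
    std::atomic<uint32_t> finishedChunks{0};
};

// Models a deferrable command: it records the work instead of running it.
inline ToyResult toyBuildDeferred(ToyDeferredOperation& op, uint32_t chunks,
                                  std::function<void(uint32_t)> work) {
    assert(chunks > 0);
    op.work = std::move(work);
    op.chunkCount = chunks;
    op.nextChunk = 0;
    op.finishedChunks = 0;
    return ToyResult::OperationDeferred;
}

// Models vkDeferredOperationJoinKHR: the caller helps until no chunk is left to claim.
inline ToyResult toyJoin(ToyDeferredOperation& op) {
    for (;;) {
        uint32_t i = op.nextChunk.fetch_add(1);
        if (i >= op.chunkCount) break;
        op.work(i);
        op.finishedChunks.fetch_add(1);
    }
    return op.finishedChunks.load() == op.chunkCount ? ToyResult::Success
                                                     : ToyResult::ThreadDone;
}

// Models vkGetDeferredOperationResultKHR.
inline ToyResult toyGetResult(const ToyDeferredOperation& op) {
    return op.finishedChunks.load() == op.chunkCount ? ToyResult::Success
                                                     : ToyResult::NotReady;
}

// Models vkGetDeferredOperationMaxConcurrencyKHR (assumed heuristic).
inline uint32_t toyMaxConcurrency(const ToyDeferredOperation& op) {
    uint32_t cores = std::max(1u, std::thread::hardware_concurrency());
    return std::min(cores, op.chunkCount);
}

// Usage: defer a 4-chunk job, join from the calling thread, then read the result.
inline int toyExample() {
    ToyDeferredOperation op;
    std::atomic<int> sum{0};
    if (toyBuildDeferred(op, 4, [&](uint32_t i) { sum += static_cast<int>(i); })
            != ToyResult::OperationDeferred) return -1;
    if (toyGetResult(op) != ToyResult::NotReady) return -2;
    if (toyJoin(op) != ToyResult::Success) return -3;
    return toyGetResult(op) == ToyResult::Success ? sum.load() : -4;
}
```

In the real extension several threads may join the same operation; here a single call to `toyJoin` drains all chunks, which is enough to show why the result query reports "not ready" until the joined work has finished.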

@AntarticCoder
Contributor

Sure, I'll create an issue.

@rcaridade145

A related issue from SPIRV-Cross: KhronosGroup/SPIRV-Cross#2115

@Try

Try commented Jul 3, 2023

> A related issue from spirv-cross KhronosGroup/SPIRV-Cross#2115

One more issue to note: SPIRV-Cross also has no support for cullMask in the MSL backend. Also, I'm not aware of any good way to implement cullMask in the MSL ray-query workflow.

@devshgraphicsprogramming

> > A related issue from spirv-cross KhronosGroup/SPIRV-Cross#2115
>
> One more issue to note: SPIRV-Cross also has no support for cullMask in the MSL backend. Also, I'm not aware of any good way to implement cullMask in the MSL ray-query workflow.

Slap more limits into KHR_portability_subset, e.g. that the mask must always be 0xFF.
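
A small sketch of what such a limit would mean in practice, assuming the usual Vulkan rule that an instance is skipped when the bitwise AND of its 8-bit mask and the ray's cull mask is zero. The function names are hypothetical, and the 0xFF restriction is only the proposal from this comment, not an existing portability-subset property.

```cpp
#include <cstdint>

// Visibility rule: only the low 8 bits of either mask take part.
constexpr bool instanceVisible(uint32_t instanceMask, uint32_t cullMask) {
    return ((instanceMask & cullMask) & 0xFFu) != 0u;
}

// Hypothetical portability check: accept only rays whose cull mask is 0xFF.
constexpr bool cullMaskAllowed(uint32_t cullMask) {
    return (cullMask & 0xFFu) == 0xFFu;
}

// With the cull mask pinned to 0xFF, visibility depends on the instance mask alone.
static_assert(instanceVisible(0x01u, 0xFFu), "any non-zero instance mask is visible");
static_assert(!instanceVisible(0x00u, 0xFFu), "a zero instance mask is never visible");
static_assert(!instanceVisible(0x01u, 0x02u), "disjoint masks cull the instance");
```

Under that restriction an application loses per-ray filtering of instances, so shaders that rely on different cull masks per ray type would need another mechanism.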

@AntarticCoder
Contributor

Opened a PR: #1967

@oscarbg

oscarbg commented Jul 14, 2023

@AntarticCoder nice work. It seems the next step will be implementing VK_KHR_pipeline_library, before tackling the main raytracing extension?

@AntarticCoder
Contributor

Thanks @oscarbg. Not necessarily, only raytracing pipelines need that. Ray queries are probably going to be implemented first due to their simpler nature.

@FunMiles

With the M3 chips now supporting hardware accelerated Ray Tracing, this issue is becoming more pressing. Is there any current work being done on this?

@gpx1000

gpx1000 commented Dec 26, 2023

We have nothing to announce just yet.

@loopervfx

Any update? It's coming up on a year. Even just ray queries (inline raytracing) would be great for the countless auxiliary use cases that we have been prototyping, even and especially on underpowered hardware. Current limitations on the target platform make it a difficult proposition. Now the Apple M4 chips have been announced with even better raytracing performance! Exciting times. Thanks much for all your hard work, happy Halloween 🎃

@cdavis5e
Collaborator

I know about as much as you do. Honestly, at this point, I don't think the contract I was supposed to be hired for to implement this is ever going to go through. I'm not even sure if Holochip still exists, or if @gpx1000 is even still alive--last I heard from him, he had his cancer in remission, but...

@gpx1000

gpx1000 commented Oct 30, 2024

Hi, we're still here. The contract to implement this doesn't look like it's going to materialize, so the work is on hold until new funding is found. Nothing else new to report.

@natevm

natevm commented Oct 30, 2024

If I understand it right, the current limiting issue is that Metal’s BLAS acceleration structures cannot be easily referenced through VkDeviceAddresses, which are required in the TLAS primitives for both ray queries and the ray tracing pipeline.

> We have nothing to announce just yet.

@gpx1000 this doesn’t really inspire much confidence…

edit: ah, makes sense if funding is not available.

double edit: 😵‍💫 oof, I didn't know about the personal health situation. That’s definitely more important.

Is it possible for MoltenVK to officially assign a funded developer to look into at least ray query support?

@gpx1000

gpx1000 commented Oct 30, 2024

I don't have anything I can provide beyond saying that there is currently no customer and no funding for this, so there's no one I can assign from Holochip's developers. Unless that situation changes, there's not much I can say other than sorry I can't do more.

And thanks; with luck I will get to cancer cured at the 5 year mark, for now I'm still cancer free.

@natevm

natevm commented Oct 30, 2024

One thing I’ve been watching: DX12 recently announced they would be dropping DXIL in favor of SPIR-V. That will affect Apple’s Game Porting Toolkit…

That might mean more funding goes into tooling for SPIR-V on Mac.

There’s also an effort on supporting Metal through NVIDIA’s Slang compiler and their wrapper around the different host-side APIs (formerly called GFX; I believe that naming is changing though). I don’t think ray tracing is supported there just yet, but it’s something I know they’re thinking about, and they might have the resources to support it.

@gpx1000

gpx1000 commented Oct 30, 2024

I would suggest watching this effort as well: https://gitlab.freedesktop.org/mesa/mesa/-/issues/11990 It is an exciting direction to monitor.

@devshgraphicsprogramming

> If I understand it right, the current limiting issue is that Metal’s BLAS acceleration structures cannot be easily referenced through VkDeviceAddresses, which are required in the TLAS primitives for both ray queries as well as the ray tracing pipeline.
>
> > We have nothing to announce just yet.
>
> @gpx1000 this doesn’t really inspire much confidence…
>
> edit: ah, makes sense if funding is not available.
>
> double edit: 😵‍💫 oof, I didn't know about the personal health situation. That’s definitely more important.
>
> Is it possible for MoltenVK to officially assign a funded developer to look into at least ray query support?

VK_KHR_portability_subset comes to the rescue: let's make a VK_KHR_portability_subset2 and require that TLASes only be accessed from descriptor sets (you can declare special bindings for TLASes, btw; you don't need to make them from BDA).

@natevm

natevm commented Oct 30, 2024

> VK_KHR_portability_subset comes to rescue, lets make VK_KHR_portability_subset2 and require the TLASes only be accessed from descriptor sets (you can declare special bindings for TLASes btw, you don't need to make them from BDA)

I haven’t been looking too closely, but I don’t think that solves the problem.

This constraint isn’t device-side; this is a host-side problem. It isn’t about accessing the TLAS either, it’s about storing BLASes in the TLAS.

This is a problem in translating all VKRT code, where users today populate the BLAS device address in an instance primitive (either on the CPU or GPU), and those instances are then passed into the accel build calls to create the TLAS. Changing this would require a Vulkan spec change, which I don’t personally think makes much sense.

This paradigm cannot be easily translated to Metal, since Metal accels cannot be referenced by a VkDeviceAddress.

@AntarticCoder knows more than me about this.
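
To show where the BLAS address actually lives, here is a minimal sketch of an instance record with the same layout as `VkAccelerationStructureInstanceKHR`, redeclared locally so it compiles without the Vulkan headers. The struct name `InstanceRecord` and the helper `makeInstance` are hypothetical; the point is only that the last 8 bytes of each 64-byte record hold a raw 64-bit reference that the application writes itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Local mirror of the VkAccelerationStructureInstanceKHR layout.
struct InstanceRecord {
    float    transform[3][4];                          // row-major 3x4 matrix
    uint32_t instanceCustomIndex : 24;
    uint32_t mask : 8;
    uint32_t instanceShaderBindingTableRecordOffset : 24;
    uint32_t flags : 8;
    uint64_t accelerationStructureReference;           // BLAS device address
};

// The record is consumed as raw bytes, so its size and field offset are fixed.
static_assert(sizeof(InstanceRecord) == 64, "instance records are 64 bytes");
static_assert(offsetof(InstanceRecord, accelerationStructureReference) == 56,
              "the BLAS reference occupies the last 8 bytes");

// Fills one instance the way an application would before a TLAS build.
inline InstanceRecord makeInstance(uint64_t blasDeviceAddress, uint32_t customIndex) {
    assert(customIndex < (1u << 24));                  // must fit the 24-bit field
    InstanceRecord r = {};
    r.transform[0][0] = r.transform[1][1] = r.transform[2][2] = 1.0f;  // identity
    r.instanceCustomIndex = customIndex;
    r.mask = 0xFF;                                     // visible to every cull mask
    r.accelerationStructureReference = blasDeviceAddress;
    return r;
}
```

Because the application (or a compute shader) writes that 64-bit value directly into a buffer, a translation layer only sees opaque numbers at TLAS build time and has to map them back to whatever object represents the BLAS on the Metal side.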

@devshgraphicsprogramming

> > VK_KHR_portability_subset comes to rescue, lets make VK_KHR_portability_subset2 and require the TLASes only be accessed from descriptor sets (you can declare special bindings for TLASes btw, you don't need to make them from BDA)
>
> I haven’t been looking too closely, but I don’t think that solves the problem.
>
> This constraint isn’t device-side, this is a host-side problem. This isn’t about accessing TLAS either, it’s about storing BLAS in TLAS
>
> This is a problem in translating all VKRT code, where users today populate the BLAS device address in an instance primitive (either on the CPU or GPU) which are then passed into the accel build calls to create the TLAS. Changing this would require a Vulkan spec change, which I don’t personally think makes much sense.
>
> This paradigm cannot be easily translated to Metal, since Metal accels cannot be referenced by a VkDeviceAddress.
>
> @AntarticCoder knows more than me about this.

BTW, AFAIK Metal 3 has something similar to BDA.

For any host command, you could use a mapping from BDA to Metal BLAS and TLAS handles.

Then the only problem is translating BDAs to BLAS handles when doing device-side builds. This could be done with a hashmap / small translation dispatch before you proceed to build TLASes device-side.
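
A minimal host-side sketch of that mapping idea, in plain C++. All names here are hypothetical: `DeviceAddress` stands in for `VkDeviceAddress`, and `BlasHandle` is an invented index that would identify a Metal acceleration structure. The device-side translation dispatch mentioned above is not shown; this only covers the lookup table it would be built from.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

using DeviceAddress = uint64_t;   // stands in for VkDeviceAddress
using BlasHandle    = uint32_t;   // hypothetical index of a Metal acceleration structure

class BlasAddressRegistry {
public:
    // Remembers which BLAS was handed out under which address.
    void registerBlas(DeviceAddress address, BlasHandle handle) {
        assert(address != 0);     // this sketch treats 0 as an invalid address
        byAddress_[address] = handle;
    }
    void unregisterBlas(DeviceAddress address) { byAddress_.erase(address); }

    // Returns false and leaves `handle` untouched if the address is unknown.
    bool lookup(DeviceAddress address, BlasHandle& handle) const {
        auto it = byAddress_.find(address);
        if (it == byAddress_.end()) return false;
        handle = it->second;
        return true;
    }

    // Translates the references of all instances; fails as a whole on an unknown address.
    bool translate(const std::vector<DeviceAddress>& references,
                   std::vector<BlasHandle>& handles) const {
        std::vector<BlasHandle> result;
        result.reserve(references.size());
        for (DeviceAddress address : references) {
            BlasHandle handle = 0;
            if (!lookup(address, handle)) return false;
            result.push_back(handle);
        }
        handles.swap(result);
        return true;
    }

private:
    std::unordered_map<DeviceAddress, BlasHandle> byAddress_;
};

// Usage: two BLASes, three instances referencing them.
inline std::vector<BlasHandle> translateExample() {
    BlasAddressRegistry registry;
    registry.registerBlas(0x1000, 0);
    registry.registerBlas(0x2000, 1);
    std::vector<BlasHandle> handles;
    bool ok = registry.translate({0x2000, 0x1000, 0x2000}, handles);
    return ok ? handles : std::vector<BlasHandle>();
}
```

For host-side builds the instance data is visible on the CPU, so a table like this is enough. For device-side builds the same table would have to be uploaded and applied by a small compute pass before the TLAS build, which is the extra cost the comment above refers to.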
