WIP: Implementing Acceleration Structures #1967

Open
wants to merge 43 commits into
base: main

AntarticCoder
Contributor

@AntarticCoder AntarticCoder commented Jul 6, 2023

This PR provides an implementation of the VK_KHR_acceleration_structure extension, which is the gateway to ray queries and ray tracing pipelines. It is still very much a WIP and nowhere close to done. I'm opening it this early to allow for more concrete discussion of the acceleration structure implementation, and to keep people up to date on its progress.

This PR is related to:
#427
#1953 - Not directly related but may have some slight discussion on Acceleration Structures
#1956

Initial setup for acceleration structures: added the extension definitions where they are needed, along with the physical device features and properties that are required.
This commit adds a few items:

* A list of the functions that need to be implemented
* An implementation of the vkGetAccelerationStructureBuildSizesKHR function
* Fixed parameters for the create and destroy acceleration structure functions in MVKDevice
* The current functions in vulkan.mm
@AntarticCoder
Contributor Author

Acceleration structures, and ray tracing in general, do not seem to be supported before macOS 11, so Xcode 11.7 will always fail.

@cdavis5e
Collaborator

cdavis5e commented Jul 6, 2023

Acceleration structures, and ray tracing in general, do not seem to be supported before macOS 11, so Xcode 11.7 will always fail.

Then the parts of MoltenVK that deal with Acceleration Structures need to be inside MVK_XCODE_12 blocks.

@billhollings
Contributor

Acceleration structures, and ray tracing in general, do not seem to be supported before macOS 11, so Xcode 11.7 will always fail.

Then the parts of MoltenVK that deal with Acceleration Structures need to be inside MVK_XCODE_12 blocks.

Agreed. But before forcing that, we should discuss whether that makes sense at this point. Xcode 11 is now 4 years old, and at some point, we must give up support for it, from a practicality perspective (like this one). Retaining support for Xcode 11 was added a couple of years ago because some devs required it for their internal processes.

A few months ago, I reached out to the community about this exact question, and received no responses. Unless we can determine a good reason for maintaining Xcode 11, maybe now is the time to drop support for it.

@AntarticCoder
Contributor Author

AntarticCoder commented Jul 7, 2023

@billhollings macOS 11 seems to support devices from 2013 onward, so it's a matter of dropping support for pre-2013 devices. In addition, some people stay on macOS 10 for 32-bit application support and other reasons. This is just something to take into consideration.

@billhollings
Contributor

A few months ago, I reached out to the community about this exact question, and received no responses. Unless we can determine a good reason for maintaining Xcode 11, maybe now is the time to drop support for it.

I have added a ping post to that feedback request thread.

@AntarticCoder Hold off wrapping your code in any MVK_XCODE_12 guards while this PR remains a WIP. When this PR is ready to go, based on any feedback we receive to my query ping, we can decide whether we need to actually implement those guard wraps, or abandon Xcode 11.

@AntarticCoder
Contributor Author

@billhollings Alright, I'll hold off on the MVK_XCODE_12 guards. Thanks

@billhollings
Contributor

@billhollings macOS 11 seems to support devices from 2013 onward, so it's a matter of dropping support for pre-2013 devices. In addition, some people stay on macOS 10 for 32-bit application support and other reasons. This is just something to take into consideration.

The MVK_XCODE_12 guard is strictly for API compilation during MoltenVK builds (i.e., will it build with the Metal API supported by Xcode 11). Support for older OS runtimes is handled independently, through things like respondsToSelector:.
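
To illustrate the distinction, here is a rough C++ sketch of the two layers. It is not MoltenVK code: `RuntimeInfo` and `canUseAccelerationStructures` are hypothetical names, the runtime flag merely stands in for a check such as respondsToSelector:, and the macro is defaulted to 1 only so that the sketch compiles on its own.

```cpp
// Assumption for this sketch only: default the macro to 1 so the file builds standalone.
// In a real build, the macro would come from the project's environment headers.
#ifndef MVK_XCODE_12
#   define MVK_XCODE_12 1
#endif

// Hypothetical stand-in for a runtime capability query.
struct RuntimeInfo {
    bool deviceSupportsAccelerationStructures;
};

bool canUseAccelerationStructures(const RuntimeInfo& rt) {
#if MVK_XCODE_12
    // Compiled only when the SDK declares the newer API.
    // Whether the running OS and device support it is still decided at runtime.
    return rt.deviceSupportsAccelerationStructures;
#else
    // Built against an older SDK: the API does not exist at compile time,
    // so the feature is unavailable regardless of the runtime.
    (void)rt;
    return false;
#endif
}
```

The compile-time guard decides whether the code can be built at all; the runtime check decides whether it may be called on a given machine.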

@AntarticCoder
Contributor Author

AntarticCoder commented Jul 7, 2023

@billhollings Ah, yes my mistake. Also, just a thought but only about 120 people actually watch this repository, so I'm not sure how many people will see your message.

@AntarticCoder AntarticCoder force-pushed the khr-acceleration-structures branch from 898e09d to 5e5c4a7 Compare July 7, 2023 16:51
This commit adds:

* A .h and .mm file for Acceleration Structure commands
* An acceleration structure command encoder into `MVKCommandBuffer`
* An actual acceleration structure handle
* And some other items that are not complete, or need to be removed
@AntarticCoder AntarticCoder force-pushed the khr-acceleration-structures branch from 5e5c4a7 to a1b0961 Compare July 7, 2023 16:55
Fixed the missing symbol for getPoolType in MVKCmdBuildAccelerationStructure by including it in MVKCommandPool.h. I also added the build acceleration structure command to the definitions file.
Finished what was needed for MVKCmdBuildAccelerationStructure. The only two issues at the moment are the scratch buffer and the scratch buffer offset, for which a solution has been proposed. I plan to discuss this in the PR thread before trying anything out.
@AntarticCoder
Contributor Author

AntarticCoder commented Jul 10, 2023

@billhollings @cdavis5e An issue I've run into during this PR is accessing the provided scratch buffer via the provided device address. To solve this, I got a reply from @K0bin in issue #1956, which is as follows.

@AntarticCoder @rcaridade145 The contents function will just give you a CPU pointer to the data of a shared buffer. That's not useful here unless you want to copy all the data around on the CPU every time. (which would also involve a GPU sync)

What you have to do is basically maintain a map that maps BDA VAs to their original buffer objects. Keep in mind that this VA map has to be extremely fast and should minimize locking as much as possible. An example for that can be found in vkd3d-Proton: https://github.com/HansKristian-Work/vkd3d-proton/blob/master/libs/vkd3d/va_map.c

Basically, create a map from scratch that is fast and thread safe, and when you call vkGetBufferDeviceAddress, we could push the address along with buffer. I just wanted to ask whether this is a good idea, and what you would change about it.
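
For illustration, a minimal sketch of such an address map, written as standalone C++ rather than MoltenVK code. `Buffer` and `DeviceAddressMap` are hypothetical names, and a single `std::mutex` is used only for brevity; a real implementation would want something much cheaper on the lookup path, along the lines of the vkd3d-Proton map linked above.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical stand-in for a buffer object with a known GPU address range.
struct Buffer {
    uint64_t baseAddress;  // device address of the first byte
    uint64_t size;         // size in bytes
};

// Maps any device address inside a registered buffer back to that buffer.
class DeviceAddressMap {
public:
    void insert(Buffer* buf) {
        assert(buf && buf->size > 0);
        std::lock_guard<std::mutex> lock(_mutex);
        _buffers[buf->baseAddress] = buf;
    }

    void erase(const Buffer* buf) {
        std::lock_guard<std::mutex> lock(_mutex);
        _buffers.erase(buf->baseAddress);
    }

    // Returns the buffer containing addr, or nullptr if none does.
    // If offset is non-null, it receives addr's byte offset within the buffer.
    Buffer* find(uint64_t addr, uint64_t* offset = nullptr) const {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _buffers.upper_bound(addr);  // first buffer starting after addr
        if (it == _buffers.begin()) { return nullptr; }
        --it;                                  // last buffer starting at or before addr
        Buffer* buf = it->second;
        if (addr - buf->baseAddress >= buf->size) { return nullptr; }
        if (offset) { *offset = addr - buf->baseAddress; }
        return buf;
    }

private:
    mutable std::mutex _mutex;
    std::map<uint64_t, Buffer*> _buffers;  // keyed by base address
};
```

An ordered map is used here so that an address pointing into the middle of a buffer, such as a scratch address with an offset applied, still resolves to the owning buffer and its offset.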

This commit adds the copy acceleration structure command, but does not add the commands that copy memory to and from an acceleration structure. I've also added 2 files for a map that will store the device address along with the buffer. This map will also come in handy when getting the device address for the acceleration structure.
@K0bin

K0bin commented Jul 10, 2023

and when you call vkGetBufferDeviceAddress, we could push the address along with buffer

It's probably better to do that at buffer creation time and keep vkGetBufferDeviceAddress fast.

@AntarticCoder
Contributor Author

@K0bin But not every created buffer will be used via the device address. So if you pushed it on vkGetBufferDeviceAddress, you would effectively be keeping unneeded buffers out of the map.

@K0bin

K0bin commented Jul 10, 2023

Base it off of VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT.

@AntarticCoder
Contributor Author

Okay then, I'll get started on the implementation. Thanks @K0bin

A half-done implementation of MVKMap. MVKMap aims to use the same API as std::unordered_map, and I used MVKSmallVector as an example of how to write it. I hope there aren't any bugs; I'll probably run some tests outside of the repository once I'm done.
@billhollings
Contributor

billhollings commented Jul 10, 2023

Base it off of VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT.

Search for VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT in existing MoltenVK code. There is already an MVKSmallVector containing a list of these in MVKDevice::_gpuAddressableBuffers. Perhaps this could be modified to use a std::unordered_map, to be used to serve both purposes?
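
A rough sketch of how that could look, as standalone C++ rather than the actual MVKDevice code. `Buffer`, `Device` and the method names are hypothetical, the usage constant is a local stand-in so the sketch builds without the Vulkan headers, and it assumes the buffer's GPU address is already known at the point where it is registered.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Local stand-in for VK_BUFFER_USAGE_SHADER_DEVICE_ADDRESS_BIT (assumption: no Vulkan headers here).
constexpr uint32_t kShaderDeviceAddressBit = 0x00020000;

// Hypothetical, trimmed-down buffer object.
struct Buffer {
    uint64_t gpuAddress;
    uint32_t usage;
};

class Device {
public:
    // Called once when the buffer is created (or bound), not on every address query.
    void registerBuffer(Buffer* buf) {
        assert(buf);
        if (buf->usage & kShaderDeviceAddressBit) {
            _gpuAddressableBuffers[buf->gpuAddress] = buf;
        }
    }

    void unregisterBuffer(const Buffer* buf) {
        _gpuAddressableBuffers.erase(buf->gpuAddress);
    }

    // Exact-match lookup: returns the buffer whose base address is addr, or nullptr.
    Buffer* getBufferAtAddress(uint64_t addr) const {
        auto it = _gpuAddressableBuffers.find(addr);
        return it == _gpuAddressableBuffers.end() ? nullptr : it->second;
    }

    // The same container can still be iterated wherever a plain list was used before.
    size_t gpuAddressableBufferCount() const { return _gpuAddressableBuffers.size(); }

private:
    std::unordered_map<uint64_t, Buffer*> _gpuAddressableBuffers;
};
```

Note that a hash map keyed by base address only resolves exact base addresses. An address that points into the middle of a buffer would need an ordered structure or a range search instead.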

@AntarticCoder
Contributor Author

@billhollings That seems like a good idea, I'll go ahead and use that for now, and we can change it in the future if it's not getting the job done.

This commit finishes off the build acceleration structure command. This was possible because MVKDevice now uses a std::unordered_map instead of a custom map implementation.
@AntarticCoder AntarticCoder force-pushed the khr-acceleration-structures branch from 542c3f8 to 9aae084 Compare July 11, 2023 14:31
@AntarticCoder
Contributor Author

6/12 commands have been implemented, so I'm halfway there. 🎉

@zmarlon

zmarlon commented Dec 26, 2023

Is there any news regarding this PR? I now own an M3 Max MacBook and could test if this is of interest.

@AntarticCoder AntarticCoder force-pushed the khr-acceleration-structures branch 2 times, most recently from c688f5e to 3937822 Compare January 2, 2024 14:11
@kanerogers

Our game studio is interested in cross-platform ray tracing with Vulkan, wondering whether there's been any progress here.

Still not finished, just quickly saving my work on get build sizes
@K0bin

K0bin commented May 30, 2024

How do you intend to work around the fact that Metal needs a list of all bottom level acceleration structures to build the TLAS while Vulkan only needs a GPU buffer address that contains that data?

You'll probably have to maintain a list that has every single BLAS and use that when creating the Metal TLAS.
Then in vkCmdBuildAccelerationStructure you prepare some kind of hashmap on the CPU for BLAS VkDeviceAddress -> uint32_t index. Then you run a compute shader that prepares the actual MTLAccelerationStructureInstanceDescriptors by doing a hashmap lookup for each instance to get the index.
Not great, maybe you can come up with a simpler solution.
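
The lookup described above can be sketched on the CPU as follows. This is only an illustration of the address-to-index translation: the struct and function names are hypothetical, the instance records are reduced to the one field that matters here, and in the scheme described above the per-instance step would run in a compute shader, since the instance data lives in GPU memory.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, trimmed-down instance records.
struct VulkanInstance { uint64_t blasAddress; };  // refers to its BLAS by device address
struct MetalInstance  { uint32_t blasIndex; };    // refers to its BLAS by index into a list

constexpr uint32_t kInvalidIndex = UINT32_MAX;

// Builds the "BLAS device address -> index" table from the list of all known BLASes.
std::unordered_map<uint64_t, uint32_t> buildIndexMap(const std::vector<uint64_t>& blasAddresses) {
    assert(blasAddresses.size() < kInvalidIndex);
    std::unordered_map<uint64_t, uint32_t> indexByAddress;
    indexByAddress.reserve(blasAddresses.size());
    for (uint32_t i = 0; i < blasAddresses.size(); ++i) {
        indexByAddress.emplace(blasAddresses[i], i);
    }
    return indexByAddress;
}

// Translates each instance's BLAS address into an index; unknown addresses map to kInvalidIndex.
std::vector<MetalInstance> translateInstances(
        const std::vector<VulkanInstance>& instances,
        const std::unordered_map<uint64_t, uint32_t>& indexByAddress) {
    std::vector<MetalInstance> out;
    out.reserve(instances.size());
    for (const VulkanInstance& inst : instances) {
        auto it = indexByAddress.find(inst.blasAddress);
        out.push_back({ it == indexByAddress.end() ? kInvalidIndex : it->second });
    }
    return out;
}
```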

This commit is pretty small and just allows AABBs to be pushed to the acceleration structure.
MVKTraceVulkanCallStart();
MVKDevice* mvkDev = (MVKDevice*)device;
VkAccelerationStructureBuildSizesInfoKHR buildSizes = MVKAccelerationStructure::getBuildSizes(mvkDev, buildType, pBuildInfo);
pSizeInfo = &buildSizes;

pSizeInfo should be dereferenced
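
To spell out the problem, here is a small standalone C++ sketch, not the PR's code: `BuildSizes`, `computeBuildSizes` and the two wrapper functions are hypothetical, and the sizes are placeholder values. Assigning to the pointer parameter only changes the function's local copy of the pointer, so the caller's struct is never written; the result has to be stored through the pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the build sizes struct.
struct BuildSizes {
    uint64_t accelerationStructureSize;
    uint64_t buildScratchSize;
};

// Placeholder values, for illustration only.
BuildSizes computeBuildSizes() { return {1024, 256}; }

// Buggy: rebinds the local pointer copy to a local variable.
// The caller's struct is left untouched.
void getBuildSizesBuggy(BuildSizes* pSizeInfo) {
    BuildSizes sizes = computeBuildSizes();
    pSizeInfo = &sizes;
    (void)pSizeInfo;
}

// Fixed: dereference the out-parameter and copy the result into the caller's struct.
void getBuildSizesFixed(BuildSizes* pSizeInfo) {
    assert(pSizeInfo);
    *pSizeInfo = computeBuildSizes();
}
```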

@KingVulpes

Would this work with something like RTX Remix, which uses a modified dxvk to translate DirectX to Vulkan RT? It would be interesting to see something like Portal RTX converted over for Mac.

@watbulb

watbulb commented Jan 1, 2025

Would this work with something like RTX Remix, which uses a modified dxvk to translate DirectX to Vulkan RT? It would be interesting to see something like Portal RTX converted over for Mac.

Given that MoltenVK targets Apple family of devices, and RTX Remix is not.
No. Also hello 👋 😅.

I will say though, that Remix does cover some strategies such as the Remix acceleration manager to provide a generic BLAS tracker which folks here may find useful to reference while implementing similar code. Anyways, this discussion isn't really relevant to the PR, so I will refrain from discussing further.

@cdavis5e
Collaborator

cdavis5e commented Jan 1, 2025

Given that MoltenVK targets Apple family of devices, and RTX Remix is not.
No.

Never say "never"...

@watbulb

watbulb commented Jan 1, 2025

Given that MoltenVK targets Apple family of devices, and RTX Remix is not.
No.

Never say "never"...

heh, I work on the RTX remix codebase a lot, this isn't on the roadmap. But sure, maybe one day ... :)

@KingVulpes

Given that MoltenVK targets Apple family of devices, and RTX Remix is not.
No.

Never say "never"...

heh, I work on the RTX remix codebase a lot, this isn't on the roadmap. But sure, maybe one day ... :)

Well I wasn't implying the RTX Remix devs would integrate it, given RTX Remix is open sourced, I was interested if it could be ported over for personal purposes

@watbulb

watbulb commented Jan 1, 2025

Given that MoltenVK targets Apple family of devices, and RTX Remix is not.
No.

Never say "never"...

heh, I work on the RTX remix codebase a lot, this isn't on the roadmap. But sure, maybe one day ... :)

Well I wasn't implying the RTX Remix devs would integrate it, given RTX Remix is open sourced, I was interested if it could be ported over for personal purposes

And I am here to tell you that it is unlikely, considering that it would require RTX Remix to be ported to Apple. Maybe one day. Remix also only supports D3D9, an API which does not exist on Apple hardware, nor is it currently supported in any capacity through translation. I'm not following. Anyways, this is derailing the PR convo.
