Implementing Vulkan Acceleration Structures #1956

Open
AntarticCoder opened this issue Jun 23, 2023 · 23 comments

Comments

@AntarticCoder
Contributor

AntarticCoder commented Jun 23, 2023

At the moment, MoltenVK does not support ray tracing (#427), and to support VK_KHR_ray_tracing_pipeline and VK_KHR_ray_query, we need to implement acceleration structures. PR #1954 (issue #1953) implemented VK_KHR_deferred_host_operations, which completes the dependencies of VK_KHR_acceleration_structure. The only thing left to do is to actually implement it. This issue will provide a place to discuss the design decisions for acceleration structures.

I'm also planning on trying to implement this myself.

@natevm

natevm commented Jun 23, 2023

If it's any help, here is some consolidated information.

In Vulkan, I have examples of AABB accels here, triangle geometry accels here, and instance accels here.

These code snippets cover building and rebuilding trees, refitting, and compaction for each of the types, and they include details like the alignment of the scratch space for the acceleration structure build.

I'm missing some features like storing and loading acceleration structures from memory, but those could be added if this effort could benefit from more reference material.

It appears that these types roughly translate to
MTLAccelerationStructureBoundingBoxGeometryDescriptor,
MTLAccelerationStructureTriangleGeometryDescriptor, and MTLInstanceAccelerationStructureDescriptor

https://developer.apple.com/documentation/metal/mtlaccelerationstructure

@natevm

natevm commented Jun 23, 2023

One place that might be a good starting point is populating the structures of data for acceleration structure features and properties. In my code I do that here

In VkPhysicalDeviceAccelerationStructureFeaturesKHR, we could probably just return true for the "accelerationStructure" field, and false for all the other fields.

In VkPhysicalDeviceAccelerationStructurePropertiesKHR, we'd need to somehow figure out the various limits imposed by Metal ray tracing (max geometry count, instance and primitive count, minimum accel scratch offset alignment, etc.). I'm not sure how these are queried in Metal RT tbh

@AntarticCoder
Contributor Author

AntarticCoder commented Jun 23, 2023

VkAccelerationStructureGeometryDataKHR and MTLAccelerationStructureTriangleGeometryDescriptor seem to be almost identical, with a few minor differences.

One difference I noticed is in the index type for the geometry descriptor. Vulkan allows you to simply pass in no indices, in addition to the standard uint16 and uint32; however, Metal does not seem to have a "none" option in its index type enum.

Vulkan Index Type

Metal Index Type
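A minimal sketch of how the index type difference above might be bridged. The enums below are stand-ins that mirror the real `VkIndexType` and `MTLIndexType` values so the example compiles without the Vulkan or Metal headers; `toMetalIndexType` and `MetalIndexChoice` are hypothetical names, not MoltenVK code. It assumes that a "none" index type on the Vulkan side would be expressed on the Metal side by simply not supplying an index buffer.

```cpp
#include <cstdint>

// Stand-in enums (assumption: values mirror the real headers).
enum class VkIndexTypeSketch : uint32_t {
    Uint16  = 0,           // VK_INDEX_TYPE_UINT16
    Uint32  = 1,           // VK_INDEX_TYPE_UINT32
    NoneKHR = 1000165000,  // VK_INDEX_TYPE_NONE_KHR
};

enum class MTLIndexTypeSketch : uint32_t {
    UInt16 = 0,  // MTLIndexTypeUInt16
    UInt32 = 1,  // MTLIndexTypeUInt32
};

// Hypothetical result type: whether the geometry is indexed, and with which type.
struct MetalIndexChoice {
    bool indexed;             // false means "do not set an index buffer"
    MTLIndexTypeSketch type;  // only meaningful when indexed is true
};

// Maps the Vulkan index type to a Metal choice; "none" becomes "no index buffer".
MetalIndexChoice toMetalIndexType(VkIndexTypeSketch type) {
    switch (type) {
        case VkIndexTypeSketch::Uint16: return {true, MTLIndexTypeSketch::UInt16};
        case VkIndexTypeSketch::Uint32: return {true, MTLIndexTypeSketch::UInt32};
        case VkIndexTypeSketch::NoneKHR: break;
    }
    return {false, MTLIndexTypeSketch::UInt16};  // type is ignored by callers here
}
```

A caller would check `indexed` first and only set the index buffer and index type on the Metal descriptor when it is true.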

@AntarticCoder
Contributor Author

@natevm I'm not sure if I'm looking in the wrong place, but this link to the Metal documentation seems to tell us the max count for some of these properties in standard and extended mode, iiuc.

https://developer.apple.com/documentation/metal/mtlaccelerationstructureusage/3750490-extendedlimits

@natevm

natevm commented Jun 23, 2023

> @natevm I'm not sure if I'm looking in the wrong place however this link to the metal documentation seems to tell us the max count for some of these properties in standard and extended mode, iiuc.
>
> https://developer.apple.com/documentation/metal/mtlaccelerationstructureusage/3750490-extendedlimits

Nice find. Yep, those seem like what I had in mind.

So, we know the following,

// Provided by VK_KHR_acceleration_structure
typedef struct VkPhysicalDeviceAccelerationStructureFeaturesKHR {
    VkStructureType    sType;
    void*              pNext;
    VkBool32           accelerationStructure; // true
    VkBool32           accelerationStructureCaptureReplay; // false (for now)
    VkBool32           accelerationStructureIndirectBuild; // false (for now)
    VkBool32           accelerationStructureHostCommands; // false (for now)
    VkBool32           descriptorBindingAccelerationStructureUpdateAfterBind; // false (for now)
} VkPhysicalDeviceAccelerationStructureFeaturesKHR;

// Provided by VK_KHR_acceleration_structure
typedef struct VkPhysicalDeviceAccelerationStructurePropertiesKHR {
    VkStructureType    sType;
    void*              pNext;
    uint64_t           maxGeometryCount; // "Geometries in primitive acceleration structure", (2^24 / 2^30)
    uint64_t           maxInstanceCount; // "Instances in instance acceleration structure", (2^24 / 2^30)
    uint64_t           maxPrimitiveCount; // "Primitives in primitive acceleration structure", (2^28 / 2^30)
    uint32_t           maxPerStageDescriptorAccelerationStructures; // ???
    uint32_t           maxPerStageDescriptorUpdateAfterBindAccelerationStructures; // ???
    uint32_t           maxDescriptorSetAccelerationStructures; // ???
    uint32_t           maxDescriptorSetUpdateAfterBindAccelerationStructures; // ???
    uint32_t           minAccelerationStructureScratchOffsetAlignment; // ???
} VkPhysicalDeviceAccelerationStructurePropertiesKHR;
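
As a rough sketch, the standard and extended figures noted in the struct comments above could be captured in a small helper. `AccelStructLimits` and `accelStructLimits` are hypothetical names for illustration, and the numbers are only the ones quoted in this thread, not values queried from a device.

```cpp
#include <cstdint>

// Hypothetical holder for the three limits discussed above.
struct AccelStructLimits {
    uint64_t maxGeometryCount;
    uint64_t maxInstanceCount;
    uint64_t maxPrimitiveCount;
};

// Returns the standard or extended figures as quoted in the struct comments.
// Assumption: the caller knows whether extended limits were requested.
constexpr AccelStructLimits accelStructLimits(bool extendedLimits) {
    return extendedLimits
        ? AccelStructLimits{1ull << 30, 1ull << 30, 1ull << 30}
        : AccelStructLimits{1ull << 24, 1ull << 24, 1ull << 28};
}
```

Whether the Vulkan properties should report the standard or the extended values is a design decision this sketch does not settle.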

Here there is a mention of an alignment derived from "the platform's buffer offset alignment". What I don't entirely know is how Metal handles the idea of "scratch" memory for acceleration structure builds.

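@rcaridade145

@natevm https://developer.apple.com/documentation/metal/mtlaccelerationstructuresizes/3553967-accelerationstructuresize and https://developer.apple.com/videos/play/wwdc2023/10128/?time=564 are of interest to you?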

@AntarticCoder
Contributor Author

AntarticCoder commented Jun 26, 2023

@rcaridade145 Thanks, I think MTLAccelerationStructureSizes.accelerationStructureSize could be used for the vkGetAccelerationStructureBuildSizesKHR function which provides the expected acceleration structure size.
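
A minimal sketch of that mapping, using stand-in structs so it compiles without the Metal or Vulkan headers. The field names mirror MTLAccelerationStructureSizes and the size fields of VkAccelerationStructureBuildSizesInfoKHR; `toVkBuildSizes` is a hypothetical helper, and pairing Metal's refit scratch size with Vulkan's update scratch size is an assumption rather than something confirmed in this thread.

```cpp
#include <cstdint>

// Stand-in for MTLAccelerationStructureSizes.
struct MetalSizesSketch {
    uint64_t accelerationStructureSize;
    uint64_t buildScratchBufferSize;
    uint64_t refitScratchBufferSize;
};

// Stand-in for the size fields of VkAccelerationStructureBuildSizesInfoKHR.
struct VkBuildSizesSketch {
    uint64_t accelerationStructureSize;
    uint64_t updateScratchSize;
    uint64_t buildScratchSize;
};

// Copies the Metal-reported sizes into the Vulkan-shaped struct.
// Assumption: a Metal refit corresponds to a Vulkan update.
VkBuildSizesSketch toVkBuildSizes(const MetalSizesSketch& sizes) {
    VkBuildSizesSketch out{};
    out.accelerationStructureSize = sizes.accelerationStructureSize;
    out.updateScratchSize = sizes.refitScratchBufferSize;
    out.buildScratchSize = sizes.buildScratchBufferSize;
    return out;
}
```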

@natevm

natevm commented Jun 26, 2023

> @natevm https://developer.apple.com/documentation/metal/mtlaccelerationstructuresizes/3553967-accelerationstructuresize and https://developer.apple.com/videos/play/wwdc2023/10128/?time=564 are of interest to you?

ah yeah, the "buildScratchBufferSize" in that first link was one of the things I was wondering about. Still not sure what the "minAccelerationStructureScratchOffsetAlignment" should be for that buffer, @rcaridade145 do you know what minimum offset alignment rules there might be?

@Try

Try commented Jun 26, 2023

Just a small note about Metal BLAS:
Documentation about MTL::AccelerationStructureTriangleGeometryDescriptor::setIndexBufferOffset says:

> Specify an offset that is a multiple of the index data type size and a multiple of the platform’s buffer offset alignment.

In the feature table https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf,
buffer offset alignment ranges from 4 bytes to 32 bytes (Mac2).
In Vulkan, primitiveOffset must be a multiple of the component size.
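
To make the mismatch concrete, here is a small sketch of the two constraints. `isValidMetalIndexOffset` and `alignUp` are hypothetical helpers, and the 32-byte alignment used in the usage note is just the upper figure quoted above, not a queried value.

```cpp
#include <cstdint>

// True when `offset` is a multiple of both the index size and the
// platform buffer offset alignment, as the Metal documentation requires.
bool isValidMetalIndexOffset(uint64_t offset, uint64_t indexSize, uint64_t bufferAlignment) {
    return offset % indexSize == 0 && offset % bufferAlignment == 0;
}

// Rounds `offset` up to the next multiple of `alignment`.
// Assumption: `alignment` is a nonzero power of two.
uint64_t alignUp(uint64_t offset, uint64_t alignment) {
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, an offset of 36 with 4-byte indices satisfies the Vulkan rule but fails `isValidMetalIndexOffset(36, 4, 32)`, so an implementation would presumably need some way to handle offsets that are valid in Vulkan but not in Metal.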

@natevm

natevm commented Jun 26, 2023

> Just a small note about Metal BLAS: Documentation about MTL::AccelerationStructureTriangleGeometryDescriptor::setIndexBufferOffset says:
>
> > Specify an offset that is a multiple of the index data type size and a multiple of the platform’s buffer offset alignment.
>
> In feature table https://developer.apple.com/metal/Metal-Feature-Set-Tables.pdf, buffer offset alignment ranges from 4 bytes to 32 bytes (Mac2). In Vulkan primitiveOffset must be multiple of component size.

do you know if there is a pragmatic way to query this buffer offset alignment?

@Try

Try commented Jun 26, 2023

> do you know if there is a pragmatic way to query this buffer offset alignment?

Oh, I wish I knew, but there doesn't seem to be any.

@rcaridade145

> > @natevm https://developer.apple.com/documentation/metal/mtlaccelerationstructuresizes/3553967-accelerationstructuresize and https://developer.apple.com/videos/play/wwdc2023/10128/?time=564 are of interest to you?
>
> ah yeah, the "buildScratchBufferSize" in that first link was one of the things I was wondering about. Still not sure what the "minAccelerationStructureScratchOffsetAlignment" should be for that buffer, @rcaridade145 do you know what minimum offset alignment rules there might be?

Not really.
All the info I could find was:

https://github.com/MetalKit/metal/blob/master/raytracing/Renderer.swift

It seems to use alignedUniformsSize.

https://gist.github.com/ctreffs/1cf72cd0d5e23d77fe55a011ea01a153

@AntarticCoder
Contributor Author

Is it possible to get a scratch buffer from its device address? Looking at the Metal API documentation, there's basically nothing on device addresses, except for a single property on MTLBuffer. I know NVIDIA used to pass in a VkBuffer directly, but now we have to use device addresses.

@rcaridade145

Will this help @AntarticCoder https://developer.apple.com/documentation/metal/mtlbuffer/1515716-contents ?

@AntarticCoder
Contributor Author

I believe I saw this during my research, but I probably didn't read the docs properly. I'll try it out later. Thanks @rcaridade145

@rcaridade145

The problem here is that, afaik, the scratch buffer is handled by Metal itself, so perhaps you can only use the contents function with a custom buffer?

@K0bin

K0bin commented Jul 8, 2023

@AntarticCoder @rcaridade145 The contents function will just give you a CPU pointer to the data of a shared buffer. That's not useful here unless you want to copy all the data around on the CPU every time. (which would also involve a GPU sync)

What you have to do is basically maintain a map that maps BDA VAs to their original buffer objects. Keep in mind that this VA map has to be extremely fast and should minimize locking as much as possible.
An example for that can be found in vkd3d-Proton:
https://github.com/HansKristian-Work/vkd3d-proton/blob/master/libs/vkd3d/va_map.c
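
As a rough illustration of the idea (not the vkd3d-proton design), the sketch below maps device address ranges back to buffer objects with an ordered map. `VaMapSketch` and its methods are hypothetical names, `void*` stands in for the buffer object, and it uses a plain mutex for simplicity, so it does not meet the "minimize locking" goal described above.

```cpp
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical address-range map: base address -> (size, buffer object).
class VaMapSketch {
public:
    // Registers the range [base, base + size) as belonging to `buffer`.
    void insert(uint64_t base, uint64_t size, void* buffer) {
        std::lock_guard<std::mutex> lock(mutex_);
        ranges_[base] = Range{size, buffer};
    }

    // Unregisters the range that starts at `base`, if any.
    void remove(uint64_t base) {
        std::lock_guard<std::mutex> lock(mutex_);
        ranges_.erase(base);
    }

    // Returns the buffer containing `address`, or nullptr if none does.
    // When found and `offsetOut` is non-null, writes the offset into the buffer.
    void* lookup(uint64_t address, uint64_t* offsetOut) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = ranges_.upper_bound(address);  // first range starting after address
        if (it == ranges_.begin()) return nullptr;
        --it;                                    // last range starting at or before address
        uint64_t offset = address - it->first;
        if (offset >= it->second.size) return nullptr;
        if (offsetOut) *offsetOut = offset;
        return it->second.buffer;
    }

private:
    struct Range {
        uint64_t size;
        void* buffer;
    };
    mutable std::mutex mutex_;
    std::map<uint64_t, Range> ranges_;
};
```

A scratch address passed to a build command could then be resolved with `lookup`, yielding the owning buffer plus the offset to pass along with it. A production version would likely replace the mutex and tree with something cheaper on the lookup path.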

@AntarticCoder
Contributor Author

@K0bin This looks quite interesting. I'll see if I can get an efficient map working later.

@billhollings
Contributor

> do you know if there is a pragmatic way to query this buffer offset alignment?

Check MVKPhysicalDeviceMetalFeatures::mtlBufferAlignment.

@natevm

natevm commented Sep 14, 2023

With iPhone 15 now having native hardware ray tracing support, I am guessing M3 is soon to follow suit. @AntarticCoder what's the status on this PR? Any blocking issues we should know about?

@AntarticCoder
Contributor Author

@natevm The only real blocking issue is how acceleration structures are handled in GPU memory, because we have both copy commands and non-command copies. The solution seems to be MTLHeaps, according to a commenter on the PR. As for the status, I've been a bit busy with personal matters, but I definitely want to get back into this. I could probably continue working next week. Thanks

@natevm

natevm commented Sep 14, 2023

@AntarticCoder totally understand. I’ll check out the MTLHeaps proposal on the PR.

I don’t suppose you have a discord where we could stay in touch, do you? Over there my username’s @natemorrical. We have a little Vulkan raytracing research group there that acts a bit like a slack space. If not, no worries, but figured I’d ask just in case :)

@AntarticCoder
Contributor Author

@natevm I just sent a friend request. My username is Noble 6 the Penguin. 😀
