Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Metal Task Shader payload #4238

Merged
merged 16 commits into from
Jun 2, 2024
Merged

Conversation

Dynamitos
Copy link
Contributor

Starting to work on #3995

I started implementing some basic mesh shading features, for now something basic since this is my first time working with the codebase. Before merging this I would like to at least complete the task shader part by generating the group spawning call. However I cannot figure out how to do a function call on the mesh_grid_properties parameter that is emitted.

Copy link
Collaborator

@jkwak-work jkwak-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me.
But I will defer the approval to Yong.

@csyonghe
Copy link
Collaborator

@Dynamitos Can we do this for DispatchMesh:
In hlsl.meta.slang::DispatchMesh<P>, add a target switch case for metal:

__target_switch
{
...
case metal:
    __intrinsic_asm "_slang_mgp.set_threadgroups_per_grid(uint3($0, $1, $2)); _slang_mesh_payload = $3; return";
}

Then you just need to make sure to emit _slang_mpg and _slang_mesh_payload as additional parameters in the object shader entrypoint, you can inject those parameters during legalization before you hit emit. This will only work if DispatchMesh is called directly from the amplification shader and not in a different function.

@Dynamitos
Copy link
Contributor Author

@csyonghe Thanks for the tip, implemented it just now. For now a bit of a hack since I couldn't get the params to be called exactly _slang_mgp, and also I'm not sure if passing the payload as pointer even works, but since everything is emitted as a pointer, it looks a little strange, but it should work. Caveat of course is that the call to DispatchMesh needs to happen in the entry point, because the parameters are not passed to other functions.

@Dynamitos Dynamitos marked this pull request as ready for review May 30, 2024 07:26
// But first we need to create the parameter
const auto meshGridPropertiesType = builder.getMetalMeshGridPropertiesType();
auto mgp = builder.emitParam(meshGridPropertiesType);
builder.addNameHintDecoration(mgp, toSlice("_slang_mgp"));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you also add ExternCppDecoration here, it will be emitted as is without _0 suffix.

float4x4 modelViewProjection;
}

// CHECK-ASM: define void @taskMain
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no //CHECK: lines in the file, so test in line 1 will fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure what those check comments do, copied them from the rasterization test file.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are regular expressions used to match the result of the generated source file or buffer content.

A //CHECK line will try to match a line in the output with the regex following CHECK:, if such a match if found, it is successful, otherwise it will fail the test.

// TEST:SIMPLE(filecheck=CHECK-ASM): -target metallib
// TEST:REFLECTION(filecheck=REFLECT):-target metal -entry main_kernel -stage task

// CHECK: 0
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand these results. Can we put something more meaningful here, such as:

// CHECK: _slang_mgp.set_threadgroups_per_grid

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some checks that should work, however I can't get the tests to run on my macbook, the runner keeps ignoring them.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our test runner just try to run the metal command directly, so it needs to be made available from PATH. We should probably enhance the logic in locateMetalCompiler in slang-metal-compiler.cpp so that it can discover metal sdk on macos.

//TEST:SIMPLE(filecheck=CHECK): -target metal
//TEST:SIMPLE(filecheck=CHECK-ASM): -target metallib
//TEST:REFLECTION(filecheck=REFLECT):-target metal -entry main_kernel -stage task
// TEST:SIMPLE(filecheck=CHECK): -target metal
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be a space between // and TEST, otherwise the test won't get picked up by our test driver.

@@ -0,0 +1,103 @@
//TEST:SIMPLE(filecheck=CHECK): -entry taskMain -stage amplification -target metal
//TEST:SIMPLE(filecheck=CHECK-ASM): -entry taskMain -stage amplification -target metallib
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably cannot run this test here because we always specify metal2.3 when we run metal compiler. We need to extend the logic to detect the use of metal 3.0 and specify the std accordingly. Before then this is going to fail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I leave this PR like this for now until this is fixed and start working on the mesh stage on a different branch, or continue working on this one?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's disable this asm test by removing this line, and check this PR in.

@csyonghe csyonghe added the pr: non-breaking PRs without breaking changes label Jun 1, 2024
payloadPtrType->getValueType(),
AddressSpace::MetalObjectData
);
auto packedParam = builder.emitParam(annotatedPayloadType);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are more than one DispatchMesh call, this will insert duplicate parameters into the entry point. We should probably make sure we insert at most once for the referencing entry point?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also there could be more than one entry points in a single module. I feel like instead of inserting into the entry point, we should try to find the parent function of the call site, assert that it is a object shader entry point, and insert a param into the EntryPoint if it is t already there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

auto parentFunc = getParentFunc(call) should give you the user func, you can then builder.setInsertInto(parentFunc->getFirstBlock()->getFirstOrdinaryInst()) for the param.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we still assume that the DispatchMesh call is done directly from inside the entry point, but we make sure that the payload parameter is only generated once per entrypoint?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, let's do that for now. we can add a diagnostic if oarent func isnt an entrypoint.

@csyonghe
Copy link
Collaborator

csyonghe commented Jun 1, 2024

Another way to do it, is to do the mesh EntryPoint legalization before anything else, and just declare a global variable for mgp, and just access that global variable for the call.

Then all these global variable accesses will be legalized together with other global variables (we declare them as entry point parameter and pass them around to other functions via a global context object).

@csyonghe
Copy link
Collaborator

csyonghe commented Jun 1, 2024

The call to introduceExplicitGlobalContext in slang-emit.cpp:1115 in linkAndOptimizeIR() does the global variable to context object transformation.

@Dynamitos
Copy link
Contributor Author

How do we handle different entry points using different types for their payloads? Since the variable name is fixed through the intrinsic asm, having two entry points with different data types would maybe be a problem? I don't know the internals well enough to judge that, its just something that comes to mind.

Also, payload data is passed by reference, I don't know if how the interaction with the global context object would work, especially since the payload is located in its own special object_data address space.

The HLSL/Direct-X spec states that the DispatchMesh intrinsic can only be 'called exactly once per amplification shader, must not be called from non-uniform flow control'. Metal would support calling it multiple times, so I would still like to enable that, but perhaps by simply going up the call chain all the way to the entry point and asserting that all of them need to be amplification entry points.

@csyonghe
Copy link
Collaborator

csyonghe commented Jun 1, 2024

It maybe fine to define a global variable that is a pointer to the object data address space, and just store into that pointer at DispatchMesh call. Then at the create context pass we can create a field for the context type that is also a pointer in object data address space, and copy the pointer parameter into that field at the beginning of the entry point. The handling logic there should be exactly the same as other in / out global variables. In some sense, this is exactly like a global output variable.

@Dynamitos
Copy link
Contributor Author

I tried to implement it the way you described, but to be completely honest, this might be a little to difficult for me. I understand what you mean and what the result is supposed to be like, but the globals get optimized away and it is pretty late in my timezone, will try again tomorrow.

@csyonghe
Copy link
Collaborator

csyonghe commented Jun 1, 2024

No worries. We can check in this PR as is. I can take a look at this once I got time.

@Dynamitos
Copy link
Contributor Author

Should I update the branch or will this be merged into the main automatically?

@csyonghe csyonghe merged commit 753a524 into shader-slang:master Jun 2, 2024
17 checks passed
@csyonghe
Copy link
Collaborator

csyonghe commented Jun 2, 2024

Looks good, merged. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr: non-breaking PRs without breaking changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants