metal : simplify kernel arguments using a struct #3229
Hey @ggerganov. So I've done some digging. Objective-C is a headache, but necessary because Apple requires it for using Metal. Unreal Engine is C++, and they use a C++ wrapper library that avoids Objective-C entirely. This is the library, by naleksiev: https://github.com/naleksiev/mtlpp/blob/master/LICENSE Refactoring ggml-metal.m and the related files to use this library would cut out Objective-C, simplify the code base, and squash any bugs related to using Objective-C. It would also likely fix the numerous kernel loading bugs on Macs with AMD GPUs. This change should let Mac users use the GPU regardless of hardware; it shouldn't make a difference between M1, M2, and AMD. The mtlpp library has been tried and tested with Unreal Engine, so it will probably do the heavy lifting without too much pain. |
Playing with the tech preview of "Copilot Workspaces": https://copilot-workspace.githubnext.com/ggerganov/llama.cpp/issues/3229?shareId=9c38fc11-f7d8-45b7-b1bc-81678a27a9e0 It does not like big files 😢 |
@ggerganov Is help still needed with this issue? If so, I can try. |
Yes. It's pretty straightforward: just apply the same pattern as in #10238 for the rest of the operators. |
@ggerganov Okay, I'm happy to help, please assign it to me. |
It's best if you open a draft PR so people can track your progress. Otherwise the experience is that an assigned issue might end up dead because people who want to work on it would think that someone else is already working on it, while they aren't. |
Appreciate the suggestion! I'll open a draft PR. |
Hi @ggerganov , I've implemented the struct-based parameter optimization for the |
* metal : refactor im2col parameters into a struct
* metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets
* metal : refactor sum_rows parameters into a struct
* metal : refactor soft_max parameters into a struct
* metal : refactor diag_mask_inf parameters into a struct
* metal : refactor ssm_conv parameters into a struct
* metal : refactor ssm_scan parameters into a struct
* metal : refactor get_rows parameters into a struct
* metal : refactor group_norm parameters into a struct
* metal : refactor conv_transpose_1d parameters into a struct
* metal : refactor upscale parameters into a struct
* metal : refactor pad parameters into a struct
* metal : refactor pad_reflect_1d parameters into a struct
* metal : refactor arange parameters into a struct
* metal : refactor timestep_embedding parameters into a struct
* metal : refactor argsort parameters into a struct
* metal : refactor leaky_relu parameters into a struct
* metal : refactor pool_2d parameters into a struct
* metal : fix trailing whitespace

---------

Co-authored-by: alexju <alexju@tencent.com>
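One item in the commit list above widens the im2col offsets from int32_t to uint64_t. The snippet below is only an illustrative sketch of why that matters, not the actual im2col code: the helper names `sketch_byte_offset` and `sketch_fits_in_int32` are hypothetical, and the row counts and strides in the usage note are made-up numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset of a row, computed in 64 bits. */
static uint64_t sketch_byte_offset(uint64_t row, uint64_t row_stride_bytes) {
    /* Guard against wrap-around of the 64-bit product itself. */
    assert(row_stride_bytes == 0 || row <= UINT64_MAX / row_stride_bytes);
    return row * row_stride_bytes;
}

/* Hypothetical helper: would this offset still be representable as int32_t? */
static int sketch_fits_in_int32(uint64_t value) {
    return value <= (uint64_t) INT32_MAX;
}
```

For example, 70000 rows with a 40000-byte stride give an offset of 2800000000 bytes, which is above INT32_MAX (2147483647), so `sketch_fits_in_int32` reports that a 32-bit signed field could not hold it, while a uint64_t field can.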
Resolved via #12194 |
Create a struct ggml_metal_locals and populate using GGML_TENSOR_LOCALS, similar to what we do in ggml.c: https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml.c#L244-L256

Refactor all kernels to accept a single struct of ggml_metal_locals in order to avoid long lists of arguments such as:
https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml-metal.m#L753-L782
https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml-metal.metal#L29-L61
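The proposal above could look roughly like the following host-side sketch in plain C. Everything here is an assumption made for illustration: `sketch_tensor`, `sketch_metal_locals`, `sketch_make_locals`, and `sketch_src0_row_offset` are hypothetical names, the field names only imitate the ne00/nb00 naming style, and the layout is not the one that was eventually merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DIMS 4

/* Hypothetical stand-in for a tensor: only shape and byte strides. */
struct sketch_tensor {
    int64_t  ne[SKETCH_MAX_DIMS]; /* number of elements per dimension */
    uint64_t nb[SKETCH_MAX_DIMS]; /* stride in bytes per dimension    */
};

/* Hypothetical "locals" struct: one argument instead of 16 scalars. */
struct sketch_metal_locals {
    int64_t  ne00, ne01, ne02, ne03; /* src0 shape   */
    uint64_t nb00, nb01, nb02, nb03; /* src0 strides */
    int64_t  ne0,  ne1,  ne2,  ne3;  /* dst shape    */
    uint64_t nb0,  nb1,  nb2,  nb3;  /* dst strides  */
};

/* Fill the struct once on the host, in the spirit of GGML_TENSOR_LOCALS. */
static struct sketch_metal_locals sketch_make_locals(
        const struct sketch_tensor *src0,
        const struct sketch_tensor *dst) {
    assert(src0 != NULL && dst != NULL);
    struct sketch_metal_locals l = {
        src0->ne[0], src0->ne[1], src0->ne[2], src0->ne[3],
        src0->nb[0], src0->nb[1], src0->nb[2], src0->nb[3],
        dst->ne[0],  dst->ne[1],  dst->ne[2],  dst->ne[3],
        dst->nb[0],  dst->nb[1],  dst->nb[2],  dst->nb[3],
    };
    return l;
}

/* What a kernel body would do with the struct: locate a src0 row in bytes. */
static uint64_t sketch_src0_row_offset(const struct sketch_metal_locals *l,
                                       uint64_t i01, uint64_t i02, uint64_t i03) {
    return i03 * l->nb03 + i02 * l->nb02 + i01 * l->nb01;
}
```

On the Metal side, the host would then bind the struct once, for instance with the encoder's setBytes:length:atIndex: method, and each kernel would take a single constant struct reference instead of one parameter per field. The benefit is that adding or reordering a field touches the struct definition rather than every call site and kernel signature.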