Skip to content

metal : simplify kernel arguments using a struct #3229

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ggerganov opened this issue Sep 17, 2023 · 10 comments
Closed

metal : simplify kernel arguments using a struct #3229

ggerganov opened this issue Sep 17, 2023 · 10 comments
Assignees
Labels
good first issue Good for newcomers refactoring Refactoring roadmap Part of a roadmap project

Comments

@ggerganov
Copy link
Member

ggerganov commented Sep 17, 2023

Create a struct ggml_metal_locals and populate using GGML_TENSOR_LOCALS similar to what we do in ggml.c:

https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml.c#L244-L256

Refactor all kernels to accept a single struct of ggml_metal_locals in order to avoid long lists of arguments such as:

https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml-metal.m#L753-L782

https://github.com/ggerganov/llama.cpp/blob/3b4bab6a38502d9e68587c2c19f26472480ec4dd/ggml-metal.metal#L29-L61

@ggerganov ggerganov added good first issue Good for newcomers refactoring Refactoring labels Sep 17, 2023
@ZacharyDK
Copy link

Hey @ggerganov . So I've done some digging. Objective C is a headache, but necessary because Apple makes it a requirement to use metal. Unreal Engine is C++, and they use a C++ wrapper library, avoiding all objective C. This is the library by naleksiev.

https://github.com/naleksiev/mtlpp/blob/master/LICENSE

Refactoring the ggml-metal.m file and relevant files to use this library would have the benefits of cutting out objective C, simplifying the code base, and squashing any bugs related to using objective C. Also would likely fix the numerous kernel loading bugs on Macs with AMD. This change should let Mac users utilize the GPU on whatever, shouldn't make a difference between M1, M2, and AMD

The mtlpp library has been tried and tested with Unreal Engine, so it probably will do the heaving lifting without too much pain.

@ggerganov
Copy link
Member Author

✨✨ Here's an AI-assisted sketch of how you might approach this issue saved by @ggerganov using Copilot Workspace v0.17

Topic

Has the ggml_metal_locals struct been created and used to simplify kernel arguments in the metal kernels?

Before
  • No, the ggml_metal_locals struct has not been created in the code.
  • The metal kernels still use long lists of arguments as shown in ggml-metal.m and ggml-metal.metal.
  • There is no evidence in the provided code snippets of a struct being used to simplify kernel arguments for Metal kernels.
After
  • Yes, the ggml_metal_locals struct has been created and is now used in the Metal kernels.
  • The struct is defined in ggml-metal.h, simplifying kernel arguments by encapsulating them.
  • Metal kernels in ggml-metal.metal now accept a single ggml_metal_locals struct as their argument.
  • The long lists of arguments in Metal kernels are replaced with the ggml_metal_locals struct, enhancing code readability and maintainability.
Plan
  • ggml-metal.h (CHANGE)
    • Define a new struct ggml_metal_locals to encapsulate kernel arguments.
    • Include necessary headers and dependencies.
  • ggml-metal.m (CHANGE)
    • Refactor all kernel function calls to use the ggml_metal_locals struct for arguments.
    • Update the function signatures to accept ggml_metal_locals instead of individual arguments.
  • ggml-metal.metal (CHANGE)
    • Update kernel function signatures to accept a single ggml_metal_locals struct as their argument.
    • Adjust the kernel implementations to access arguments through the ggml_metal_locals struct.
Sketch of implementation

View the changes

Details

Code analyzed at b228aba

@ggerganov
Copy link
Member Author

Playing with the tech preview of "Copilot Workspaces": https://copilot-workspace.githubnext.com/ggerganov/llama.cpp/issues/3229?shareId=9c38fc11-f7d8-45b7-b1bc-81678a27a9e0

It does not like big files 😢

image

@ggerganov ggerganov self-assigned this Nov 10, 2024
@ggerganov ggerganov moved this from Todo to In Progress in ggml : roadmap Nov 10, 2024
@ggerganov ggerganov added the roadmap Part of a roadmap project label Feb 4, 2025
@BB-fat
Copy link
Contributor

BB-fat commented Mar 5, 2025

@ggerganov Is help still needed with this issue? If so, I can try.

@ggerganov
Copy link
Member Author

Yes. It's pretty straight-forward - just apply the same pattern as in #10238 for the rest of the operators.

@BB-fat
Copy link
Contributor

BB-fat commented Mar 5, 2025

@ggerganov Okay, I'm happy to help, please assign it to me.

@ggerganov
Copy link
Member Author

please assign it to me

It's best if you open a draft PR so people can track your progress. Otherwise the experience is that an assigned issue might end up dead because people who want to work on it would think that someone else is already working on it, while they aren't.

@BB-fat
Copy link
Contributor

BB-fat commented Mar 5, 2025

please assign it to me

It's best if you open a draft PR so people can track your progress. Otherwise the experience is that an assigned issue might end up dead because people who want to work on it would think that someone else is already working on it, while they aren't.

Appreciate the suggestion! I'll open a draft PR.

@BB-fat
Copy link
Contributor

BB-fat commented Mar 5, 2025

Hi @ggerganov , I've implemented the struct-based parameter optimization for the im2col kernel. If you have time, could you please review my changes? Assuming everything looks good, I'll continue optimizing the remaining kernel functions in the near future. #12194

danbev pushed a commit that referenced this issue Mar 7, 2025
* metal : refactor im2col parameters into a struct

* metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets

* metal : refactor sum_rows parameters into a struct

* metal : refactor soft_max parameters into a struct

* metal : refactor diag_mask_inf parameters into a struct

* metal : refactor ssm_conv parameters into a struct

* metal : refactor ssm_scan parameters into a struct

* metal : refactor get_rows parameters into a struct

* metal : refactor group_norm parameters into a struct

* metal : refactor conv_transpose_1d parameters into a struct

* metal : refactor upscale parameters into a struct

* metal : refactor pad parameters into a struct

* metal : refactor pad_reflect_1d parameters into a struct

* metal : refactor arange parameters into a struct

* metal : refactor timestep_embedding parameters into a struct

* metal : refactor argsort parameters into a struct

* metal : refactor leaky_relu parameters into a struct

* metal : refactor pool_2d parameters into a struct

* metal : fix trailing whitespace

---------

Co-authored-by: alexju <alexju@tencent.com>
@ggerganov
Copy link
Member Author

Resolved via #12194

@ggerganov ggerganov moved this from In Progress to Done in ggml : roadmap Mar 7, 2025
mglambda pushed a commit to mglambda/llama.cpp that referenced this issue Mar 8, 2025
…l-org#12194)

* metal : refactor im2col parameters into a struct

* metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets

* metal : refactor sum_rows parameters into a struct

* metal : refactor soft_max parameters into a struct

* metal : refactor diag_mask_inf parameters into a struct

* metal : refactor ssm_conv parameters into a struct

* metal : refactor ssm_scan parameters into a struct

* metal : refactor get_rows parameters into a struct

* metal : refactor group_norm parameters into a struct

* metal : refactor conv_transpose_1d parameters into a struct

* metal : refactor upscale parameters into a struct

* metal : refactor pad parameters into a struct

* metal : refactor pad_reflect_1d parameters into a struct

* metal : refactor arange parameters into a struct

* metal : refactor timestep_embedding parameters into a struct

* metal : refactor argsort parameters into a struct

* metal : refactor leaky_relu parameters into a struct

* metal : refactor pool_2d parameters into a struct

* metal : fix trailing whitespace

---------

Co-authored-by: alexju <alexju@tencent.com>
arthw pushed a commit to arthw/llama.cpp that referenced this issue Mar 19, 2025
…l-org#12194)

* metal : refactor im2col parameters into a struct

* metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets

* metal : refactor sum_rows parameters into a struct

* metal : refactor soft_max parameters into a struct

* metal : refactor diag_mask_inf parameters into a struct

* metal : refactor ssm_conv parameters into a struct

* metal : refactor ssm_scan parameters into a struct

* metal : refactor get_rows parameters into a struct

* metal : refactor group_norm parameters into a struct

* metal : refactor conv_transpose_1d parameters into a struct

* metal : refactor upscale parameters into a struct

* metal : refactor pad parameters into a struct

* metal : refactor pad_reflect_1d parameters into a struct

* metal : refactor arange parameters into a struct

* metal : refactor timestep_embedding parameters into a struct

* metal : refactor argsort parameters into a struct

* metal : refactor leaky_relu parameters into a struct

* metal : refactor pool_2d parameters into a struct

* metal : fix trailing whitespace

---------

Co-authored-by: alexju <alexju@tencent.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Good for newcomers refactoring Refactoring roadmap Part of a roadmap project
Projects
None yet
Development

No branches or pull requests

3 participants