-
Notifications
You must be signed in to change notification settings - Fork 11.4k
metal : simplify kernel arguments using a struct (#3229) #12194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks OK so far. This kernel has mixed up the element indexing with the stride offsets, which is not ideal. But it can be fixed later.
int32_t ofs0; | ||
int32_t ofs1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These have to be uint64_t
since they are memory address strides. Also change the types in ggml_metal.m
on lines 3281,3282 to uint64_t
. Ideally there should be no division by 4
there - it assumes the data is float
.
You can also just leave a comment TODO here and I'll adjust these later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I have changed the type.
@ggerganov I've completed the PR and would appreciate your review when you have time. I've already run all the tests locally and they're passing successfully. |
I've test this and can confirm that it fixes #12219. |
Just fix the trailing whitespaces and we can merge. |
Finished. |
Thank you. In case you are interested in contributing further, #12199 would be nice to fix. |
I'll see what I can do. |
…l-org#12194) * metal : refactor im2col parameters into a struct * metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets * metal : refactor sum_rows parameters into a struct * metal : refactor soft_max parameters into a struct * metal : refactor diag_mask_inf parameters into a struct * metal : refactor ssm_conv parameters into a struct * metal : refactor ssm_scan parameters into a struct * metal : refactor get_rows parameters into a struct * metal : refactor group_norm parameters into a struct * metal : refactor conv_transpose_1d parameters into a struct * metal : refactor upscale parameters into a struct * metal : refactor pad parameters into a struct * metal : refactor pad_reflect_1d parameters into a struct * metal : refactor arange parameters into a struct * metal : refactor timestep_embedding parameters into a struct * metal : refactor argsort parameters into a struct * metal : refactor leaky_relu parameters into a struct * metal : refactor pool_2d parameters into a struct * metal : fix trailing whitespace --------- Co-authored-by: alexju <alexju@tencent.com>
…l-org#12194) * metal : refactor im2col parameters into a struct * metal: Change im2col offset types from int32_t to uint64_t to support larger memory offsets * metal : refactor sum_rows parameters into a struct * metal : refactor soft_max parameters into a struct * metal : refactor diag_mask_inf parameters into a struct * metal : refactor ssm_conv parameters into a struct * metal : refactor ssm_scan parameters into a struct * metal : refactor get_rows parameters into a struct * metal : refactor group_norm parameters into a struct * metal : refactor conv_transpose_1d parameters into a struct * metal : refactor upscale parameters into a struct * metal : refactor pad parameters into a struct * metal : refactor pad_reflect_1d parameters into a struct * metal : refactor arange parameters into a struct * metal : refactor timestep_embedding parameters into a struct * metal : refactor argsort parameters into a struct * metal : refactor leaky_relu parameters into a struct * metal : refactor pool_2d parameters into a struct * metal : fix trailing whitespace --------- Co-authored-by: alexju <alexju@tencent.com>
This PR implements the improvement suggested in issue #3229 by introducing a
ggml_metal_locals
struct to simplify Metal kernel arguments. This improvement reduces lengthy parameter lists, making the code more readable and maintainable.This draft PR currently only implements the optimization for the
im2col
kernel function. I plan to continue working on optimizing the remaining kernel functions in subsequent commits.References
This PR follows the implementation pattern from PR #10238.