Add Tuning Support (Umbrella Issue) #16952
Comments
Nice / thank you! Various people have been doing this in a pretty ad-hoc way for years, and it is definitely profitable to do. Would be really nice to have it be a good and supported flow!
Support disabling workgroup reordering and shared memory optimization passes based on translation info config entries. Because these are just named unit attributes, they do not require custom attributes defined in tablegen. These are intended for tuning. Issue: iree-org#16952
…rg#17340) Support disabling workgroup reordering and shared memory optimization passes based on translation info config entries. Because these are just named unit attributes, they do not require custom attributes defined in tablegen. These are intended for tuning. Issue: iree-org#16952 Signed-off-by: Lubo Litchev <lubol@google.com>
The scripts that drive the tuning loop landed in the sharktank repo: nod-ai/shark-ai#141 and nod-ai/shark-ai#158.
Adding a few new issues related to the design of the tuner:
This is an umbrella issue for implementing a tuning infrastructure. By tuning we mean a type of Profile Guided Optimization flow where we compile a program/model with extra instrumentation and use the runtime performance numbers to tweak the compilation parameters to achieve better performance. Concretely, this translates to benchmarking dispatches and using the results to apply different `#iree_codegen.compilation_info` attributes to root ops. These attributes include the lowering config (with tile sizes) and the translation info (with the codegen pipeline, workgroup/subgroup sizes, and mma schedule).

The main tuning loop will be driven by a python script, with the bulk of the implementation split across a few existing tools. We plan to implement it as follows:

- `iree-compile` allows for dumping instrumented benchmarks to a directory. This is similar to the existing flag `--iree-hal-dump-executable-benchmarks-to=`, with each benchmark being dumped in a separate file, possibly with some top-level shared manifest file if necessary.
- `iree-run-module` dumps profile data using the collected trace. This includes precise dispatch mapping and information about (dynamic) shapes, workgroup counts, etc.
- `iree-compile` is invoked as a separate process. First, the existing configuration is stripped and replaced with the one from the tuning spec, and then compilation resumes from the level of executable sources. The compilation either succeeds or the verifier rejects the compilation info. It is the responsibility of the compiler to reconcile the compilation info across all ops in the module.

In the v0 for the SD family of models, we do not have to support dynamic shapes. Initially, the dispatches to tune will be selected by the user; later we can extend the tuning script to identify those automatically based on the generated trace.
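As a rough illustration of the loop described above, here is a minimal Python sketch of a driver that tries each candidate, skips the ones that fail to compile, and keeps the fastest. Everything in it is an assumption made for illustration: the `Candidate` type, the `tune` and `run_tool` helpers, and the idea that a spec lives in one file per candidate are hypothetical and do not reflect the actual tuner scripts or any real `iree-compile` interface. The compile and benchmark steps are passed in as callables so the sketch does not guess at tool flags.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    """One hypothetical tuning-spec candidate for a single dispatch."""

    name: str
    spec_path: str  # assumed: a file holding the compilation info to apply


def run_tool(argv: list[str]) -> bool:
    """Run a tool as a separate process; True means it exited with status 0."""
    return subprocess.run(argv, capture_output=True).returncode == 0


def tune(
    candidates: Iterable[Candidate],
    compile_candidate: Callable[[Candidate], bool],
    benchmark_candidate: Callable[[Candidate], float],
) -> Optional[tuple[Candidate, float]]:
    """Return (candidate, latency) for the fastest candidate that compiles.

    Returns None when no candidate compiles.
    """
    best: Optional[tuple[Candidate, float]] = None
    for candidate in candidates:
        # A failed compile stands in for the verifier rejecting the
        # compilation info; such candidates are simply skipped.
        if not compile_candidate(candidate):
            continue
        latency = benchmark_candidate(candidate)
        if best is None or latency < best[1]:
            best = (candidate, latency)
    return best
```

In a real driver, `compile_candidate` would presumably wrap `run_tool` with whatever command line the compiler ends up exposing, and `benchmark_candidate` would parse the timing reported by the benchmark run. Keeping both as injected callables also makes the selection logic easy to test with fakes.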