[tutorials] Add tutorial on JIT compile/execute performance #7838
Conversation
I think the cases to test here are the ones that match real usage patterns:
Another subtlety we could consider is the difference between allocating the output buffer outside the benchmarking loop vs using the form of realize that allocates an output for you, e.g. realize({1024, 1024});
Add timing estimates as comments. Add std::function example. Enable advanced scheduling directives.
Ah, yes ... great suggestions! I'll swap out the contrived examples and add the above use cases you listed.
Added cases that match real usage patterns:
1. Defining and compiling the whole pipeline every time you want to run it (i.e. in the benchmarking loop).
2. Defining the pipeline outside the benchmarking loop, and realizing it repeatedly.
3. (optional) Same as 2, but calling compile_jit() outside the loop, saying what it does, and why the time isn't actually different from case 2 (benchmark() runs multiple times and takes a min, and realize() only compiles on the first run).
4. Compiling to a Callable outside the benchmarking loop and showing that it has lower overhead than case 3 (if indeed it does. If not we may need to change the example so that it does, e.g. by adding a real input buffer).
Updated with new cases as suggested.
More minor nits but otherwise this is great, LGTM
// calling convention.
auto arguments = pipeline.infer_arguments();

// The Callable object acts as a convienient way of invoking the compiled code like
convenient
(side note: it's 2023, why don't we have smart spellcheckers that are useful for typos in code comments?)
* Add tutorial on JIT compile/execute performance
* Addressing comments from review. Fix punctuation and comment nits. Add timing estimates as comments. Add std::function example. Enable advanced scheduling directives.
* Addressing comments from review. Added cases that match real usage patterns: 1. Defining and compiling the whole pipeline every time you want to run it (i.e. in the benchmarking loop). 2. Defining the pipeline outside the benchmarking loop, and realizing it repeatedly. 3. (optional) Same as 2, but calling compile_jit() outside the loop, saying what it does, and why the time isn't actually different from case 2 (benchmark() runs multiple times and takes a min, and realize() only compiles on the first run). 4. Compiling to a Callable outside the benchmarking loop and showing that it has lower overhead than case 3 (if indeed it does. If not we may need to change the example so that it does, e.g. by adding a real input buffer).
* Addressing comments from review for style nits, and typos in comments.

---------

Co-authored-by: Derek Gerstmann <dgerstmann@adobe.com>
Co-authored-by: Steven Johnson <srj@google.com>
Compares performance of realize(), compile_jit(), compile_callable(), compile_module() and shows benefits of JIT cache.