
Dont attempt naive reduction when reduce_dim is too high #2414

Merged · 2 commits merged into tracel-ai:main from dim-reduce-naive on Oct 30, 2024

Conversation

ArthurBrussee
Contributor


Changes

Currently, when autotuning a reduction, Burn might attempt a naive reduction even for something like the mean() of an 800x800 image. This is slow enough to be problematic, taking ~2-3 seconds of GPU time on an M1. It's especially bad on wasm, where tuning is asynchronous: with other work running, tuning takes even longer, so the app can perform badly for up to minutes before tuning completes.

This PR changes autotuning so the naive reduction isn't tried above 8,000 elements. The threshold is arbitrary, but I suspect it's well above the point where a naive reduction makes sense. Happy to bump it up further if that's a concern.
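For context, here's a minimal sketch of the gating idea, not Burn's actual autotune API: the names `should_try_naive` and `MAX_NAIVE_REDUCE_DIM` are illustrative, and only the 8,000 cutoff comes from this PR.

```rust
/// Hypothetical cutoff mirroring this PR: above this many elements in
/// the reduced dimension, the naive kernel is never worth benchmarking.
const MAX_NAIVE_REDUCE_DIM: usize = 8_000;

/// Returns true if the naive reduction should be included as an
/// autotune candidate for a reduction over `reduce_dim_len` elements.
fn should_try_naive(reduce_dim_len: usize) -> bool {
    reduce_dim_len <= MAX_NAIVE_REDUCE_DIM
}

fn main() {
    // mean() of an 800x800 image reduces 640,000 elements: skip naive.
    assert!(!should_try_naive(800 * 800));
    // Small reductions still benchmark the naive kernel as a candidate.
    assert!(should_try_naive(1_024));
}
```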

Testing

My app warms up much faster on WASM with this change!

@ArthurBrussee changed the title from "Dont attempt naive when reduce_dim is too high" to "Dont attempt naive reduction when reduce_dim is too high" on Oct 23, 2024

codecov bot commented Oct 23, 2024

Codecov Report

Attention: Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.29%. Comparing base (cb90cc1) to head (5cd0011).
Report is 10 commits behind head on main.

Files with missing lines                          Patch %   Lines
crates/burn-jit/src/kernel/reduce/tune/base.rs    75.00%    2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2414   +/-   ##
=======================================
  Coverage   85.29%   85.29%           
=======================================
  Files         792      792           
  Lines      104479   104487    +8     
=======================================
+ Hits        89117    89124    +7     
- Misses      15362    15363    +1     


@ArthurBrussee
Contributor Author

I know less about this, but it seems running a shared kernel when reduce_dim is very small can also be very slow and cause issues. Should similar logic apply there?

@laggui
Member

laggui commented Oct 25, 2024

> I know less about this, but it seems running a shared kernel when reduce_dim is very small can also be very slow and cause issues. Should similar logic apply there?

I haven't worked on autotune much, but it seems like this should also apply.
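A sketch of the symmetric guard being discussed here, to make the idea concrete. Both the name `should_try_shared` and the `MIN_SHARED_REDUCE_DIM` value are hypothetical, not anything from Burn's codebase:

```rust
/// Hypothetical lower bound below which a shared-memory reduction kernel
/// is unlikely to beat the naive one, so autotune could skip it.
const MIN_SHARED_REDUCE_DIM: usize = 256; // illustrative value only

/// Returns true if the shared-memory kernel should be included as an
/// autotune candidate for a reduction over `reduce_dim_len` elements.
fn should_try_shared(reduce_dim_len: usize) -> bool {
    reduce_dim_len >= MIN_SHARED_REDUCE_DIM
}

fn main() {
    // Tiny reductions skip the shared-memory candidate entirely.
    assert!(!should_try_shared(4));
    // Larger reductions still benchmark it.
    assert!(should_try_shared(4_096));
}
```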

@ArthurBrussee
Contributor Author

Is this looking OK? It would be nice to get all the autotuning changes in!

@nathanielsimard merged commit bb9f5b1 into tracel-ai:main on Oct 30, 2024
11 checks passed
@ArthurBrussee deleted the dim-reduce-naive branch on November 1, 2024