Dont attempt naive reduction when reduce_dim is too high #2414

ArthurBrussee · 2024-10-23T19:55:56Z

Pull Request Template

Changes

Currently, when autotuning a reduction, Burn might attempt a naive reduction even for something like the mean() of an 800x800 image. This is slow enough to be a problem: it takes upwards of 2-3 seconds of GPU time on an M1. It seems to be especially problematic on wasm, where tuning is asynchronous: while other work is running, tuning can take even longer, so the app can perform badly for up to minutes before tuning is complete.

This change stops naive reduction from being tried when reduce_dim is above 8000 elements. The threshold is arbitrary, but I suspect it's well above the point where a naive reduction makes sense. Happy to bump it up even further if it's a concern.
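A minimal sketch of the idea, not the actual code in crates/burn-jit: the enum, function, and constant names below are hypothetical, and only the 8000-element cutoff comes from the description above. The tuner simply never adds the naive kernel to its candidate list once the reduced dimension is too long, so it is never benchmarked.

```rust
/// Hypothetical constant mirroring the 8000-element cutoff described above.
const MAX_NAIVE_REDUCE_DIM: usize = 8000;

/// Illustrative candidate kernels an autotuner could benchmark.
/// These names are assumptions, not Burn's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReduceStrategy {
    Naive,
    Shared,
}

/// Returns the strategies worth benchmarking for a given reduce-dim length.
/// The naive kernel is only included at or below the cutoff.
pub fn candidates(reduce_dim_len: usize) -> Vec<ReduceStrategy> {
    let mut out = vec![ReduceStrategy::Shared];
    if reduce_dim_len <= MAX_NAIVE_REDUCE_DIM {
        out.push(ReduceStrategy::Naive);
    }
    out
}
```

With this sketch, `candidates(64)` still includes the naive kernel, while `candidates(800 * 800)` leaves only the shared one, so the slow benchmark run never happens.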

Testing

My app warms up much faster on WASM with this change!

codecov · 2024-10-23T20:11:06Z

Codecov Report

Attention: Patch coverage is 75.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 85.29%. Comparing base (cb90cc1) to head (5cd0011).
Report is 10 commits behind head on main.

Files with missing lines	Patch %	Lines
crates/burn-jit/src/kernel/reduce/tune/base.rs	75.00%	2 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2414   +/-   ##
=======================================
  Coverage   85.29%   85.29%           
=======================================
  Files         792      792           
  Lines      104479   104487    +8     
=======================================
+ Hits        89117    89124    +7     
- Misses      15362    15363    +1


ArthurBrussee · 2024-10-24T12:02:21Z

I know less about this, but it seems running a shared kernel when the reduce_dim is very small can also be very slow and cause issues. Should similar logic apply there?

laggui · 2024-10-25T12:03:50Z

> I know less about this, but it seems running a shared kernel when the reduce_dim is very small can also be very slow and cause issues. Should similar logic apply there?

I haven't worked on autotune much but it seems like this should also apply.
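As a rough illustration of what applying the same logic to the shared kernel could look like, here is a sketch with a lower bound next to the upper bound. The function names are hypothetical, and the lower-bound value is a placeholder assumption: the thread does not state the number that was used.

```rust
/// Upper bound for the naive kernel, taken from the PR description.
const MAX_NAIVE_REDUCE_DIM: usize = 8000;

/// ASSUMPTION: placeholder lower bound for the shared kernel.
/// The thread does not state the real value.
const MIN_SHARED_REDUCE_DIM: usize = 32;

/// Naive reduction is only worth benchmarking for short reduce dims.
pub fn should_try_naive(reduce_dim_len: usize) -> bool {
    reduce_dim_len <= MAX_NAIVE_REDUCE_DIM
}

/// Shared-memory reduction is only worth benchmarking for longer reduce dims.
/// Because MIN_SHARED_REDUCE_DIM <= MAX_NAIVE_REDUCE_DIM, every length keeps
/// at least one candidate.
pub fn should_try_shared(reduce_dim_len: usize) -> bool {
    reduce_dim_len >= MIN_SHARED_REDUCE_DIM
}
```

Keeping the two ranges overlapping means mid-sized reductions are still tuned between both kernels, and only the extremes skip a benchmark.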

ArthurBrussee · 2024-10-29T19:37:31Z

Is this looking ok? Would be nice to have all the autotuning changes in!

ArthurBrussee changed the title ~~Dont attempt naive when reduce_dim is too high~~ Dont attempt naive reduction when reduce_dim is too high Oct 23, 2024

ArthurBrussee added 2 commits October 27, 2024 10:47

Dont attempt naive when reduce_dim is too high

2d22a80

Also skip shared for small dim

5cd0011

ArthurBrussee force-pushed the dim-reduce-naive branch from 651a84c to 5cd0011 Compare October 27, 2024 10:47

nathanielsimard approved these changes Oct 30, 2024

nathanielsimard merged commit bb9f5b1 into tracel-ai:main Oct 30, 2024
11 checks passed

ArthurBrussee deleted the dim-reduce-naive branch November 1, 2024 15:59
