[Optimization] Add custom NCHW to NHWC kernel for implicit GEMM #2530

Merged: 4 commits, Nov 25, 2024

wingertge
Contributor

Pull Request Template

Checklist

  • Confirmed that the run-checks all script has been executed.
  • Made sure the book is up to date with changes in this PR.

Changes

Adds a custom NCHW-to-NHWC transpose kernel for use in implicit_gemm. Because it is specialized for this one transposition, it is faster than the generic into_contiguous.
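For readers unfamiliar with the layout swap, here is a minimal CPU sketch of the index mapping an NCHW-to-NHWC transpose performs. It is not the GPU kernel added in this PR: the function name `nchw_to_nhwc` and its signature are made up for illustration, and it assumes a flat, contiguous `f32` buffer.

```rust
/// CPU reference for an NCHW -> NHWC layout swap on a flat, contiguous buffer.
/// Hypothetical helper for illustration only; not the kernel from this PR.
fn nchw_to_nhwc(input: &[f32], n: usize, c: usize, h: usize, w: usize) -> Vec<f32> {
    assert_eq!(input.len(), n * c * h * w, "buffer length must match the shape");
    let mut output = vec![0.0; input.len()];
    for b in 0..n {
        for ch in 0..c {
            for y in 0..h {
                for x in 0..w {
                    // Source offset in NCHW order: channels vary slower than pixels.
                    let src = ((b * c + ch) * h + y) * w + x;
                    // Destination offset in NHWC order: channels are innermost.
                    let dst = ((b * h + y) * w + x) * c + ch;
                    output[dst] = input[src];
                }
            }
        }
    }
    output
}
```

With a 1x2x2x2 input holding the values 0 through 7, the two channel planes are interleaved per pixel, so the channel values for each pixel end up adjacent in memory.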

Testing

All tests compatible with implicit_gemm pass with the new kernel. Added a new test to ensure the kernel output is the same as that of into_contiguous.
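The equivalence check described above can be sketched on the CPU as follows. This is an illustration under stated assumptions, not the test added in this PR: `contiguous_copy` is a hypothetical stand-in for a generic strided copy (playing the role of into_contiguous), `nchw_to_nhwc_flat` is a hypothetical specialized version, and `outputs_match` is a made-up helper that compares the two.

```rust
/// Generic stand-in for a "make contiguous" copy: reads a 4D view through
/// arbitrary strides and writes the elements out in row-major order.
/// Hypothetical helper; not Burn's into_contiguous.
fn contiguous_copy(data: &[f32], shape: [usize; 4], strides: [usize; 4]) -> Vec<f32> {
    let mut out = Vec::with_capacity(shape.iter().product());
    for i0 in 0..shape[0] {
        for i1 in 0..shape[1] {
            for i2 in 0..shape[2] {
                for i3 in 0..shape[3] {
                    let offset =
                        i0 * strides[0] + i1 * strides[1] + i2 * strides[2] + i3 * strides[3];
                    out.push(data[offset]);
                }
            }
        }
    }
    out
}

/// Specialized NCHW -> NHWC copy that decomposes each destination index.
/// Hypothetical helper; not the kernel from this PR.
fn nchw_to_nhwc_flat(input: &[f32], n: usize, c: usize, h: usize, w: usize) -> Vec<f32> {
    (0..n * c * h * w)
        .map(|dst| {
            let ch = dst % c;
            let x = (dst / c) % w;
            let y = (dst / (c * w)) % h;
            let b = dst / (c * w * h);
            input[((b * c + ch) * h + y) * w + x]
        })
        .collect()
}

/// Returns true when the specialized path agrees with the generic strided copy.
fn outputs_match(n: usize, c: usize, h: usize, w: usize) -> bool {
    let input: Vec<f32> = (0..n * c * h * w).map(|v| v as f32).collect();
    // An NHWC view over an NCHW buffer: shape [n, h, w, c] with permuted strides.
    let expected = contiguous_copy(&input, [n, h, w, c], [c * h * w, w, 1, h * w]);
    nchw_to_nhwc_flat(&input, n, c, h, w) == expected
}
```

Using distinct values for every element (here, the element's own index) makes any misplaced write show up as a mismatch.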

codecov bot commented Nov 24, 2024

Codecov Report

Attention: Patch coverage is 44.50262% with 106 lines in your changes missing coverage. Please review.

Project coverage is 82.53%. Comparing base (9c31f75) to head (8b3b0df).
Report is 6 commits behind head on main.

Files with missing lines | Patch % | Lines
...tes/burn-jit/src/kernel/conv/conv2d/layout_swap.rs | 37.10% | 100 Missing ⚠️
...s/burn-jit/src/kernel/conv/conv2d/implicit_gemm.rs | 0.00% | 3 Missing ⚠️
crates/burn-jit/src/template/base.rs | 0.00% | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2530      +/-   ##
==========================================
- Coverage   82.59%   82.53%   -0.07%     
==========================================
  Files         827      828       +1     
  Lines      106712   106897     +185     
==========================================
+ Hits        88143    88231      +88     
- Misses      18569    18666      +97     

@nathanielsimard nathanielsimard merged commit 0b614b7 into tracel-ai:main Nov 25, 2024
11 checks passed
let source_template = self.kernel_source.source();
let source = source_template.complete();

CompiledKernel {
    name: Some(core::any::type_name::<K>()),
    entrypoint_name: "kernel".to_string(),
Contributor

@AsherJingkongChen Nov 25, 2024

Why did you change the entry point name?

Contributor Author

It's an update to cubecl: name is now debug_name, and there's a new entrypoint_name. However, it should be "main" and not "kernel"; I'm fixing that in the PR I'm currently opening.
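To make the distinction in this reply concrete, here is a rough sketch of a descriptor that keeps a debug label separate from the entry point. The `KernelInfo` struct, the `LayoutSwap` placeholder type, and the `describe` function are all hypothetical and are not cubecl's actual types; only the field names `debug_name` and `entrypoint_name` and the value "main" come from the comment above.

```rust
/// Hypothetical stand-in for illustration; not cubecl's `CompiledKernel`.
struct KernelInfo {
    /// Human-readable label, useful for logs and profiling only.
    debug_name: Option<&'static str>,
    /// Name of the function the backend launches inside the compiled source.
    entrypoint_name: String,
}

/// Placeholder kernel type, assumed for this sketch.
struct LayoutSwap;

/// Builds a descriptor whose debug label is derived from the kernel type,
/// while the entry point stays a fixed symbol.
fn describe<K>() -> KernelInfo {
    KernelInfo {
        debug_name: Some(core::any::type_name::<K>()),
        entrypoint_name: "main".to_string(),
    }
}
```

Keeping the two apart means the debug label can change with the kernel type without affecting which function the backend looks up at launch.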

@wingertge wingertge deleted the opt/conv-custom-transpose branch November 26, 2024 18:56