cutlass integration + segment_matmul implementation #51

Merged: 27 commits, Jun 24, 2022

Member

@rusty1s rusty1s commented May 30, 2022

No description provided.

@codecov-commenter

codecov-commenter commented May 30, 2022

Codecov Report

Merging #51 (6bd7ed1) into master (45aafe6) will decrease coverage by 3.73%.
The diff coverage is 28.57%.

@@            Coverage Diff             @@
##           master      #51      +/-   ##
==========================================
- Coverage   94.42%   90.68%   -3.74%     
==========================================
  Files          12       13       +1     
  Lines         233      247      +14     
==========================================
+ Hits          220      224       +4     
- Misses         13       23      +10     
Impacted Files | Coverage Δ
pyg_lib/csrc/ops/matmul.cpp | 28.57% <28.57%> (ø)

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@rusty1s rusty1s changed the title from "[WIP] cutlass integration + segment_matmul implementation" to "cutlass integration + segment_matmul implementation" Jun 15, 2022
@rusty1s rusty1s requested review from puririshi98 and a team June 15, 2022 13:45
@rusty1s
Member Author

rusty1s commented Jun 15, 2022

@pyg-team/nvidia-team This PR is now ready to review.

@rusty1s rusty1s requested review from yaoyaowd and ZenoTan June 15, 2022 13:50

@teju85 teju85 left a comment

This is a really great example showing cutlass integration. Nice job, @rusty1s! Do you have an example where pyg_lib.segment.grouped_matmul is actually called in a training script?
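
For readers new to the operation in the PR title, here is a minimal pure-Python sketch of what a segment-wise matmul computes. It assumes the usual semantics (rows `ptr[i]:ptr[i + 1]` of the input are multiplied by the i-th weight matrix); the function names `matmul` and `segment_matmul_reference` are hypothetical and are not part of pyg_lib, and the sketch says nothing about how the CUTLASS-backed kernel is implemented.

```python
# Illustrative reference only; names are hypothetical, not the pyg_lib API.
# Assumption: rows ptr[i]:ptr[i + 1] of `inputs` are multiplied by others[i].


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def segment_matmul_reference(inputs, ptr, others):
    """Apply others[i] to inputs[ptr[i]:ptr[i + 1]] and stack the results."""
    assert len(ptr) == len(others) + 1, "need one weight matrix per segment"
    out = []
    for i, weight in enumerate(others):
        segment = inputs[ptr[i]:ptr[i + 1]]
        out.extend(matmul(segment, weight))  # empty segments add no rows
    return out


inputs = [[1, 2], [3, 4], [5, 6]]
ptr = [0, 2, 3]              # segment 0 = rows 0-1, segment 1 = row 2
others = [
    [[1, 0], [0, 1]],        # identity for segment 0
    [[0, 1], [1, 0]],        # column swap for segment 1
]
result = segment_matmul_reference(inputs, ptr, others)
```

A grouped GEMM kernel would compute all of these per-segment products in a single launch instead of looping over segments on the host.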

Contributor

@puririshi98 puririshi98 left a comment

LGTM!

pyg_lib/csrc/segment/cuda/matmul_kernel.cu (outdated, resolved)
@@ -0,0 +1,41 @@
#include "matmul.h"

#include <ATen/core/dispatch/Dispatcher.h>
Contributor

ditto.

@rusty1s
Member Author

rusty1s commented Jun 15, 2022

Thanks @teju85. I will work on the backward implementation and PyG integration next, and can share an example then!

@hwu36

hwu36 commented Jun 16, 2022

Haicheng from nvidia cutlass. LGTM. Thank you. BTW, we are improving group gemm now.

@rusty1s
Member Author

rusty1s commented Jun 19, 2022

@hwu36 Thanks! Please ping me if you make any improvements :)

@hwu36

hwu36 commented Jun 21, 2022

> @hwu36 Thanks! Please ping me if you make any improvements :)

@jackkosaian just fixed the occupancy calculation in NVIDIA/cutlass#532. This number is used to calculate the number of threadblocks to launch for group gemm. I know you hard-coded this number for now, so you are not affected.

@jackkosaian is going to further improve group gemm in the summer.
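
To make the link between occupancy and launch size concrete, here is a rough arithmetic sketch. It assumes a persistent-threadblock scheme in which the launch size is the number of SMs times the per-SM occupancy, capped by the total number of output tiles; the cap, the tile sizes, and the names `total_tiles` and `threadblocks_to_launch` are assumptions made for illustration and are not taken from CUTLASS or from this PR.

```python
# Back-of-the-envelope sketch; all names and defaults are hypothetical.
# Assumption: launch sm_count * occupancy threadblocks, but never more
# threadblocks than there are output tiles across all grouped problems.


def total_tiles(problem_sizes, tile_m, tile_n):
    """Count output tiles over all (M, N, K) problems using ceiling division."""
    return sum(
        ((m + tile_m - 1) // tile_m) * ((n + tile_n - 1) // tile_n)
        for m, n, _k in problem_sizes
    )


def threadblocks_to_launch(problem_sizes, sm_count, occupancy,
                           tile_m=128, tile_n=128):
    """Estimate the launch size from the SM count and per-SM occupancy."""
    resident = sm_count * occupancy  # threadblocks the GPU can hold at once
    return max(1, min(resident, total_tiles(problem_sizes, tile_m, tile_n)))


small = [(256, 256, 64), (100, 300, 64)]   # 4 + 3 = 7 output tiles
large = [(4096, 4096, 64)]                 # 32 * 32 = 1024 output tiles

small_blocks = threadblocks_to_launch(small, sm_count=108, occupancy=2)
large_blocks = threadblocks_to_launch(large, sm_count=108, occupancy=2)
```

Under these assumptions a wrong occupancy value changes `resident` directly, which is why a hard-coded threadblock count sidesteps the issue described above.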

7 participants