-
Notifications
You must be signed in to change notification settings - Fork 241
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Copy of "Dynamic Composable Kernel Initial Integration #1071" #1097
Conversation
git-subtree-dir: src/composable_kernel git-subtree-split: f6edda6119ebbb237dfa6270797b34f960d7b190
…le_kernel_init_integration_v3
…init_integration_v3
Co-authored-by: JD <Jehandad.Khan@amd.com>
… for invalid element
…init_integration_v3
…init_integration_v3
646fcc268 Merge pull request #47 from ROCmSoftwarePlatform/develop 6014185ac [Bug Fix] GridwiseGemm_bk0mk1_bk0nk1_mn_xdlops_v2r4 loop issue (#44) 3e9113707 Merge pull request #46 from ROCmSoftwarePlatform/miopen_downstream_all 211dae822 Merge branch 'develop' into miopen_downstream_all 5890e3007 [Composable Kernel] update develop branch code to ck_upstream d5297abae fix bug in gridwise gemm xdlops v2r3 (#45) 38a90b6ed Merge pull request #43 from ROCmSoftwarePlatform/develop c3018794b bug fix (#39) fd49ff808 add nchw atomic , nhwc and nhwc atomic method for backward weight (#30) b2dc55f82 [MIOpen Downstream] Fix Reduction Kernel (#34) b3e8d57d5 Tweak GEMM kernel (#38) 846f462bd Add VectorType support into StaticBuffer (#27) dfb80c4e3 [Enhancements] Several bugfixes and refactoring of dynamic generic reduction (#1156) 8557901d0 Merge pull request #1165 from ROCmSoftwarePlatform/develop f305bebdc Merge pull request #31 from ROCmSoftwarePlatform/miopen_downstream-dynamic_reduction_pr b725e3fc8 Merge remote-tracking branch 'origin/develop' into miopen_downstream-dynamic_reduction_pr 88833bd9a Merge pull request #32 from ROCmSoftwarePlatform/develop df0d68106 :Merge remote-tracking branch 'origin/develop' into CK_upstream f3acd2510 Add a version of Merge transform that use integerdivision and mod (#25) 19613902b GEMM driver and kernel (#29) 627d8ef35 Backward weight v4r4r2 with xdlops (#18) 10bb81106 Misc fixes (#24) 9e80cdceb [SWDEV-281541][MSRCHA-100] Implementation of Dynamic Generic Reduction (#1108) a7a758d8c GlobalAtomicAdd for fp32/int32 (#23) 9d3f634a3 Xdlops refactor fix (#22) c6f26bb48 magic division use __umulhi() (#19) 6fe3627a9 Composable kernel init integration v3 (#1097) a2ad6d353 refactor dynamic xdlops iGemm (#13) ba6f79a75 Added host_conv_wrw for verification (#15) git-subtree-dir: src/composable_kernel git-subtree-split: 646fcc268ede841a16cdaafb68aa64803d8390e1
{ | ||
KernelTimer timer; | ||
|
||
printf("%s: grid_dim {%d, %d, %d}, block_dim {%d, %d, %d} \n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⚓ This should be controlled by MIOPEN_LOG_LEVEL etc.
static constexpr auto I2 = Number<2>{}; | ||
static constexpr auto I3 = Number<3>{}; | ||
|
||
static constexpr index_t WaveSize = 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
64 is invalid for gfx10+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@asroy @zjing14 @carlushuang could you take a look at this?
|
||
static constexpr auto xdlops_gemm = XdlopsGemm<float, MPerWave, NPerWave, K1>{}; | ||
|
||
static constexpr index_t WaveSize = 64; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same
Create a new PR for the PR that has been reverted.
#1071