
[GPU] Conv and Gemm fix #2290

Open · wants to merge 2 commits into main
Conversation

@dyoussif (Contributor) commented on Dec 18, 2024

  1. Fix the data type for the conv zero-point mask: use s16 instead of s32. This avoids an invalid data type combination for the mad instruction (mad dst:d src0:d src1:d src2:d, i.e. dword in every operand position); see the sketch after this list.

Failing case

--mode-modifier=P --conv --engine=gpu --skip-impl=ref --allow-enum-tags-only=false --check-ref-impl=true -- dt=u8:s8:u8 --attr-scales=wei:per_oc --attr-zero-points=src:per_dim_1 --attr-post-ops=hardswish:0.271:0.314+linear:0.271:0.314 g240mb32ic240ih28oc240oh14kh3sh2ph0n"f191c263e53dbb3ce0c02a13f311a72a*1"

  2. Fix a gemm hang for the following case on Xe2:

--matmul --engine=gpu --dt=f16:s4:f16 --wtag=acb --attr-scales=wei:per_ocic:f16:128x1 --attr-zero-points=wei:per_ocic:u4:128x1 --attr-fpmath=f16:true --skip-impl=ref 3x96x512:3x512x64

It seems the lookahead should match reqLoad (see the changed lines quoted in the review below).
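For item 1, a minimal sketch of the kind of change described, assuming the nGEN data-type enums used by oneDNN's GPU jit generators; the variable name and header path are hypothetical, not taken from the actual patch:

```cpp
#include "ngen.hpp"  // nGEN header used by oneDNN's GPU generators (path assumed)

// Hypothetical sketch, not the actual patch: the conv zero-point mask was
// held as dword (s32), so the generated multiply-add ended up with dword
// operands in every position (mad dst:d src0:d src1:d src2:d), which is not
// a valid operand combination for mad. Holding the mask as word (s16)
// narrows the multiply sources to a supported combination.
ngen::DataType zp_mask_dt = ngen::DataType::w;  // s16; previously DataType::d (s32)
```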

@dyoussif requested a review from a team as a code owner on Dec 18, 2024, 23:46
@github-actions bot added the platform:gpu-intel (Codeowner: @oneapi-src/onednn-gpu-intel) label on Dec 18, 2024
@dyoussif (Contributor, Author) commented:
make test
disable device_cpu
enable device_gpu
disable benchdnn_all
enable benchdnn_matmul

Comment on lines +963 to +964
auto reqLoadAq = every(kaq_load) | lookahead(kaq_load);
auto reqLoadBq = every(kbq_load) | lookahead(kbq_load);
Contributor:
@dyoussif, similar to the lines below, the lookahead should depend on whether we are repacking this data. If so -- yes, this patch is right. Otherwise, the original code is what's needed; with this patch we would load Aq/Bq too early if the group size exceeds the load chunk size for A/B.
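A minimal sketch of the conditional the reviewer describes, assuming repack flags analogous to the logic on the nearby lines; repackAq/repackBq, ka_load, and kb_load are assumed names, not taken from the source:

```cpp
// Sketch of the reviewer's suggestion (assumed names, not the actual code):
// extend the lookahead to the quantization-parameter load distance only when
// that data is repacked; otherwise keep the shorter lookahead so that Aq/Bq
// are not loaded too early when the quantization group size exceeds the
// A/B load chunk size.
auto reqLoadAq = every(kaq_load) | lookahead(repackAq ? kaq_load : ka_load);
auto reqLoadBq = every(kbq_load) | lookahead(repackBq ? kbq_load : kb_load);
```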

Labels: platform:gpu-intel (Codeowner: @oneapi-src/onednn-gpu-intel)

3 participants