
[Operator] Adding CPU support for matrix multiplication #251

Merged 114 commits into hidet-org:main on May 28, 2023

Conversation

BolinSNLHM (Contributor)

No description provided.

@yaoyaoding (Member) left a comment:

Thanks @BolinSNLHM! Nice to see hidet gaining better support for the CPU backend.

For a first PR, this looks great! I left some comments. Please also add a test case in hidet/tests/operators/test_matmul.py to exercise your new operator.
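
A minimal sketch of such a test, assuming the new operator is reachable through hidet.ops.matmul and that hidet.randn accepts a device argument (the parametrization, shapes, and tolerances below are illustrative, not taken from this PR):

    import numpy as np
    import pytest
    import hidet

    @pytest.mark.parametrize('m, n, k', [(64, 64, 64), (333, 444, 555)])
    def test_matmul_x86(m: int, n: int, k: int):
        # Random inputs on the CPU device that the new kernel targets.
        a = hidet.randn([m, k], dtype='float32', device='cpu')
        b = hidet.randn([k, n], dtype='float32', device='cpu')
        c = hidet.ops.matmul(a, b)
        # Compare against numpy as the reference implementation.
        np.testing.assert_allclose(c.numpy(), a.numpy() @ b.numpy(), rtol=1e-4, atol=1e-4)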

Resolved review threads on the following files:
.gitignore (outdated)
python/hidet/backend/build.py
python/hidet/backend/build.py (outdated)
python/hidet/backend/codegen.py (outdated)
python/hidet/ffi/runtime_api.py (outdated)
python/hidet/graph/ops/__init__.py (outdated)
python/hidet/lang/__init__.py (outdated)
Comment on lines 308 to 314
def matmul_kernel_x86(a_ptr: ~float32, b_ptr: ~float32, c_ptr: ~float32):
    # Reinterpret the raw pointers as tensors with the expected shapes.
    a = as_tensor_pointer(a_ptr, dtype=float32, shape=[m_size, k_size])
    b = as_tensor_pointer(b_ptr, dtype=float32, shape=[k_size, n_size])
    c = as_tensor_pointer(c_ptr, dtype=float32, shape=[m_size, n_size])
    # Number of blocks along each dimension (ceiling division by the block sizes).
    mbs = (m_size + block_m - 1) // block_m
    nbs = (n_size + block_n - 1) // block_n
    kbs = (k_size + block_k - 1) // block_k
A Member commented on these lines:

If we declare the kernel function this way (taking raw pointers as inputs instead of declaring the input tensors directly), we will not be able to support prologue/epilogue fusion. In that case, we should override allow_prologue(...) and allow_epilogue(...) to return False. A better approach is to disable prologue fusion, enable epilogue fusion, and declare c as a tensor directly in the parameter list.
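
As a rough illustration of that suggestion, the task class could expose the two hooks as sketched below; the class name, import path, and surrounding code are assumptions made for the sketch, not code from this diff:

    from hidet.ir.task import Task  # import path is an assumption

    class MatmulF32TaskX86(Task):  # hypothetical name for the CPU matmul task
        def allow_prologue(self) -> bool:
            # Inputs are read through raw pointers, so fusing a prologue into
            # this kernel would not be safe.
            return False

        def allow_epilogue(self) -> bool:
            # Epilogue fusion can stay enabled once c is declared as a tensor
            # parameter instead of a raw pointer.
            return True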

BolinSNLHM and others added 5 commits May 27, 2023 13:06
@BolinSNLHM requested a review from @yaoyaoding on May 27, 2023 23:15
@yaoyaoding (Member):

Thanks @BolinSNLHM!

@yaoyaoding merged commit 3c6579e into hidet-org:main on May 28, 2023
vadiklyutiy pushed a commit that referenced this pull request Jul 22, 2024
Added advanced tensor indexing. I had to create a new task called
"AdvancedIndexingTask". Otherwise it will not work with dynamic shapes.

---------

Co-authored-by: Zhumakhan <nazirzhumakhan@gmail,.com>