[Operator][Backend] Add nvcc flags for faster math and update Attention schedule #221

hjjq · 2023-05-09T19:07:14Z

Make `-ftz=true` and `-prec-div=false` the default for all nvcc-compiled kernels
Update the Attention schedule template
Make the repeat mapping use explicit unrolling by default when its extent is less than 4
Fix the erf test and increase the tolerance for the pool test
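
For readers unfamiliar with the two flags: in nvcc, `-ftz=true` flushes denormal single-precision values to zero, and `-prec-div=false` selects a faster, less precise single-precision division. Both are a subset of what `--use_fast_math` implies. The sketch below shows one way such flags could be appended to an nvcc command line. The helper name `build_nvcc_command`, the `-arch`, `-O3`, and `-c` arguments, and the file names are assumptions made for illustration; this is not Hidet's actual compilation code.

```python
import shlex

# The two math flags named in this PR description.
FAST_MATH_FLAGS = ["-ftz=true", "-prec-div=false"]


def build_nvcc_command(src_path, out_path, arch="sm_80", fast_math=True):
    """Return an nvcc argument list for compiling one CUDA source file.

    Hypothetical helper for illustration only; not part of Hidet.
    """
    cmd = ["nvcc", "-arch=" + arch, "-O3"]
    if fast_math:
        # Trade a little single-precision accuracy for speed.
        cmd += FAST_MATH_FLAGS
    cmd += ["-c", src_path, "-o", out_path]
    return cmd


if __name__ == "__main__":
    # Print the command as a single shell-quoted string.
    print(shlex.join(build_nvcc_command("kernel.cu", "kernel.o")))
```

Because both flags relax single-precision accuracy, numerical tests that compare against a reference implementation may need looser tolerances, which is consistent with the test changes listed above.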

Adding support for the operator `torch.as_tensor`, which was encountered in #221. Also added more tests for `torch.argmax` and `torch.argmin`, as discussed in #234.

…Tensor.new_full` (#238) Adding two operators, `torch.Tensor.max` and `torch.Tensor.new_full`, while attempting to compile models from TorchBench (mentioned in the comments under #221).

hjjq added 19 commits May 9, 2023 14:12

wip

43c470f

.

9e1c729

shmem and layout

f5559fd

wrong results

9c4da9d

.

0fcf509

2matmul correct

5bce6f6

sm correct

531ca05

double buffering for k

30aab17

double buffering for v

2366c81

expand tuning space

90105c2

Make explicit unroll default in repeat

8c2156d

add block k tune space

dd793a2

Make use_fast_math default compile option

dc33cf8

fix smem_v layout

44fd077

format and lint

a23dd4d

Update attn_mask

c94829b

remove use_fast_math from passcontext

694133f

Increase error tolerance for pool ops

ba776f6

Adjust math compiler options; Fix erf test.

0cd936d

hjjq changed the title ~~[Operator] Add fast_math and update Attention schedule~~ [Operator] Add nvcc flags for faster math and update Attention schedule May 11, 2023

format

2e54a15

hjjq changed the title ~~[Operator] Add nvcc flags for faster math and update Attention schedule~~ [Operator][Backend] Add nvcc flags for faster math and update Attention schedule May 11, 2023

hjjq merged commit 971bd01 into hidet-org:main May 11, 2023

vadiklyutiy pushed a commit that referenced this pull request Jul 22, 2024

[Operators] Registering torch.as_tensor (#235)

4add4b9

vadiklyutiy pushed a commit that referenced this pull request Jul 23, 2024

[Operators] Registering torch.as_tensor (#235)

540367b

vadiklyutiy pushed a commit that referenced this pull request Dec 26, 2024

[Operators] Registering torch.as_tensor (#235)

ac87ccb
