Sync CentML/hidet -> hidet-org/hidet #476

Merged 132 commits on Dec 20, 2024
99de55a
Sync
vadiklyutiy Jul 24, 2024
39287f4
lint
vadiklyutiy Jul 27, 2024
f2fc272
[Operators] Extend the functionality of `einsum` to support `Ellipsis…
BolinSNLHM Jul 28, 2024
1431033
[BUG] Fix bug in `normalize_launch_dims()` (#381)
vadiklyutiy Jul 30, 2024
5e5e5a0
Fix float return when limited by memory (#389)
maxyanghu Jul 30, 2024
045caed
[CI] Adding successfully compiled vision models to the tests/benchmar…
BolinSNLHM Jul 30, 2024
efe3569
[Operators] Adding support for the `torch.nn.EmbeddingBag` (#378)
BolinSNLHM Jul 30, 2024
9d50fa2
[BUILD] Several changes in wheel building (#392)
vadiklyutiy Jul 31, 2024
7746a3a
[DEBUG] Save `Task` pickle in translations cache (#380)
vadiklyutiy Aug 1, 2024
f7563e6
make llama2 work with all ttransformers versions (#385)
zhumakhan Aug 1, 2024
15ec205
[Bug] Cast dtypes in hidet.where when mismatch (#386)
zhumakhan Aug 2, 2024
6500c3c
workaround for gpt-j (#395)
zhumakhan Aug 2, 2024
d6dc4c0
[Bug] Fix hidet.ops.gather, add torch.sign torch.ceil. Disable torch.…
zhumakhan Aug 8, 2024
39186c3
Adding accruacy check for huggingface LLMs in Regression (#368)
zhumakhan Aug 8, 2024
0d2e41a
[Tests] Adding tests for math primitives (#412)
BolinSNLHM Aug 12, 2024
838b208
[CuTe] fix longformer (#411)
xiaocenxiaocen Aug 12, 2024
c4bf950
[Bug][Enhancement] Correct the behavior of non-parallel build when op…
maxyanghu Aug 12, 2024
9ac0516
[Bug] Fixing longformer compilation (#403)
zhumakhan Aug 12, 2024
b1f6177
[Fix] Support writing subbyte data to global memory (#415)
yaoyaoding Aug 13, 2024
850ac79
[Update] Updating torch docker image from 24.04 to 24.07 (#418)
zhumakhan Aug 14, 2024
93a7887
[Ir][Primitives] add exp2 (#410)
xiaocenxiaocen Aug 14, 2024
d549803
[Primitives] Add CUDA primitives: prmt, lop3, f16x2 sub and fma, and …
yaoyaoding Aug 15, 2024
b6e06b7
[Fix] fixed torch.pow (#420)
zhumakhan Aug 15, 2024
0b8dbce
[Operators] Adding support for the method `torch.Tensor.scatter_add_`…
BolinSNLHM Aug 16, 2024
bcce954
[Utility] Add ncu and nsys test utilities (#413)
yaoyaoding Aug 19, 2024
dd7dab5
[Bug] fixing regression (#422)
zhumakhan Aug 19, 2024
f4c2ff9
[PERF] Allow prologue fusion for `reduce` op (#426)
vadiklyutiy Aug 21, 2024
8faf803
[Version] Update 0.4.0 -> 0.5.0.dev in `setup.py` (#433)
vadiklyutiy Aug 27, 2024
49780eb
[PERF] Specialize pow(x,2) as x*x. llama-7B (#434)
vadiklyutiy Aug 27, 2024
081c84d
[BUG] Fixing a bug triggered while compiling in-place operator `torch…
BolinSNLHM Aug 27, 2024
c6f0e66
[BUG] Fixing an error triggered from the `conv_channel_last_pass` whi…
BolinSNLHM Aug 30, 2024
e807c3d
[Graph][Ops] disable cublas matmul for parallel k (#431)
xiaocenxiaocen Sep 4, 2024
3420571
[Scripts] Add scripts of our wheel server (#439)
yaoyaoding Sep 4, 2024
123cc00
[Bug] Fixing the `ValueError` triggered while compiling the model `dl…
BolinSNLHM Sep 5, 2024
9ea795d
[PERF] Introduce the new IR optimization Pass that spatial(1,47) -> s…
vadiklyutiy Sep 9, 2024
865500e
[BUG] Fixing errors encountered while compiling `detectron2_fcos_r_50…
BolinSNLHM Sep 10, 2024
c590ffb
[Ir][Primitives] fix #436 via adding missing instructions (#440)
xiaocenxiaocen Sep 10, 2024
32d7bf9
[BUG] Fixing another error encountered while compiling `detectron2_fc…
BolinSNLHM Sep 11, 2024
d8ecff2
[Operators] Adding support for `torch.nn.GLU` module (#461)
BolinSNLHM Sep 12, 2024
a7758db
[BUG] Fix `NotImpelementedError` encountered while compiling the mode…
BolinSNLHM Sep 14, 2024
63eca04
[CI] Print stderr in `run_tests.py` (#443)
vadiklyutiy Sep 16, 2024
4d3c48f
[Dynamic][Enhancement] Convert div and mod including symbolvars to fa…
maxyanghu Sep 18, 2024
e251184
Revert "[Dynamic][Enhancement] Convert div and mod including symbolva…
maxyanghu Sep 18, 2024
9308c9f
Added more llms to Regression test (#432)
zhumakhan Sep 19, 2024
cb07596
[PERF] Indexes optimization (#458)
vadiklyutiy Sep 20, 2024
3c8c922
[BUG] Fixing memory issue encountered while compiling the model `sam`…
BolinSNLHM Sep 20, 2024
55e29e9
[BUG] Fix `ZeroDivisionError` triggered wihtin the function `parallel…
BolinSNLHM Sep 22, 2024
f9fc705
[BUG] Fix `ValueError` caused by different operand data types in `if_…
BolinSNLHM Sep 23, 2024
c76d7e2
[TOOLS] Attached hash values to function signature in source.cu (#459)
ZichuWu Sep 23, 2024
b46f65b
[BUG] Support concat empty tensors (#475)
ZichuWu Sep 23, 2024
2bf6e68
remove mpt-7b due to accuracy failure (#477)
zhumakhan Sep 24, 2024
0870797
Fix masked attention by using fp32 accumulate on first matmul (q and …
zhumakhan Sep 25, 2024
dd52728
Fixed the format change on the new transformers version (#482)
ZichuWu Sep 26, 2024
f2b1fb8
[Bug] Resolved multi-threading conflict with save_lower_ir() (#480)
ZichuWu Sep 26, 2024
eca3695
[PERF] Сontinue indexes optimisations (#473)
vadiklyutiy Sep 27, 2024
e83a072
bug fix
vadiklyutiy Sep 27, 2024
1dc3974
Revert accidental commit (#484)
vadiklyutiy Sep 27, 2024
66492e6
Empty commit to test out the branch protection rule.
wangshangsam Sep 27, 2024
e69ec9d
[Dynamic][Enhancement] Convert div and mod including symbolvars to fa…
maxyanghu Sep 27, 2024
2ce215c
[TOOLS] Task benchmark utilities (#479)
vadiklyutiy Sep 29, 2024
4f3dccc
[Bug] Rule based simplifier. Fix incorrect rule e/c1/c2 -> e/(c1*c2) …
vadiklyutiy Sep 30, 2024
36373ff
[IR] Bound check for task mapping worker (#483)
vadiklyutiy Oct 2, 2024
8d7eb10
[Enhancement] Causal attention with fp32 accumulator (#481)
zhumakhan Oct 4, 2024
48d409b
[CI] Update the set of Regression tests (#493)
vadiklyutiy Oct 7, 2024
5bada76
[PERF] Fix for indexes optimization (#488)
vadiklyutiy Oct 8, 2024
8fdf896
[Hidet Script] Import externally defined function automatically (#503)
yaoyaoding Oct 16, 2024
5ccd0cc
[CI] Make test and publish workflows use built wheel on tests (#492)
c-fteixeira Oct 18, 2024
798f551
[Operators] Support matmul with NT layout (#496)
BolinSNLHM Oct 18, 2024
9e8eeca
[Operators] Support `bfloat16` data type in `matmul` operator (#511)
BolinSNLHM Oct 22, 2024
0657eab
[BUG] Fix NT matmul corner case where `n` or `k` dimension is odd (#513)
BolinSNLHM Oct 22, 2024
6764c08
wgmma instruction support and test for f16 input … (#499)
kjiang170 Oct 22, 2024
abe1e53
[PERF] Rewrite softmax (#516)
vadiklyutiy Oct 23, 2024
10e7c3b
[Bug] Fix the incorrect result after merging changes related to `matm…
BolinSNLHM Oct 23, 2024
80b3bd3
refactor wgmma (#521)
kjiang170 Oct 23, 2024
be98acf
[OP] Support of `logsoftmax` (#517)
vadiklyutiy Oct 24, 2024
fde166c
[BUG] Fix distilbert by changing variables names in ops.where (#512)
zhumakhan Oct 24, 2024
daad151
[Enhancement] Save running time by using symbolic_run to replace asyn…
ZichuWu Oct 24, 2024
a455026
[Operators] `bfloat16` data type support for attention operators (#524)
BolinSNLHM Oct 25, 2024
269a307
f16 rest options supported and tested (#527)
kjiang170 Oct 30, 2024
3965d85
[COMPTIME] Added support for run_torch for the rest of transform oper…
ZichuWu Oct 31, 2024
9450dc5
Add more shapes to reduce op in regression (#534)
zhumakhan Nov 3, 2024
e824695
[Bug] ‘uint32_t’ was not declared in this scope in CI build-wheel for…
ZichuWu Nov 5, 2024
75ad7de
wgmma bf16 support (#531)
kjiang170 Nov 6, 2024
dfb43fe
[OPTIONS] Set mma as default in PassContext() (#530)
ZichuWu Nov 12, 2024
e7fd1b9
[Bug] Fix out of memory error occurred while running `llama-2-7b` (#547)
BolinSNLHM Nov 14, 2024
c17766a
[OPTIONS] Clean Huggingface tokens option (#561)
ZichuWu Nov 15, 2024
8eb2501
[CI] Tests Workflow. Add manual trigger of tests on different gpu typ…
c-fteixeira Nov 17, 2024
412dac7
[CI] Shorten build-docs run time (#565)
ZichuWu Nov 18, 2024
ceb15fb
[CI] Move import torch inside run_torch() (#570)
ZichuWu Nov 18, 2024
3a06436
python3.8 -> python3.9 (#558)
vadiklyutiy Nov 18, 2024
93c88d2
[Operators] Allow NT `matmul` layout for `bfloat16` data type (#562)
BolinSNLHM Nov 19, 2024
df79733
fix test_wgmma.py error for illegal warp address (#588)
kjiang170 Nov 19, 2024
9cc1847
[OPTIONS] Remove unnecessary parallel_k (#572)
ZichuWu Nov 20, 2024
ce8bc4f
[CI] add new github actions workflow to manually build and push to in…
xinli-centml Nov 20, 2024
1a84379
[Tests] Change float16 to bfloat16 for tests/apps (#589)
ZichuWu Nov 21, 2024
2ee866a
[Tests] Adjust test cases for tests/utils for bfloat16. (#597)
ZichuWu Nov 21, 2024
2df6889
[Tests] Added bfloat16 test cases for tests/cuda (#590)
ZichuWu Nov 22, 2024
90b3dfe
[BUG] Fix incorrect converting fxgraph to hidet's flow graph + expand…
vadiklyutiy Nov 22, 2024
38f798b
[Tests] Adjust test cases for tests/unit-tests for bfloat16. (#596)
ZichuWu Nov 23, 2024
4db5eb3
[CI] Exclude tests/unit_tests/test_dynamic_shape.py::test_attention[c…
vadiklyutiy Nov 24, 2024
f20838a
Kaihang/wgmma tf32 u8 i8 support (#549)
kjiang170 Nov 25, 2024
7cbc208
[Fix] Fixing a minor mistake encountered while adapting test cases fo…
BolinSNLHM Nov 25, 2024
8d6cb44
Use one global cuda workspace for all the CompiledGraph (#603)
maxyanghu Nov 25, 2024
49014ff
[Tests] Adjust test cases for tests/models for bfloat16. (#595)
ZichuWu Nov 26, 2024
6921a9b
[Tests] Adapt tests/ir for bfloat16 test cases (#593)
ZichuWu Nov 26, 2024
4db73ea
[Tests] Adapt tests/frontends to bfloat16 (#592)
ZichuWu Nov 26, 2024
3289ed7
[Tests] Adapt tests/lang for bfloat16 test cases (#594)
ZichuWu Nov 26, 2024
613327e
[CI] Turn off search space 2 for tests/lang (#617)
ZichuWu Nov 28, 2024
ded8809
[BUG] Fix bugs in shared map implementation (#608)
vadiklyutiy Nov 28, 2024
c69a298
[torchAPI] Inherit cuda stream from torch (#618)
vadiklyutiy Nov 29, 2024
0fa16fc
[DISTRIBUTED] Support `all_reduce` in `torch.compile` mode (#612)
vadiklyutiy Dec 2, 2024
48f0dfa
[Tests] Adapt tests/operators for bfloat16 (#615)
ZichuWu Dec 2, 2024
ec25a0b
[PERF] Support bf16 in one more place (#623)
vadiklyutiy Dec 3, 2024
c4e13e5
[CI]Fix small typoes for building and publishing to internal Hidet PY…
xinli-centml Dec 8, 2024
5c8cfc6
[BUG] Fix torch2.5 OoM issue (#609)
zhumakhan Dec 9, 2024
097ce2f
Revert "[BUG] Fix torch2.5 OoM issue" (#635)
zhumakhan Dec 9, 2024
9c7ca36
[BUG] Fix torch2.5 OoM and docs build fix (#637)
zhumakhan Dec 9, 2024
7b21b7b
[COMPTIME] Hot start speedup (#625)
vadiklyutiy Dec 10, 2024
a5654c9
[Bug] Parallel compilation sync (#616)
ZichuWu Dec 10, 2024
be933f3
Adapt to bfloat16 where necessary (#624)
ZichuWu Dec 10, 2024
bb7396f
[PERF] Default value for parallel_k is 'disabled' (#634)
vadiklyutiy Dec 11, 2024
62057b8
Hexcute base branch (All related PRs will be merged into this base PR…
xiaocenxiaocen Dec 11, 2024
4fcea21
[BUG] fix attach hash to signature (#638)
xiaocenxiaocen Dec 11, 2024
430709e
[IR] Add support for `swizzle`, `interleave` and `l2Promotion` in ten…
BolinSNLHM Dec 12, 2024
21a5905
[BUG] VLLM (and DMWL) compile with hidet backend (#647)
zhumakhan Dec 13, 2024
d2a94e0
matmul_f16 with wgmma (#627)
kjiang170 Dec 18, 2024
5591f07
[BUG] A number of fixes for vllm's TP (#651)
vadiklyutiy Dec 18, 2024
53ae60f
[BUG] Add comp server requirements (#661)
vadiklyutiy Dec 19, 2024
4180123
resolve conflicts
vadiklyutiy Jul 24, 2024
06955a9
[Dependency] Remove the version restriction of transformers and diffu…
yaoyaoding Nov 13, 2024
e38da75
lint
vadiklyutiy Dec 18, 2024
d2dd415
Temporary switch test workflow to old runners because new one are not…
vadiklyutiy Dec 19, 2024
[CI] Print stderr in run_tests.py (#443)
Make `run_tests.py` (Regression) print each subtest's `stderr`. Hidet's "Compiling..." messages are written there.

[Example of output in Regression logs](https://github.com/CentML/hidet/actions/runs/10625143157/job/29454729970)
vadiklyutiy committed Dec 19, 2024
commit 63eca0478419cc7b30a6507963c83253749c4f35
10 changes: 7 additions & 3 deletions tests/benchmarks/run_tests.py
@@ -1,4 +1,5 @@
import os
import sys
import json
import subprocess
import pathlib
@@ -11,13 +12,16 @@
def run_command(cmd):
    cmd = " ".join(cmd)
    print("Running command: " + cmd)
    sys.stdout.flush()
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, shell=True)

    for line in iter(process.stderr.readline, ''):
        sys.stderr.write(line)
        sys.stderr.flush()

    stdout, stderr = process.communicate()
    ret = process.returncode
    if ret:
        print('STDERR:')
        for line in stderr:
            print(line, end='')
        raise RuntimeError(f'Command {cmd} failed with return code {ret}.')
    return stdout
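
A possible follow-up, sketched here as an illustration and not part of this commit: once a loop has read `process.stderr` to EOF, `process.communicate()` has no `stderr` text left to return, and reading one pipe to completion while the other is still unread can stall if the child fills the unread pipe. One way to forward `stderr` live while still capturing `stdout` is to drain `stderr` on a helper thread. The names `run_command_streaming` and `forward_stderr` are hypothetical and do not exist in `run_tests.py`.

```python
import subprocess
import sys
import threading


def run_command_streaming(cmd):
    """Hypothetical variant of run_command: echo stderr live, return stdout."""
    cmd = " ".join(cmd)
    process = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True, shell=True
    )

    def forward_stderr():
        # Copy the child's stderr to our stderr line by line as it arrives.
        for line in process.stderr:
            sys.stderr.write(line)
            sys.stderr.flush()

    # Drain stderr on a separate thread so neither pipe can fill up and block.
    forwarder = threading.Thread(target=forward_stderr, daemon=True)
    forwarder.start()

    stdout = process.stdout.read()  # returns once the child closes its stdout
    forwarder.join()
    ret = process.wait()
    if ret:
        raise RuntimeError(f'Command {cmd} failed with return code {ret}.')
    return stdout
```

Called as `run_command_streaming(["echo", "hello"])`, this sketch returns the command's `stdout` as a string. The `stderr` text has already been written to the log by the time a failure is raised, so it is not printed a second time.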