
[SYNC] Sync CentML -> hidet-org #455

Merged · 98 commits · Jul 23, 2024
5ce601d
[FFI] Refactor CompiledFunction interface with ctypes (#79)
jacklee1792 Apr 3, 2024
396ad09
[App] Complete UNet Definition (#99)
KTong821 Apr 5, 2024
08eef12
[Graph] Stable Diffusion Rope Module (#95)
KTong821 Apr 5, 2024
860fc15
[CI] Add clang-format script/action (#120)
jacklee1792 Apr 5, 2024
68a3fe4
[Graph] Add major UNet building components (#97)
KTong821 Apr 5, 2024
9033c10
[Models] Support for tokenizers in C++ runtime (#69)
jacklee1792 Apr 6, 2024
0067ce6
[LLM App] LLM Application initial support (#121)
yaoyaoding Apr 8, 2024
b63caf9
Stable Diffusion App Infra (#103)
KTong821 Apr 11, 2024
e57a9bb
[Graph] Enhance forward debug instrument (#130)
jacklee1792 Apr 15, 2024
065a82d
[COMPTIME] Parallelize `apply_prologue_epilog`(fusion) and IR generat…
vadiklyutiy Apr 18, 2024
a3b7dab
[Ir][Primitives] fix __shfl_xor_sync (#155)
xiaocenxiaocen Apr 19, 2024
7d43780
[Operator] Register missing math primitives (#134)
jacklee1792 Apr 19, 2024
bbf0530
[Operator] Fix symbolic broadcasting (#131)
jacklee1792 Apr 19, 2024
aef8220
[COMPTIME] Specialize `Constant._binary()` for compilation speedup (#…
vadiklyutiy Apr 20, 2024
fd63e52
[App] Minor bugfixes for LLM app (#157)
jacklee1792 Apr 22, 2024
527653a
[IR] [Primitives] Add thread cluster on sm_90 (#145)
KTong821 Apr 22, 2024
4f80460
Gemma+torch.compile fixes(autocast, rtruediv) (#159)
vadiklyutiy Apr 22, 2024
096bfcb
[Operator] triu + tril operators (#146)
jacklee1792 Apr 22, 2024
e4a0386
[App] Fix LLM app tracing (#158)
jacklee1792 Apr 23, 2024
43ec055
[Fixbug] Set _is_exiting correctly (#163)
jacklee1792 Apr 24, 2024
76bf2f6
[App] Cleanup SD Implementation (#143)
KTong821 Apr 24, 2024
39cc879
Support Transpose2D (#77)
zhiwei-fang Apr 25, 2024
982b552
[Models] Gemma implementation (#132)
jacklee1792 Apr 26, 2024
b75e5d8
Revive dynamic shape support with `torch.compile` (#162)
vadiklyutiy Apr 26, 2024
742a6b6
[App] ResNet Compiled App (2/2) - Pipeline (#165)
KTong821 Apr 27, 2024
cafaeed
[OPS] Add `torch.Tensor.sin`, `torch.Tensor.cos` and `torch._C._nn.pa…
vadiklyutiy Apr 30, 2024
3131abe
[Ir][Primitives] add hopper instructions (#83)
xiaocenxiaocen May 3, 2024
b938f74
[App] SyncLLM + AsyncLLM interface (#166)
jacklee1792 May 6, 2024
bee8c2a
optimize grouping method (#174)
maxyanghu May 7, 2024
f1e5162
[COMPTIME] Add `chunksize` arg to `pool.imap` (#178)
vadiklyutiy May 8, 2024
dbc130a
[COMPTIME] Reduce the number of `fork` in `multithreading.Pool` (#180)
vadiklyutiy May 9, 2024
459cab4
[Bug] Fix number of groups under certain case (#181)
maxyanghu May 10, 2024
067b155
[Operator] Adding meshgrid operator support (#183)
BolinSNLHM May 12, 2024
c7d827a
[Fix] Remove YOLOv7 from tests/benchmarks/run_configs.json (#187)
BolinSNLHM May 12, 2024
70c15b2
__getitem__ with N dimensional index tensor (#185)
zhumakhan May 14, 2024
8c17b1c
feat: parallel job execution for tests (#147)
c-fteixeira May 14, 2024
ebd3d85
[OPTIONS] Don't create hidet config if it's not exist (#203)
vadiklyutiy May 21, 2024
ec93d68
[BUG] Clear `_job_queue` in `parallel_imap` for tests (#204)
vadiklyutiy May 21, 2024
7f11d07
[Ir] add utilities for CuTe (#107)
xiaocenxiaocen May 24, 2024
6adc3d1
[BUGFIX] Init cuda info before run forks for IR generation (#208)
vadiklyutiy May 24, 2024
0881032
[BENCHs] Refactor transformers tests. Add llama2, mistral, gemma, gpt…
vadiklyutiy May 27, 2024
f8bb7dc
Fix issues related to mistral model (#213)
zhumakhan May 28, 2024
156268b
steal_weight option fixes && fixes for mistral model (#209)
zhumakhan May 28, 2024
eda4e61
Xiaocenxiaocen/expose more ldst instructions (#216)
xiaocenxiaocen May 28, 2024
5c7bfe2
[Ir][CuTE] lower cute dialect (#109) (#230)
xiaocenxiaocen May 30, 2024
9d14cc1
[Operator] Registering `torch.Tensor.argmax` (#234)
BolinSNLHM May 31, 2024
4add4b9
[Operators] Registering `torch.as_tensor` (#235)
BolinSNLHM Jun 4, 2024
4d95978
Inherit `mode` argument from `torch.compile` and set corresponding op…
vadiklyutiy Jun 5, 2024
d91446b
Delete options `use_fp16` and `use_fp16_reduction` (#239)
vadiklyutiy Jun 5, 2024
99b12b3
[Operator] Adding support to operators `torch.Tensor.max` and `torch.…
BolinSNLHM Jun 5, 2024
66bf235
[Operator] Adding `torch.Tensor.expand_as` support (#250)
BolinSNLHM Jun 6, 2024
16eb4a2
[Operator] Adding support for `torch.Tensor.div` (#249)
BolinSNLHM Jun 6, 2024
4be2fd9
[Operator] Registering torch.sigmoid_ (#258)
BolinSNLHM Jun 7, 2024
836e3ac
[OPTIONS] Use Attention by default (#261)
vadiklyutiy Jun 10, 2024
d0877d5
[Operator] Registering `torch.Tensor.copy_` (#259)
BolinSNLHM Jun 10, 2024
8add3b7
[PERF] Increase accuracy of pick up the best candidate (#269)
vadiklyutiy Jun 15, 2024
a94c0c6
[Operator] Adding support to `repeat_interleave` and more (#270)
BolinSNLHM Jun 19, 2024
4706425
[Operator] Added advanced tensor indexing (#251)
zhumakhan Jun 19, 2024
58c351c
[Fix] Handling `getitem` special case (#281)
BolinSNLHM Jun 20, 2024
6ddca7f
[SCRIPTS] Adopt our scripts to use `mode` from `torch.compile` (#274)
vadiklyutiy Jun 20, 2024
33bb7a3
fix: handles race condition on parallel config directory creation (#285)
c-fteixeira Jun 20, 2024
a446ba8
adding support for torch.any (#277)
zhumakhan Jun 20, 2024
39b44ab
[Graph][Ops] fp32 accumulation for matmul_f16 (#268)
xiaocenxiaocen Jun 20, 2024
a87de77
[Operator] torch.any (#287)
zhumakhan Jun 21, 2024
ed0c5e6
[Fix] Handling `Tensor.to(..., device=....)` on symbolic tensors (#284)
BolinSNLHM Jun 24, 2024
6793a09
Removing constant tensors that are not needed after subgraph rewrite …
zhumakhan Jun 24, 2024
0e76d41
[Perf] support vectorized epilogue fusion (#220)
xiaocenxiaocen Jun 24, 2024
2e5b827
[Graph][Ops] fp32 accumulation for cute matmul (#292)
xiaocenxiaocen Jun 25, 2024
d3aa812
[BUG] when device is None, device_from_torch returns 'cpu' by default…
zhumakhan Jun 27, 2024
e1fb8b0
Setitem with tensor values. And Boolean type promotion (#290)
zhumakhan Jun 28, 2024
3a6b8f1
Increase batch size for bert to decrease fluctuations (#236)
vadiklyutiy Jun 28, 2024
bc868f6
[PERF] Reduce fixed overhead for model run (#310)
vadiklyutiy Jun 28, 2024
a6405ac
Handle dtype and device in hidet.ones_like op (#316)
zhumakhan Jul 2, 2024
3f413bf
[Operator] Extending the functionality support for `einsum` (#312)
BolinSNLHM Jul 2, 2024
61fe266
[Chore] replace copyrights with citations (#315)
xiaocenxiaocen Jul 2, 2024
aa2dccc
[Fix] Fix the bug in `tensor_expand` caused by attempting to modify `…
BolinSNLHM Jul 3, 2024
ff1bfc6
[Operators] Adding PyTorch operators encountered while compiling `DAL…
BolinSNLHM Jul 4, 2024
dd134d1
[Fix] Fixing a RuntimeError triggered by `tensor_reshape` function in…
BolinSNLHM Jul 8, 2024
afee434
[Fix] Handling hidet errors caused by device difference in `getitem` …
BolinSNLHM Jul 8, 2024
c7f3f61
[Fix] Fixing an error triggered by `ClampOp` (#329)
BolinSNLHM Jul 8, 2024
f839be6
[Operator] Adding `__ge__` method for the `Tensor` class (#330)
BolinSNLHM Jul 8, 2024
44e5162
[OPTIONS] Inherit `options` from `torch.compile()` (#260)
vadiklyutiy Jul 9, 2024
70d1bb2
[Operators] Adding support for `torch.nn.TransformerEncoder` (#327)
BolinSNLHM Jul 10, 2024
06ab1a0
[Operator] Adding support for `torch.Tensor.view_as` (#334)
BolinSNLHM Jul 10, 2024
4f738b7
[OPTIONS] Remove dynamo_config['search_space'] (#342)
vadiklyutiy Jul 11, 2024
7a1466a
[OPS] Dissallow in fxgraph not supported functions (#317)
vadiklyutiy Jul 11, 2024
ec3af9d
[BUG] Fixed search_space bug in `bench_op.py` (#348)
vadiklyutiy Jul 14, 2024
d2266d7
[Fix] Handling special cases in `setitem` regarding dtype and device …
BolinSNLHM Jul 15, 2024
7fbaa8d
[Fix] Fixing a bug in `register_methods` (#331)
BolinSNLHM Jul 15, 2024
97a0c98
[Fix] Added missing torch.multiply and torch.nn.functional.unfold ops…
zhumakhan Jul 16, 2024
32a2255
[Fix] type casting for attention mask from fp32 -> f16 (#323)
zhumakhan Jul 16, 2024
c6ce9fb
[CI] Promote nvidia docker container to version 24.4 (#354)
vadiklyutiy Jul 17, 2024
a5a5373
[PERF] Introduce add_hint_pass (#355)
vadiklyutiy Jul 17, 2024
3ab84b4
[Operators] Registering tensor methods whose PyTorch function equival…
BolinSNLHM Jul 17, 2024
eb67b6a
[PERF] Remote workaround for loops in `add_hints_pass` (#356)
vadiklyutiy Jul 18, 2024
7db8629
[Fix] Fixing an error triggered while compiling the `torch.nn.Upsampl…
BolinSNLHM Jul 18, 2024
b9551c4
[Operators] Adding `leaky_relu` support (#360)
BolinSNLHM Jul 19, 2024
f5f528f
[CI] Repeat start_instance (#361)
vadiklyutiy Jul 20, 2024
4 changes: 2 additions & 2 deletions .github/Dockerfile
@@ -1,4 +1,4 @@
-FROM nvcr.io/nvidia/pytorch:23.10-py3
+FROM nvcr.io/nvidia/pytorch:24.04-py3
ADD ./hidet /workspace/hidet
ADD ./models /workspace/models
WORKDIR /workspace
@@ -9,4 +9,4 @@ RUN pip install -r hidet/requirements.txt && \
     WHEEL=$(find hidet/scripts/wheel/built_wheel -maxdepth 1 -name '*.whl') && \
     pip install --no-deps --force-reinstall $WHEEL && \
     pip install -e models && \
-    hidet cache clear --all
\ No newline at end of file
+    hidet cache clear --all
27 changes: 27 additions & 0 deletions .github/actions/setup-hidet/action.yaml
@@ -0,0 +1,27 @@
name: 'Setup Hidet'
description: 'Install dependencies, build and install wheel'
runs:
  using: "composite"
  steps:
    - name: Install dependencies via pip
      shell: bash
      run: |
        python -m pip install --upgrade pip
        pip install torch torchvision torchaudio
        pip install -r requirements.txt
        pip install -r requirements-dev.txt

    - name: Build hidet
      shell: bash
      run: |
        bash scripts/wheel/build_wheel.sh
        WHEEL=$(find ./scripts/wheel/built_wheel -maxdepth 1 -name '*.whl')
        echo $WHEEL
        echo "WHEEL_NAME=$WHEEL" >> $GITHUB_ENV

    - name: Install hidet
      shell: bash
      env:
        WHEEL_NAME: ${{ env.WHEEL_NAME }}
      run: |
        pip install --no-deps --force-reinstall $WHEEL_NAME
38 changes: 38 additions & 0 deletions .github/scripts/set_test_matrix.py
@@ -0,0 +1,38 @@
"""
Sets the strategy matrix for the functional ci tests.
This mimics the discovery strategy used by pytest for files inside the tests/ folder
and shards them based on the top level parent folders.

Expects to be executed in a GHA environment, with GITHUB_OUTPUT context available.
"""
import glob
import json
import os
from pathlib import Path

patterns = ('test_*.py', '*_test.py')  # the tuple of file types
files_matched = []
for pattern in patterns:
    files_matched.extend(glob.glob(f"tests/**/{pattern}", recursive=True))

testing_paths = []
for path in files_matched:
    current_path = Path(path)
    testing_paths.append("/".join(current_path.parts[:2]))

include = []

for path in list(set(testing_paths)):
    include.append({
        "path": path
    })

matrix = {
    "include": include
}

matrix_str = json.dumps(matrix)
name = 'matrix'
value = matrix_str
with open(os.environ['GITHUB_OUTPUT'], 'a') as fh:
    print(f'{name}={value}', file=fh)
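The sharding logic of this script can be sanity-checked outside of GitHub Actions. A minimal sketch — the sample file paths below are hypothetical, not from the repository:

```python
import json
from pathlib import Path

def build_matrix(files):
    # Group matched test files by their top-level folder under tests/,
    # mirroring the script above; each shard becomes one matrix entry.
    shards = sorted({"/".join(Path(f).parts[:2]) for f in files})
    return {"include": [{"path": p} for p in shards]}

# hypothetical discovery results
files = [
    "tests/operators/test_matmul.py",
    "tests/operators/reduce_test.py",
    "tests/frontends/torch/test_register.py",
]
print(json.dumps(build_matrix(files)))
# {"include": [{"path": "tests/frontends"}, {"path": "tests/operators"}]}
```

The real script appends this JSON to `GITHUB_OUTPUT`, where the `run-test` job picks it up via `fromJSON`.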
22 changes: 14 additions & 8 deletions .github/scripts/start_instances.py
@@ -18,6 +18,7 @@ def run_command(cmd):
 
 # e.g., ' 1, 2, ,3,,' -> ['1', '2', '3']
 hw_config_ids = os.environ.get('HW_CONFIG').replace(' ', '')
+hw_config_ids = '2'
 repo_org = os.environ.get('REPO_NAME').split('/')[0]
 if hw_config_ids == 'all':
     query = (
@@ -96,14 +97,19 @@ def run_command(cmd):
 
 # Start all instances
 for instance in instances:
-    cloud_provider_id, instance_id, _ = instance
-    if cloud_provider_id == 1:  # AWS
-        cmd = ['aws', 'ec2', 'start-instances', '--instance-ids', instance_id]
-    elif cloud_provider_id == 2:  # Always on, no need to launch. Do Nothing.
-        cmd = ['true']
-    else:
-        raise ValueError(f'Unknown cloud provider id: {cloud_provider_id}')
-    output = run_command(cmd)
+    for i in range(300):
+        cloud_provider_id, instance_id, _ = instance
+        if cloud_provider_id == 1:  # AWS
+            cmd = ['aws', 'ec2', 'start-instances', '--instance-ids', instance_id]
+        elif cloud_provider_id == 2:  # Always on, no need to launch. Do Nothing.
+            cmd = ['true']
+        else:
+            raise ValueError(f'Unknown cloud provider id: {cloud_provider_id}')
+        output = run_command(cmd)
+        if output.returncode == 0:
+            break
+        time.sleep(60)
+
+    if output.returncode != 0:
+        raise RuntimeError(f'Failed to start instance {instance_id} on cloud provider {cloud_provider_id}.')

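The retry loop added in this hunk is an instance of the common poll-until-success pattern. A generic sketch of it — the helper name and parameters are illustrative, not from the PR:

```python
import time

def start_with_retry(start_fn, attempts=300, delay_s=60, sleep=time.sleep):
    # Call start_fn until it reports success (returncode 0), sleeping
    # between attempts; raise if every attempt fails.
    for _ in range(attempts):
        returncode = start_fn()
        if returncode == 0:
            return returncode
        sleep(delay_s)
    raise RuntimeError('Failed to start instance after retries.')

# usage: succeed on the third attempt, with sleeping stubbed out for the demo
codes = iter([1, 1, 0])
print(start_with_retry(lambda: next(codes), sleep=lambda s: None))  # 0
```

Injecting the sleep function keeps the helper testable; the workflow-level `timeout` (see regression.yaml below in spirit) still bounds total wait time.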
7 changes: 7 additions & 0 deletions .github/workflows/lint.yaml
@@ -27,6 +27,8 @@ jobs:
           pip install torch torchvision torchaudio
           pip install -r requirements.txt
           pip install -r requirements-dev.txt
+          sudo apt-get update
+          sudo apt-get install clang-format
       - name: Format with black
         run: |
           # stop the build if format is not correct
@@ -36,3 +38,8 @@
         run: |
           echo "Running with" $(pip freeze | grep "pylint")
           python -m pylint --rcfile ./scripts/lint/pylintrc -j $(nproc) ./python/hidet
+      - name: Format with clang-format
+        run: |
+          echo "Running with" $(clang-format --version)
+          find ./src ./include -iname '*.h' -o -iname '*.cpp' \
+            | xargs clang-format -style=file:scripts/lint/.clang-format --dry-run -Werror
2 changes: 1 addition & 1 deletion .github/workflows/regression.yaml
@@ -34,7 +34,7 @@ jobs:
 
       - name: Run main Python script
        id: run_py_script
-        run: timeout 900 python ./.github/scripts/start_instances.py
+        run: timeout 36000 python ./.github/scripts/start_instances.py
        env:
          # TODO: Allow launching only specified GPU instances
          HW_CONFIG: all
110 changes: 80 additions & 30 deletions .github/workflows/tests.yaml
@@ -7,13 +7,15 @@ on:
   pull_request:
   workflow_call:
 
+concurrency:
+  group: ${{ github.workflow }}-${{ github.head_ref || github.sha }}
+  cancel-in-progress: true
+
 jobs:
-  tests:
-
+  build-docs:
     if: github.repository == 'CentML/hidet' || github.repository == 'hidet-org/hidet'
-    concurrency:
-      group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
-      cancel-in-progress: true
-    runs-on: [self-hosted, Linux, X64, gpu]
+    runs-on: arc-runner-set
     container:
       image: nvidia/cuda:11.8.0-devel-ubuntu20.04
       options: --gpus all
@@ -22,10 +24,10 @@
       run: |
         apt update && DEBIAN_FRONTEND=noninteractive apt install -y ccache git graphviz
 
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4
 
     - name: Set up Python
-      uses: actions/setup-python@v4
+      uses: actions/setup-python@v5
       with:
         python-version: "3.8"
 
@@ -34,41 +36,89 @@
       with:
         cmake-version: '3.19.x'
 
-    - name: Install dependencies via pip
+    - name: Setup Hidet
+      uses: ./.github/actions/setup-hidet
+
+    - name: List installed packages
       run: |
-        python -m pip install --upgrade pip
-        pip install torch torchvision torchaudio
-        pip install -r requirements.txt
-        pip install -r requirements-dev.txt
+        pip list
 
-    - name: Build hidet
+    - name: Install docs dependencies
       run: |
-        bash scripts/wheel/build_wheel.sh
-        WHEEL=$(find ./scripts/wheel/built_wheel -maxdepth 1 -name '*.whl')
-        echo "WHEEL_NAME=$WHEEL" >> $GITHUB_ENV
-        echo "Built wheel: ${{ env.WHEEL_NAME }}"
+        pip install -r docs/requirements.txt
+
+    - name: Build docs
+      run: |
+        cd docs; make clean; make html
+
+  list-test-dirs:
+    if: github.repository == 'CentML/hidet' || github.repository == 'hidet-org/hidet'
+    runs-on: ubuntu-latest
+    outputs:
+      matrix: ${{ steps.set-matrix.outputs.matrix }}
+    steps:
+
+      - name: Checkout Hidet
+        uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.8"
 
-    - name: Install hidet
-      run: |
-        pip install --no-deps --force-reinstall ${{ env.WHEEL_NAME }}
+      - id: set-matrix
+        run: |
+          python .github/scripts/set_test_matrix.py
+
+  run-test:
+    needs: list-test-dirs
+    strategy:
+      fail-fast: false
+      matrix: ${{ fromJSON(needs.list-test-dirs.outputs.matrix) }}
+    runs-on: arc-runner-set
+    container:
+      image: nvidia/cuda:11.8.0-devel-ubuntu20.04
+      options: --gpus all
+    steps:
+      - name: Install dependencies via apt
+        run: |
+          apt update && DEBIAN_FRONTEND=noninteractive apt install -y ccache git graphviz
+
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.8"
+
+      - name: Setup cmake
+        uses: jwlawson/actions-setup-cmake@v2
+        with:
+          cmake-version: '3.19.x'
+
+      - name: Setup Hidet
+        uses: ./.github/actions/setup-hidet
+
+      - name: List installed packages
+        run: |
+          pip list
 
-    # Run tests
-
-    - name: Run tests
-      run: |
-        rm -rf ~/.config/hidet
-        python -m pytest -v --durations=20 --clear-cache ./tests
-
-    # Build the docs
-
-    - name: Install docs dependencies
-      run: |
-        pip install -r docs/requirements.txt
-
-    - name: Build docs
-      run: |
-        cd docs; make clean; make html
+      # Run tests
+
+      - name: Run tests
+        run: |
+          rm -rf ~/.config/hidet
+          python -m pytest -v --durations=20 --clear-cache ${{ matrix.path }}
+
+  final-status-indicator:
+    if: ${{ always() }}
+    runs-on: ubuntu-latest
+    name: Pass All Functional Tests
+    needs: [run-test]
+    steps:
+      - run: exit 1
+        if: >-
+          ${{
+            contains(needs.*.result, 'failure')
+            || contains(needs.*.result, 'cancelled')
+            || contains(needs.*.result, 'skipped')
+          }}
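The `final-status-indicator` gate can be modeled in a few lines. This sketch only mirrors the `contains(needs.*.result, ...)` expression in the workflow; the function name and result strings follow GitHub's job-result vocabulary:

```python
def final_status(shard_results):
    # The aggregate job fails if any sharded test job failed, was
    # cancelled, or was skipped; otherwise it passes.
    bad = {"failure", "cancelled", "skipped"}
    return "failure" if any(r in bad for r in shard_results) else "success"

print(final_status(["success", "success"]))  # success
print(final_status(["success", "skipped"]))  # failure
```

Treating `skipped` as a failure matters here because a required status check would otherwise pass vacuously when the matrix jobs never ran.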
8 changes: 8 additions & 0 deletions CMakeLists.txt
@@ -25,6 +25,14 @@ add_library(hidet_runtime SHARED
         src/hidet/runtime/callbacks.cpp
         src/hidet/runtime/logging.cpp
         src/hidet/runtime/symbols.cpp
+        src/hidet/runtime/llm/tokenizer/decoders.cpp
+        src/hidet/runtime/llm/tokenizer/models.cpp
+        src/hidet/runtime/llm/tokenizer/normalizers.cpp
+        src/hidet/runtime/llm/tokenizer/pattern.cpp
+        src/hidet/runtime/llm/tokenizer/postprocessors.cpp
+        src/hidet/runtime/llm/tokenizer/pretokenizers.cpp
+        src/hidet/runtime/llm/tokenizer/tokenizer.cpp
+        src/hidet/runtime/llm/tokenizer/utf8.cpp
 )
 target_include_directories(hidet_runtime PRIVATE ${CMAKE_SOURCE_DIR}/include /usr/include)
 set_target_properties(hidet_runtime PROPERTIES LIBRARY_OUTPUT_DIRECTORY ${CMAKE_BINARY_DIR}/lib)
1 change: 0 additions & 1 deletion README.md
@@ -39,7 +39,6 @@ x = torch.rand(1, 3, 224, 224).cuda()
 
 # Optional: set optimization options (see our documentation for more details)
 # import hidet
 # hidet.torch.dynamo_config.search_space(2)  # tune each tunable operator
-# hidet.torch.dynamo_config.use_fp16()  # use float16 for acceleration
 model_opt = torch.compile(model, backend='hidet')
 
 # Run the optimized model
2 changes: 1 addition & 1 deletion apps/compile_server/README.md
@@ -30,5 +30,5 @@ hidet.option.compile_server.password('password')
 hidet.option.compile_server.repo('https://github.com/hidet-org/hidet', 'main')
 
 # enable the compile server
-hidet.option.compile_server.enable()
\ No newline at end of file
+hidet.option.compile_server.enable()
```
6 changes: 3 additions & 3 deletions gallery/developer-guides/add-torch-operator-mapping.py
@@ -53,7 +53,7 @@
 from torch import nn
 
 # hidet employs an interpreter to convert a fx.Graph to FlowGraph
-from hidet.graph.frontend.torch.interpreter import Registry
+from hidet.graph.frontend.torch.registry import Registry
 
 # the following three modules register the conversion rules
 import hidet.graph.frontend.torch.register_functions
@@ -91,7 +91,7 @@ def forward(self, x):
 
 def run_model():
     model = Model().cuda()
-    model_opt = torch.compile(model, backend='hidet')
+    model_opt = torch.compile(model, backend='hidet', mode='max-autotune')
 
     x = torch.randn(10, 10, device='cuda')
     y1 = model_opt(x)
@@ -112,7 +112,7 @@ def run_model():
 from typing import Optional
 from hidet import ops
 from hidet import Tensor
-from hidet.graph.frontend.torch.interpreter import (
+from hidet.graph.frontend.torch.registry import (
     register_function,
     register_module,
     register_method,
5 changes: 1 addition & 4 deletions gallery/getting-started/quick-start.py
@@ -37,11 +37,8 @@
 model = torch.hub.load('pytorch/vision:v0.9.0', 'resnet18', pretrained=True, verbose=False)
 model = model.cuda().eval()
 
-# uncomment the following line to enable kernel tuning
-# hidet.torch.dynamo_config.search_space(2)
-
 # optimize the model with 'hidet' backend
-model_opt = torch.compile(model, backend='hidet')
+model_opt = torch.compile(model, backend='hidet', mode='max-autotune')
 
 # run the optimized model
 y1 = model_opt(x)
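Several commits in this sync route the `torch.compile(mode=...)` argument through to hidet's own options, which is why the galleries now pass `mode='max-autotune'` instead of setting `dynamo_config.search_space(2)` by hand. As a rough illustration only — the actual mapping lives inside hidet's dynamo backend and may differ — a mode string could translate to a tuning search space like this:

```python
def options_for_mode(mode):
    # Hypothetical mode -> option mapping; the values here are
    # assumptions for illustration, not hidet's actual table.
    table = {
        "default": {"search_space": 0},        # no kernel tuning
        "reduce-overhead": {"search_space": 1},
        "max-autotune": {"search_space": 2},   # tune each tunable operator
    }
    if mode not in table:
        raise ValueError(f"unsupported mode: {mode}")
    return table[mode]

print(options_for_mode("max-autotune"))  # {'search_space': 2}
```

Under such a scheme, the user-facing knob stays on the `torch.compile` call and the backend decides how aggressively to tune.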