[MetaSchedule] Improve the script for TorchBench model tuning & benchmarking #13255

Merged
2 commits merged into apache:main on Nov 3, 2022

Conversation

@yelite (Contributor) commented on Nov 1, 2022

This PR adds features to `python/tvm/meta_schedule/testing/torchbench/run.py`.

  • Integrate with the TVM PyTorch integration to handle boolean tensors and unaligned memory.
  • Deduplicate collected tuning tasks, preventing thousands of tasks from being created by hundreds of structurally similar subgraphs (sketched below).
  • Add an option to cast the model to float32, which is more numerically stable than float16 and avoids inaccurate results from many models (sketched below).
  • Add an option to choose the search strategy in MetaSchedule.
  • Inspect the output error when the actual output doesn't match the expected one, and save both the actual and expected outputs for further analysis if needed (sketched below).
  • Save subgraphs and their example inputs for debugging purposes.
  • Print MetaSchedule profiling information at the end of execution.
  • Detach PyTorch tensors before exporting to DLPack (sketched below).
  • Fix the sys path to avoid conflicts with the `benchmarks` package installed by a TorchBench dependency.
  • Trim all command-line args passed in, to avoid breaking TorchBench models that depend on them (sketched below).
  • Empty the CUDA cache before starting the actual benchmark (sketched below).
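
A minimal sketch of the deduplication idea, assuming subgraphs are keyed by the structural hash of their `IRModule` (the data structures in the actual script may differ):

```python
from tvm.ir import structural_hash

def deduplicate_modules(mods):
    """Group structurally identical subgraph modules.

    Returns (module, weight) pairs, where the weight counts how many
    subgraphs collapsed into that representative. Hash collisions are
    ignored here for simplicity.
    """
    buckets = {}
    for mod in mods:
        key = structural_hash(mod)
        if key not in buckets:
            buckets[key] = [mod, 0]
        buckets[key][1] += 1
    return [(mod, weight) for mod, weight in buckets.values()]
```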
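
The float32 option amounts to casting the model and its floating-point example inputs; a rough sketch (the helper name is illustrative, not the script's actual API):

```python
import torch

def cast_to_float32(model: torch.nn.Module, example_inputs):
    # Cast parameters and buffers, plus any floating-point inputs, to float32.
    model = model.float()
    casted_inputs = [
        x.float() if torch.is_tensor(x) and x.is_floating_point() else x
        for x in example_inputs
    ]
    return model, casted_inputs
```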
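
Output checking compares against the expected result and dumps both tensors when they diverge; a sketch with an illustrative dump location and tolerances:

```python
import os
import torch

def check_output(actual, expected, dump_dir="output-dump"):
    # Report the error magnitude and keep both tensors for offline analysis.
    actual_f = actual.float()
    expected_f = expected.float()
    if torch.allclose(actual_f, expected_f, rtol=1e-3, atol=1e-3):
        return True
    abs_err = (actual_f - expected_f).abs()
    print(f"max abs error: {abs_err.max().item():.4g}, "
          f"mean abs error: {abs_err.mean().item():.4g}")
    os.makedirs(dump_dir, exist_ok=True)
    torch.save({"actual": actual, "expected": expected},
               os.path.join(dump_dir, "mismatch.pt"))
    return False
```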
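
Detaching before the DLPack export matters because `to_dlpack` refuses tensors that are still part of the autograd graph; a minimal sketch:

```python
import torch
from torch.utils.dlpack import to_dlpack

def export_to_dlpack(tensor: torch.Tensor):
    # Detach from the autograd graph first; this shares storage rather
    # than copying, so the export stays cheap.
    return to_dlpack(tensor.detach())
```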
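
The last two items are small pieces of run-time housekeeping, roughly:

```python
import sys
import torch

# Some TorchBench models parse sys.argv themselves, so drop everything but
# the program name once the script's own flags have been consumed.
sys.argv = sys.argv[:1]

# Release cached allocations so the benchmark starts from a clean CUDA
# memory state.
if torch.cuda.is_available():
    torch.cuda.empty_cache()
```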

cc: @junrushao @zxybazh

@tvm-bot (Collaborator) commented on Nov 1, 2022

Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment.

Generated by tvm-bot

@junrushao (Member) left a comment

LGTM!

@junrushao merged commit b98b9f9 into apache:main on Nov 3, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request on Nov 10, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request on Nov 25, 2022