This repository has been archived by the owner on Nov 25, 2022. It is now read-only.
[MetaSchedule] Improve the script for TorchBench model tuning & benchmarking (apache#13255)

This PR adds features to `python/tvm/meta_schedule/testing/torchbench/run.py`:

- Integrate with the TVM PyTorch integration to handle boolean tensors and unaligned memory.
- Deduplicate collected tuning tasks to prevent thousands of tasks being created from hundreds of structurally similar subgraphs.
- Add an option to cast the model to float32, which is more numerically stable than float16 and avoids inaccurate results from many models.
- Add an option to choose the search strategy in MetaSchedule.
- Inspect the output error when the actual output doesn't match the expectation, and save both the actual and expected outputs for further analysis if needed.
- Save subgraphs and their example inputs for debugging purposes.
- Print MetaSchedule profiling information at the end of execution.
- Detach PyTorch tensors before exporting them to DLPack.
- Fix the sys path to avoid conflicts with the `benchmarks` package installed by TorchBench dependencies.
- Trim all command-line args passed in, to avoid breaking TorchBench models that parse args themselves.
- Empty the CUDA cache before starting the actual benchmark.
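The task-deduplication point above can be sketched in plain Python: keep only the first task per structural fingerprint, so hundreds of structurally identical subgraphs yield one tuning task each. The `deduplicate_tasks` helper and the string fingerprint below are illustrative assumptions, not the actual TVM MetaSchedule API.

```python
# Sketch of deduplicating extracted tuning tasks by a structural
# fingerprint. Helper and fingerprint are hypothetical stand-ins.
def deduplicate_tasks(tasks, fingerprint):
    seen = {}
    for task in tasks:
        key = fingerprint(task)
        # setdefault keeps only the first task for each fingerprint.
        seen.setdefault(key, task)
    return list(seen.values())

# Toy usage: tasks with the same structure collapse into one.
tasks = ["matmul_64", "matmul_64", "conv2d_3x3", "matmul_64"]
unique = deduplicate_tasks(tasks, fingerprint=lambda t: t)
# unique == ["matmul_64", "conv2d_3x3"]
```

In the real script the fingerprint would be derived from the subgraph's structure rather than a plain string, but the first-wins dictionary pattern is the same.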
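The two environment fixes (the sys path conflict and the trimmed command-line args) amount to small `sys` manipulations; a minimal sketch, assuming the conflict comes from the current directory shadowing the installed `benchmarks` package:

```python
import sys

# Drop the empty/current-directory entries from sys.path so a local
# `benchmarks/` folder does not shadow the `benchmarks` package that
# TorchBench dependencies install. (Which entries conflict is an
# assumption here; the actual script may prune differently.)
sys.path = [p for p in sys.path if p not in ("", ".")]

# Trim all command-line args so models that parse sys.argv themselves
# are not broken by this script's own flags.
sys.argv = sys.argv[:1]
```

Trimming `sys.argv` is safe here because the script has already consumed its own options before any model code runs.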
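The output-inspection step can be sketched as: compare actual against expected element-wise under a tolerance, and if anything diverges, save both sides for offline analysis. The function name, tolerance, and file names below are illustrative assumptions, not the script's actual code.

```python
import os
import pickle
import tempfile

def check_output(actual, expected, atol=1e-4, dump_dir=None):
    # Collect (index, actual, expected) wherever the result strays
    # beyond the tolerance.
    mismatched = [
        (i, a, e)
        for i, (a, e) in enumerate(zip(actual, expected))
        if abs(a - e) > atol
    ]
    if mismatched and dump_dir is not None:
        # Save both sides so the mismatch can be analyzed later.
        with open(os.path.join(dump_dir, "actual.pkl"), "wb") as f:
            pickle.dump(actual, f)
        with open(os.path.join(dump_dir, "expected.pkl"), "wb") as f:
            pickle.dump(expected, f)
    return mismatched

# Toy usage: the third element differs by 0.5, so it is reported.
errors = check_output([1.0, 2.0, 3.5], [1.0, 2.0, 3.0],
                      dump_dir=tempfile.mkdtemp())
# errors == [(2, 3.5, 3.0)]
```

The real script compares model tensors rather than scalars, but the shape of the check, and dumping both sides on failure, matches the behavior described above.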