
[Ansor][AutoTVM v2.0] Phase 0: Ansor minimum system for auto schedule generating #5962

Merged 80 commits on Jul 15, 2020
7ee0902
Code migration Start (#1)
jcf94 May 26, 2020
9fcbf0b
Split transform_step out & Update more UTs (#3)
jcf94 May 27, 2020
f43e82f
Add search_task, measure and serialization (#4)
jcf94 May 28, 2020
e0a5ed5
Add MetaTileRewritePolicy (#5)
jcf94 May 29, 2020
359905a
Basic Python API for State (#6)
jcf94 Jun 3, 2020
2032a64
Add Python API: Measure & Task (#7)
jcf94 Jun 4, 2020
6b21dc6
Add ansor.auto_schedule() API; First AutoSchedule working version(#8)
jcf94 Jun 4, 2020
e52135f
Bug fix & Add python serialization API (#10)
jcf94 Jun 5, 2020
1fe6638
Improve code style, python wrapper and test cases (#11)
merrymercy Jun 7, 2020
43d1530
fix unit tests
merrymercy Jun 8, 2020
f367d15
Add RPCRunner & OpenCL/CUDA test (#12)
jcf94 Jun 8, 2020
2bd6471
rebase to upstream/master
merrymercy Jun 8, 2020
c860f2c
Add Ansor basic tutorial (#13)
jcf94 Jun 8, 2020
f60d1a6
migrate feature extraction (#14)
merrymercy Jun 8, 2020
b839c0f
Add XGBModel & RPCRunnerWarpper (#15)
jcf94 Jun 9, 2020
cfe58d7
Migrate workload_registry.py (#16)
merrymercy Jun 9, 2020
143ea45
add task scheduler (#17)
merrymercy Jun 9, 2020
ed075c2
Add conv2d cuda tutorial with workload registry (#18)
jcf94 Jun 9, 2020
74ec7d0
add tune_test.py (the old tune_wkl.py) (#19)
merrymercy Jun 9, 2020
cd0a516
Code refine for tune_test.py & Add a pre load callback (#20)
jcf94 Jun 10, 2020
3a24e49
Add python custom sketch rule (#21)
jcf94 Jun 11, 2020
a155c1f
Ansor Relay Integration (without layout rewrite) (#22)
minminsun Jun 12, 2020
674027f
Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)
jcf94 Jun 12, 2020
2f241ed
add explicit_unroll_max_extent (#25)
merrymercy Jun 12, 2020
18d44b8
Add Index simplification & API update (#26)
jcf94 Jun 15, 2020
4ea6712
Update PreLoadMeasuredStates & Some bug fix (#27)
jcf94 Jun 16, 2020
6126cdb
Add tensorize step for loop_state (#31)
jcf94 Jun 19, 2020
c7364df
State python api update (#33)
jcf94 Jun 19, 2020
36cd9ef
kernel layout rewrite (#28)
minminsun Jun 19, 2020
145e61c
[cache flush] port cache flush to ansor (#32)
FrozenGene Jun 19, 2020
2c27816
Improve relay integration (#34)
merrymercy Jun 20, 2020
0794875
Fix xgb error & Simplify dispatcher (#35)
merrymercy Jun 20, 2020
a4c4548
Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)
merrymercy Jun 20, 2020
593a2c7
rebase
merrymercy Jun 20, 2020
53bd591
Migrate all node::make to noderef's construct function (#37)
jcf94 Jun 22, 2020
8e53d12
Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)
jcf94 Jun 23, 2020
cd5c5ad
Add MutateComputeLocation and MutateParallel in evolutionary search (…
merrymercy Jun 23, 2020
5860191
Improve loop state python API (stage_tensors -> stage_ops) (#41)
merrymercy Jun 23, 2020
14a19cd
ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)
jcf94 Jun 24, 2020
b012e27
Rever Commits, Start to build minimum Ansor system
jcf94 Jun 24, 2020
d6d6b85
Code clean for minimum Ansor system
jcf94 Jun 24, 2020
4042cfa
Bug fix & Delete AccessAnalyzer
jcf94 Jun 28, 2020
7695def
Delete attachmap & Code clean
jcf94 Jun 28, 2020
0c200cd
Doc update
jcf94 Jun 28, 2020
9c35e50
Headfile update & Python doc update
jcf94 Jun 28, 2020
a015051
clang-format fix
jcf94 Jun 29, 2020
6823802
pylint fix
jcf94 Jun 29, 2020
a82dbb8
Update
jcf94 Jun 29, 2020
ac36c46
Doc update
jcf94 Jun 29, 2020
a62b1e0
Update
jcf94 Jun 30, 2020
3eac89d
Merge branch 'upstream_master' into upstream_0_new
jcf94 Jun 30, 2020
526cf42
Bug fix after code merge to the new master
jcf94 Jun 30, 2020
426ec82
clang-format fix
jcf94 Jun 30, 2020
907c17c
Update
jcf94 Jul 1, 2020
64f8f8d
Update
jcf94 Jul 1, 2020
1b16dd4
Update std::vector to Array; Update verbosity setting; Some commemts
jcf94 Jul 1, 2020
9fa897b
std::vector->Array & std::string->String
jcf94 Jul 2, 2020
f40c7af
Add init_state to ComputeDAG
jcf94 Jul 2, 2020
0a24daf
Update
jcf94 Jul 2, 2020
a45fd89
Update some unordered_map to Map
jcf94 Jul 2, 2020
bfc6663
clang-format fix
jcf94 Jul 2, 2020
eb02e77
Comments addressed
jcf94 Jul 3, 2020
cb2442f
Lint fix
jcf94 Jul 3, 2020
b1ca20c
Update
jcf94 Jul 3, 2020
49dbec6
Merge branch 'upstream_master' into upstream_0_new
jcf94 Jul 3, 2020
8add768
Update
jcf94 Jul 3, 2020
78e5313
Update
jcf94 Jul 4, 2020
546abbe
Update
jcf94 Jul 4, 2020
d418a57
Update
jcf94 Jul 5, 2020
8e1d65d
Update
jcf94 Jul 5, 2020
3a67a72
Update
jcf94 Jul 9, 2020
28a7b8f
Update
jcf94 Jul 9, 2020
1360b1b
Update
jcf94 Jul 9, 2020
52afe74
Rename ansor namespace to auto_schedule
jcf94 Jul 11, 2020
6a61fb6
Update
jcf94 Jul 11, 2020
3a4e5da
Rename ThreadPool to ParallelFor
jcf94 Jul 14, 2020
dbe019b
Add parallel_for
jcf94 Jul 14, 2020
1f1b878
Remove ThreadPool
jcf94 Jul 14, 2020
02fede9
Update python/tvm/auto_schedule/auto_schedule.py
merrymercy Jul 14, 2020
eea0989
trigger CI
merrymercy Jul 14, 2020
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -185,6 +185,7 @@ assign_source_group("Include" ${GROUP_INCLUDE})

# Source file lists
file(GLOB_RECURSE COMPILER_SRCS
src/ansor/*.cc
src/node/*.cc
src/ir/*.cc
src/arith/*.cc
34 changes: 34 additions & 0 deletions python/tvm/ansor/__init__.py
@@ -0,0 +1,34 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=unused-import, redefined-builtin
"""Namespace for Ansor auto-scheduler"""

from . import compute_dag
from . import measure
from . import serialization
from . import loop_state
from . import utils
from . import workload_registry

# Shortcut
from .compute_dag import ComputeDAG
from .auto_schedule import SearchTask, TuneOption, HardwareParams, \
auto_schedule, EmptyPolicy
from .measure import MeasureInput, LocalBuilder, LocalRunner
from .serialization import LogToFile, LogReader, best_measure_pair_in_file, \
load_from_file, append_measure_records_to_file
from .workload_registry import register_workload_by_func, make_workload_key_by_func
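The `__init__.py` above re-exports `register_workload_by_func` and `make_workload_key_by_func` from `workload_registry.py`, whose diff is not shown in this chunk. As a rough, dependency-free sketch of the registry pattern those names suggest (this is an illustration only, not the actual TVM implementation), a workload registry can map a function's name to the function, with the workload key serializing the name and arguments:

```python
import json

# Hypothetical sketch of a workload registry; the real ansor
# implementation differs in details.
WORKLOAD_REGISTRY = {}

def register_workload_by_func(func):
    """Register a compute-declaration function under its name."""
    WORKLOAD_REGISTRY[func.__name__] = func
    return func

def make_workload_key_by_func(func, args):
    """Serialize (function name, args) into a unique workload key."""
    return json.dumps((func.__name__,) + tuple(args))

@register_workload_by_func
def matmul(N, M, K):
    # In real ansor this would build te.Tensors; here it is a stub.
    return ("matmul-placeholder", N, M, K)

key = make_workload_key_by_func(matmul, (128, 128, 128))
```

A key built this way can later be looked up in the registry to re-instantiate the compute declaration, which is what `ComputeDAG(workload_key)` relies on in the `auto_schedule` flow below.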
22 changes: 22 additions & 0 deletions python/tvm/ansor/_ffi_api.py
@@ -0,0 +1,22 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

"""Register FFI APIs from C++ for the namespace tvm.ansor"""
import tvm._ffi


tvm._ffi._init_api("ansor", __name__)
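`tvm._ffi._init_api("ansor", __name__)` copies every globally registered C++ function whose name starts with the `ansor.` prefix into this Python module. A small mock of that mechanism (assumed names and a plain dict standing in for TVM's global function table, not the real FFI machinery):

```python
import types

# Stand-in for TVM's global function registry, which C++ code fills
# via TVM_REGISTER_GLOBAL("ansor....").
GLOBAL_FUNCS = {
    "ansor.HardwareParams": lambda *args: ("HardwareParams", args),
    "ansor.AutoSchedule": lambda *args: ("AutoSchedule", args),
}

def init_api(prefix, target):
    """Copy registered functions under `prefix.` into `target`,
    stripping the prefix -- mimicking tvm._ffi._init_api."""
    for name, fn in GLOBAL_FUNCS.items():
        if name.startswith(prefix + "."):
            setattr(target, name[len(prefix) + 1:], fn)

ffi = types.SimpleNamespace()
init_api("ansor", ffi)
# ffi.HardwareParams and ffi.AutoSchedule are now callable.
```

This is why the Python classes below can call `_ffi_api.HardwareParams`, `_ffi_api.SearchTask`, and so on without those functions being defined anywhere in Python.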
206 changes: 206 additions & 0 deletions python/tvm/ansor/auto_schedule.py
@@ -0,0 +1,206 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

"""
User interface for Ansor auto-scheduler.

The basic schedule search process for Ansor is designed to be:
`Program sampling` -> `Performance Tuning`.

In `Program sampling`, we use some predefined or heuristic rules to generate several initial
schedules. Based on these initial starting points, `Performance Tuning` applies a cost model
and evolutionary search to seek the schedules with the best performance. Candidate schedules
are measured on the target hardware.
"""

import tvm._ffi
from tvm.runtime import Object
from .compute_dag import ComputeDAG
from .measure import LocalBuilder, LocalRunner
from . import _ffi_api


@tvm._ffi.register_object("ansor.HardwareParams")
class HardwareParams(Object):
""" The parameters of target hardware. This is used to guide the search process of
SearchPolicy.

TODO(...): We are considering merging this with the new Target specification:
https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844

Parameters
----------
num_cores : int
The number of device cores.
vector_unit_bytes : int
The width of vector units in bytes.
cache_line_bytes : int
The size of cache line in bytes.
max_unroll_vec : int
The max length of an axis to be unrolled or vectorized.
max_innermost_split_factor : int
The max split factor for the innermost tile.
"""
def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes,
max_unroll_vec, max_innermost_split_factor):
self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
vector_unit_bytes, cache_line_bytes,
max_unroll_vec, max_innermost_split_factor)


@tvm._ffi.register_object("ansor.SearchTask")
class SearchTask(Object):
""" The meta-information of a search task.

Parameters
----------
dag : ComputeDAG
The ComputeDAG for target compute declaration.
workload_key : str
The workload key for target compute declaration.
target : tvm.target.Target
The target device of this search task.
target_host : Optional[tvm.target.Target]
The target host device of this search task.
hardware_params : Optional[HardwareParams]
Hardware parameters used in this search task.
"""
def __init__(self, dag, workload_key, target, target_host=None,
hardware_params=None):
self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
workload_key, target, target_host,
hardware_params)


@tvm._ffi.register_object("ansor.SearchPolicy")
class SearchPolicy(Object):
""" The base class for search policy """


@tvm._ffi.register_object("ansor.EmptyPolicy")
class EmptyPolicy(SearchPolicy):
""" This is an example empty search policy which will always generate
the init state of target ComputeDAG.
jcf94 marked this conversation as resolved.
Show resolved Hide resolved
"""
def __init__(self):
self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)


@tvm._ffi.register_object("ansor.TuneOption")
class TuneOption(Object):
""" This controls the options of performance tuning.

Parameters
----------
num_measure_trials: int = 0
The number of total schedule measure trials.
Ansor measures `num_measure_trials` candidate states in total and finally returns the best
schedule among them.
With `num_measure_trials` == 0, Ansor runs the schedule search without any measurement;
this can be used when we want to quickly get a runnable schedule without performance
tuning.
early_stopping: int = -1
Stop the tuning early if no improvement is found after n measurements.
num_measures_per_round: int = 64
The number of programs to be measured in each search round.
The whole schedule search process is designed to have several rounds to try a total of
`num_measure_trials` schedules.
We have: `num_search_rounds` = `num_measure_trials` // `num_measures_per_round`
verbose: int = 1
Verbosity level. 0 for silent, 1 to output information during schedule search.
builder: Union[Builder, str] = 'local'
Builder which builds the program.
runner: Union[Runner, str] = 'local'
Runner which runs the program and measures time costs.
measure_callbacks: Optional[List[MeasureCallback]]
Callback functions called after each measure.
Candidates:
- ansor.LogToFile
pre_search_callbacks: Optional[List[SearchCallback]]
Callback functions called before the search process.
Candidates:
- ansor.PreloadMeasuredStates
- ansor.PreloadCustomSketchRule
TODO(jcf94): Add these implementations in later PRs.
"""
def __init__(self, num_measure_trials=0, early_stopping=-1, num_measures_per_round=64,
[Review comments on this constructor]
Member: early_stopping -> early_termination
IMHO, this API looks a bit bulky to me; should we have some config dict to do this?
Member: yeah I agree, there are lots of fields here and it's a bit hard to consume
Reply: In my opinion, TuningOptions is already a class holding configurations related to schedule tuning; I think it might be a little bit overkill to introduce another config dict?
Contributor: I agree with @yangjunpro, this class is a fine way of collecting the tuning options; separating another dict out is messier.
verbose=1, builder='local', runner='local', measure_callbacks=None,
pre_search_callbacks=None):
if isinstance(builder, str):
if builder == 'local':
builder = LocalBuilder()
else:
raise ValueError("Invalid builder: " + builder)

if isinstance(runner, str):
if runner == 'local':
runner = LocalRunner()
else:
raise ValueError("Invalid runner: " + runner)

measure_callbacks = [] if measure_callbacks is None else measure_callbacks
pre_search_callbacks = [] if pre_search_callbacks is None else pre_search_callbacks

self.__init_handle_by_constructor__(
_ffi_api.TuneOption, num_measure_trials, early_stopping, num_measures_per_round,
verbose, builder, runner, measure_callbacks, pre_search_callbacks)


def auto_schedule(task, target, target_host=None, search_policy='default',
hardware_params=None, tune_option=None):
""" Do auto scheduling for a computation declaration.

The task parameter can be a `string` as workload_key, or directly
passing a `SearchTask` as input.

Parameters
----------
task : Union[SearchTask, str]
The target search task or workload key.
target : tvm.target.Target
The target device of this schedule search.
target_host : Optional[tvm.target.Target]
The target host device of this schedule search.
search_policy : Union[SearchPolicy, str] = 'default'
The search policy to be used for schedule search.
hardware_params : Optional[HardwareParams]
The hardware parameters of this schedule search.
tune_option : Optional[TuneOption]
Tuning and measurement options.

Returns
-------
A `te.schedule` and the target `te.Tensor`s to be used in `tvm.lower` or `tvm.build`
"""
if isinstance(search_policy, str):
if search_policy == 'default':
# TODO(jcf94): This is an example policy for the minimum system; it will be upgraded
# to a formal search policy later.
search_policy = EmptyPolicy()
else:
raise ValueError("Invalid search policy: " + search_policy)

tune_option = tune_option if tune_option else TuneOption()

if isinstance(task, str):
dag = ComputeDAG(task)
task = SearchTask(dag, task, target, target_host, hardware_params)
elif not isinstance(task, SearchTask):
raise ValueError("Invalid task: " + task + ". Expect a string or SearchTask")

sch, tensors = _ffi_api.AutoSchedule(task, search_policy, tune_option)
return sch, tensors
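Both `TuneOption.__init__` and `auto_schedule` follow the same string-dispatch convention: `'local'` maps to `LocalBuilder`/`LocalRunner`, and `'default'` maps to `EmptyPolicy`. A dependency-free sketch of that dispatch, together with the search-round arithmetic from the `TuneOption` docstring; the class names mirror this PR, but this is a standalone illustration, not the TVM implementation:

```python
class LocalBuilder:
    """Stand-in stub for ansor.LocalBuilder (hypothetical)."""

class LocalRunner:
    """Stand-in stub for ansor.LocalRunner (hypothetical)."""

def resolve_builder(builder):
    """Map the string 'local' to a LocalBuilder instance, the way
    TuneOption.__init__ does; pass objects through unchanged."""
    if isinstance(builder, str):
        if builder == 'local':
            return LocalBuilder()
        raise ValueError("Invalid builder: " + builder)
    return builder

def num_search_rounds(num_measure_trials, num_measures_per_round=64):
    """num_search_rounds = num_measure_trials // num_measures_per_round,
    per the TuneOption docstring."""
    return num_measure_trials // num_measures_per_round
```

For example, `resolve_builder('local')` returns a fresh `LocalBuilder`, while an unknown string raises `ValueError`; with the defaults, 256 measure trials split into 4 rounds of 64 measurements each.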