TVM Vertical Integration with PyTorch #11911
Conversation
I suggest moving tutorials to a separate PR. Ideally, tutorials should demonstrate more realistic examples than vector add or matmul, i.e. something for which PyTorch users would reach for "custom op" authoring. For example, I think demonstrating the equivalent of the Triton fused softmax tutorial https://triton-lang.org/master/getting-started/tutorials/02-fused-softmax.html in this workflow would be very interesting.
After discussing with @yelite, we will drop the how-to guides and resubmit them in a separate PR afterward.
* The basic forward function calling TVM's runtime is provided.
* The TVM module can be serialized/deserialized as a Torch module.
*/
class GraphExecutorFactoryWrapper : public torch::jit::CustomClassHolder {
This looks similar to TvmGraphModulePack:

class TvmGraphModulePack {

Why do we need this?
One reason is that we don't want to use temp files to transmit data, as ByteDance's approach does, but use TVM's FFI instead. @yelite
Hi @masahi, there are several reasons we don't plan to use code from tvm_class.cc:

1. tvm_class.cc is complex while our code is more natural. For example, they maintain their own conversion from a torch tensor to DLPack, while we use torch's built-in library.
2. Our code is more readable. We have fewer functions but can cover tvm_class.cc's functionality. For example, we don't need an extra initialization function such as init or loadTVMmodule.
3. tvm_class.cc uses a temp file and an absolute path to transmit the TVM module, while we use TVM's FFI, which I believe is a better practice.
4. The most significant difference is the save/load functions. I tested that if we save a torch model via tvm_class.cc and then restart the Python kernel, we cannot load the model back successfully because of (3). Our code can save/load models anywhere, anytime, because we serialize/deserialize the whole runtime module (see the sketch after this list).
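A rough sketch of the save/load round trip claimed in point 4, assuming optimize_torch from this PR as the entry point; the torch.jit.script/save/load path, SimpleModel, and example_input are illustrative assumptions rather than the PR's actual tests.

```python
# Hedged sketch: save a TVM-optimized module, restart Python, and load it back.
import torch
from tvm.contrib.torch import optimize_torch

class SimpleModel(torch.nn.Module):
    def forward(self, x):
        return torch.nn.functional.relu(x) + 1.0

example_input = torch.randn(8, 16)
# Compile/tune with TVM; the result behaves like a torch.nn.Module whose state
# carries the serialized TVM runtime module (no temp files or absolute paths).
tvm_module = optimize_torch(SimpleModel(), example_input)
torch.jit.save(torch.jit.script(tvm_module), "tvm_model.pt")

# ... restart the Python kernel ...
restored = torch.jit.load("tvm_model.pt")
out = restored(example_input)
```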
If GraphExecutorFactoryWrapper is strictly better than the existing one, I want to see the existing one removed or reimplemented in terms of GraphExecutorFactoryWrapper. But this can be done in a follow-up.
save_runtime_mod = get_global_func("tvmtorch.save_runtime_mod")
save_runtime_mod(executor_factory.module)

return GraphExecutorFactoryWrapper(torch.classes.tvm_tuning.GraphExecutorFactoryWrapper())
This looks strange... Why doesn't torch.classes.tvm_tuning.GraphExecutorFactoryWrapper() take any arguments?
The class GraphExecutorFactoryWrapper is a subclass of Torch's module, and Torch's FFI cannot recognize TVM's data structures, so we transmit the runtime module via TVM's FFI.
Concretely, in line 185 we store the module in memory.
When the constructor of GraphExecutorFactoryWrapper is called, it gets the TVM runtime module from memory.
The Python class GraphExecutorFactoryWrapper is just a wrapper of the output, because C++ doesn't support tuple unpacking but we do need this function in Python.
I see now that the compiled module is passed between Python and C++ PyTorch via thread-local storage (stored by save_runtime_mod).
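For context, the handoff described here might look like the following Python-side sketch; "tvmtorch.save_runtime_mod" and the argument-free constructor come from the diff quoted above, while the wrap_executor_factory helper name is hypothetical.

```python
# Illustrative sketch of passing the compiled module through TVM's FFI and
# thread-local storage instead of Torch's FFI (which cannot represent TVM objects).
import torch
import tvm

def wrap_executor_factory(executor_factory):
    # Hand the compiled tvm.runtime.Module to the C++ side, which stashes it
    # in thread-local storage ...
    save_runtime_mod = tvm.get_global_func("tvmtorch.save_runtime_mod")
    save_runtime_mod(executor_factory.module)
    # ... so the Torch custom class can be constructed with no arguments: its
    # C++ constructor reads the module back from that storage.
    return torch.classes.tvm_tuning.GraphExecutorFactoryWrapper()
```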
return self.rt_module.forward(torch_inputs)

def as_torch(func: Union[tvm.ir.module.IRModule, tvm.tir.function.PrimFunc, Callable]):
So as_torch doesn't provide tuning facilities? I noticed that all tuning tests in this PR are done via optimize_torch, which involves Relay. If a user wants to tune a TVMScript-written op and use the @as_torch decorator, how can tuning be done?
My understanding is that as_torch is just used to convert TVMScript to Torch.
Need to confirm with @yelite to see if we need to do more.
It's still possible for PT users to write TVMScript, use tune_tir to tune, and use as_torch to convert the tuned prim func to PT. We are offering optimize_torch to wrap tune_relay, so it would be nice if as_torch also wrapped tune_tir and automatically did tuning.
Current examples only show the usage of as_torch as a decorator on top of a manually written TVMScript without tuning.
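For illustration, the decorator usage mentioned in the last sentence might look like the sketch below. as_torch and its acceptance of a PrimFunc come from this PR; the vector_add kernel and the destination-passing calling convention (output tensor passed as the last argument) are assumptions.

```python
# Minimal sketch of @as_torch on a hand-written TVMScript PrimFunc, without tuning.
import torch
import tvm
from tvm.script import tir as T
from tvm.contrib.torch import as_torch

@as_torch
@T.prim_func
def vector_add(a: T.handle, b: T.handle, c: T.handle) -> None:
    A = T.match_buffer(a, (128,), "float32")
    B = T.match_buffer(b, (128,), "float32")
    C = T.match_buffer(c, (128,), "float32")
    for i in T.serial(128):
        with T.block("add"):
            vi = T.axis.spatial(128, i)
            C[vi] = A[vi] + B[vi]

# vector_add now behaves like a torch.nn.Module wrapping the compiled kernel;
# here we assume the output buffer is passed explicitly (destination-passing style).
x, y, z = torch.rand(128), torch.rand(128), torch.zeros(128)
vector_add(x, y, z)
```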
I have added the tune_tir in as_torch.
How about this: we add a tune(config) method (with an optional config param like optimize_torch) on OperatorModuleWrapper, which does tuning and rebuilds the mod, and remove tune_tir from build(...). So by default tuning won't happen, but the user can explicitly ask to tune.
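A brief sketch of how this opt-in API might read from the user side; tune() and its optional config are only the suggestion above, not the merged implementation, and vector_add refers to the earlier sketch.

```python
# Hypothetical opt-in tuning on an OperatorModuleWrapper produced by @as_torch.
rt_mod = vector_add       # OperatorModuleWrapper from the @as_torch sketch above
rt_mod.tune()             # explicitly tune with a default configuration
# rt_mod.tune(config)     # or pass a tuning config, mirroring optimize_torch
rt_mod(x, y, z)           # later calls use the tuned, rebuilt module
```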
python/tvm/contrib/torch/as_torch.py (Outdated)
return sch

def build(self, target=None):
    tuned_module = self.tune_tir_auto(self.ir_module)
If the input TVMScript module doesn't have any tunable knobs, does this tune_tir_auto finish instantly? Tuning should be an opt-in feature.
python/tvm/contrib/torch/as_torch.py (Outdated)
mod = default_config.mod(mod)
target = default_config.target(target)

extracted_task = ExtractedTask(
I think there is always only one task, since it is tuning a single op.
"For optimal performance, it is recommended to provide", | ||
"the `tuning_config` argument with a bigger number of trials.", | ||
) | ||
warnings.warn(" ".join(warning_msg), stacklevel=2) |
It seems the default tuning config is dropped? @juda
It is moved to line 111 because we need to get extracted_tasks in advance.
* optimize_torch & as_torch * split files * code formatting * optimizing optimized_torch * scrap your boilerplate * as_torch polished * configuration fixed * Apply suggestions from code review Co-authored-by: Lite Ye <liteye859@gmail.com> * more document * file deleter * optimize deleter * drop how-to guides * clang-format-10 * formatter changes * reformat * reformat * reformat * reformatting * fixed * auto setting * fixed * split long string * tune_tir * upgrade as_torch * optimize as_torch * as_torch * fixed typo Co-authored-by: juda <yzhou@octoml.ai> Co-authored-by: Lite Ye <liteye859@gmail.com>
The pull request contains two functions:

- optimize_torch, a function similar to torch.jit.trace, which is used to optimize a torch.nn.Module with TVM MetaSchedule and returns a custom TorchScript operator.
- as_torch, a decorator, which is used to wrap TVMScript code into a torch.nn.Module.

The files consist of:
@yelite @junrushao1994 @masahi
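To summarize the two entry points, a compact, hedged sketch (exact signatures are assumptions; fuller examples appear earlier in the thread):

```python
# optimize_torch: trace-like API for whole torch.nn.Modules, tuned with MetaSchedule
# through Relay, returning a TorchScript-compatible module.
# as_torch: decorator that wraps a TVMScript PrimFunc/IRModule as a torch.nn.Module
# (see the vector_add sketch above).
import torch
from tvm.contrib.torch import optimize_torch, as_torch

compiled = optimize_torch(torch.nn.Linear(16, 16), torch.randn(4, 16))
```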