[TVM EP] support of TVM Virtual Machine #10341
Conversation
Force-pushed from 07724d5 to 1ba8fd5
void TVM_VM_SetInputs(TvmModule& mod,
                      std::vector<DLTensor>& inputs)
{
  // TODO(vvchernov): set_input_zero_copy is more preferable but it does not satisfy alignment conditions.
Can you elaborate more on what these alignment conditions are?
I've observed that the VM does not support zero copy on the TVM side. I've implemented zero copy on the TVM EP side for the graph executor (GE) after a data-alignment check.
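The data-alignment check mentioned above can be sketched as follows. This is a minimal illustration, not the actual TVM EP code: the 64-byte alignment constant and the helper names are assumptions (TVM's real requirement is defined by its runtime allocator).

```python
import numpy as np

# Assumed alignment requirement: TVM commonly allocates buffers with
# 64-byte alignment; the exact constant here is illustrative only.
REQUIRED_ALIGNMENT = 64

def is_aligned(array: np.ndarray, alignment: int = REQUIRED_ALIGNMENT) -> bool:
    """Return True if the array's data pointer meets the alignment,
    i.e. handing it to a zero-copy input setter would be safe."""
    return array.ctypes.data % alignment == 0

def can_use_zero_copy(arrays) -> bool:
    # Fall back to copying inputs if any buffer is misaligned.
    return all(is_aligned(a) for a in arrays)
```

If any input fails the check, the EP falls back to the copying `set_input` path, which is always safe but slower for large tensors.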
void TVMSetInputs(TvmModule& mod, std::vector<size_t>& inds, std::vector<DLTensor>& inputs);
void TVM_VM_SetInputs(TvmModule& mod, std::vector<DLTensor>& inputs);
void TVMGetOutputs(TvmModule& mod, std::vector<DLTensor>& outputs);
void TVMGet_VM_Outputs(TvmModule& mod, std::vector<DLTensor>& outputs);
We should name these functions either TVM_VM_SetInputs and TVM_VM_GetOutputs, or TVMSet_VM_Inputs and TVMGet_VM_Outputs, for consistency.
Thanks! done
# TODO(vvchernov): replace prints by logger, but investigate ORT logging system for python before
# see lines 69, 77, 119, 136
if executor == "vm":
    # print("Build TVM virtual machine")
clean up this line
done
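The TODO in the Python snippet above proposes replacing the commented-out prints with a logger. A minimal sketch using Python's standard logging module is shown below; the logger name, function, and messages are illustrative, not the actual ORT change (the TODO itself notes that the ORT Python logging system should be investigated first):

```python
import logging

# Illustrative logger name; ORT may expose its own logging facility.
log = logging.getLogger("tvm_ep")

def build_executor(executor: str) -> str:
    """Dispatch on the executor kind, logging instead of printing."""
    if executor == "vm":
        log.info("Build TVM virtual machine")   # replaces print(...)
        return "vm"
    elif executor == "graph":
        log.info("Build TVM graph executor")    # replaces print(...)
        return "graph"
    raise ValueError(f"Unknown executor kind: {executor}")
```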
        target_host=target_host,
    )
elif executor == "graph":
    # print("Build TVM graph executor")
clean up this line
done
tvm::TVMSetInputs(*mod_, inds, dl_tensors_inputs);
if (use_vm_) {
  tvm::TVM_VM_SetInputs(*mod_, dl_tensors_inputs);
  // Infer once for calculating of output shapes
How will this affect benchmarking?
On the first run we do a probe inference to determine the output tensor shapes. This is needed for the VM only. It does not affect our benchmarking because a warmup of several inferences is done before the performance test.
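The probe-inference pattern described above can be sketched generically. Here `run_fn` is a hypothetical stand-in for the VM's run call, not the actual TVM API; the point is that the one-time shape probe is absorbed by the warmup phase and so never skews the timed measurements:

```python
import time

def probe_output_shapes(run_fn, inputs):
    """Run inference once to discover output shapes (needed for the VM,
    where output shapes are not known until execution)."""
    outputs = run_fn(inputs)
    return [getattr(o, "shape", None) for o in outputs]

def benchmark(run_fn, inputs, warmup: int = 5, repeats: int = 10) -> float:
    """Warm up before timing so one-time costs (such as the shape probe)
    do not distort the measured mean latency."""
    for _ in range(warmup):          # warmup absorbs the probe run
        run_fn(inputs)
    start = time.perf_counter()
    for _ in range(repeats):
        run_fn(inputs)
    return (time.perf_counter() - start) / repeats
```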
Force-pushed from 35e520d to 97da135
Hello @xadupre! Could you review this PR again? I have not removed the WIP status because we are waiting for the TVM-side PR to be merged; right now ONNX Runtime references TVM on a local branch, but afterwards only the TVM commit hash will change. I do not think this should confuse you.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline
Azure Pipelines successfully started running 9 pipeline(s).
Hello @xadupre, I think the PR is ready to merge. Could you recheck the last changes and start the CI tests?
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline
Azure Pipelines successfully started running 9 pipeline(s).
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows WebAssembly CI Pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, onnxruntime-python-checks-ci-pipeline
Azure Pipelines successfully started running 8 pipeline(s).
@xadupre I've fixed the flake8 issues, please restart the CI pipelines.
/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline, MacOS CI Pipeline, ONNX Runtime Web CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline
/azp run Windows CPU CI Pipeline, Windows GPU CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows WebAssembly CI Pipeline, orttraining-amd-gpu-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed, onnxruntime-python-checks-ci-pipeline
Azure Pipelines successfully started running 9 pipeline(s).
Azure Pipelines successfully started running 8 pipeline(s).
Good evening @xadupre. Could you restart orttraining-amd-gpu-ci-pipeline? It is failing due to a samples-per-second mismatch for huggingface-gpt2 models, which is not related to this PR in any case.
/azp run orttraining-amd-gpu-ci-pipeline
Azure Pipelines successfully started running 1 pipeline(s).
Hello @xadupre! All checks have passed successfully; the PR can be approved.
* add executor option (vm or graph) and support virtual machine methods
* nullptr check for compile and run methods (see also PR#10211 from microsoft:onnxruntime)
* get output shapes for VM
* remove run_with_benchmark; remove run methods from python api, get it from native side
* get outputs method for VM was implemented
* support multiple input for VM
* update python logging and exception
* small fix
* update tvm with patch for VM API
* update nhwc transformations for TVM EP
* add data alignment check and support set_input_zero_copy for GE in TVM EP
* fix logger name
* return back to apache/tvm with VM fixes instead of local dev branch
* hide customized tvm logger while issue is not resolved; fix tvm warning related to target_host
* flake8 fix

Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
(cherry picked from commit 62cc981)
Support TVM Virtual Machine on TVM EP side
Waiting for the merge of the TVM PR (done), ORT PR#10260 (done), and ORT PR#10505 (done) before checking this PR.