NPU-GPU-Pipeline could not run on NPU #156

@carlliusly1

Description

I found that the onnxruntime package, which is required by the NPU-GPU-Pipeline demo, conflicts with the onnxruntime_vitisai package needed by the NPU runtime.
Before installing the onnxruntime package, the quicktest log is as follows:

```
(npu-gpu-pipeline) C:\Program Files\RyzenAI\1.3.1\quicktest>python quicktest.py
Setting environment for PHX/HPT
XLNX_VART_FIRMWARE= C:\Program Files\RyzenAI\1.3.1\voe-4.0-win_amd64\xclbins\phoenix\1x4.xclbin
NUM_OF_DPU_RUNNERS= 1
XLNX_TARGET_NAME= AMD_AIE2_Nx4_Overlay
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20250214 08:48:26.537966  5244 vitisai_compile_model.cpp:1046] Vitis AI EP Load ONNX Model Success
I20250214 08:48:26.537966  5244 vitisai_compile_model.cpp:1047] Graph Input Node Name/Shape (1)
I20250214 08:48:26.537966  5244 vitisai_compile_model.cpp:1051]          input : [-1x3x32x32]
I20250214 08:48:26.537966  5244 vitisai_compile_model.cpp:1057] Graph Output Node Name/Shape (1)
I20250214 08:48:26.537966  5244 vitisai_compile_model.cpp:1061]          output : [-1x10]
[Vitis AI EP] No. of Operators :   CPU     2    NPU   398
[Vitis AI EP] No. of Subgraphs :   NPU     1 Actually running on NPU     1
2025-02-14 08:48:26.6552127 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-02-14 08:48:26.6610131 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
Test Passed
```

After installing the onnxruntime package:


```
(npu-gpu-pipeline) C:\Program Files\RyzenAI\1.3.1\quicktest>python quicktest.py
Setting environment for PHX/HPT
XLNX_VART_FIRMWARE= C:\Program Files\RyzenAI\1.3.1\voe-4.0-win_amd64\xclbins\phoenix\1x4.xclbin
NUM_OF_DPU_RUNNERS= 1
XLNX_TARGET_NAME= AMD_AIE2_Nx4_Overlay
C:\Users\m1860\.conda\envs\npu-gpu-pipeline\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py:69: UserWarning: Specified provider 'VitisAIExecutionProvider' is not in available provider names.Available providers: 'AzureExecutionProvider, CPUExecutionProvider'
  warnings.warn(
Test Passed
```
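
As a quick sanity check of which onnxruntime build is actually active in the environment, something along these lines can be run (a minimal sketch; the printed paths and versions will of course depend on the machine):

```python
import onnxruntime as ort

# Which wheel is actually being imported: the stock onnxruntime package
# or the onnxruntime-vitisai build shipped with Ryzen AI?
print("module :", ort.__file__)
print("version:", ort.__version__)

# VitisAIExecutionProvider must appear in this list; if it does not, ORT
# silently falls back to CPU, which is what the UserWarning above shows.
print("providers:", ort.get_available_providers())
```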

I have tried force-reinstalling onnxruntime_vitisai after installing onnxruntime. The quicktest then passes, but running the NPU-GPU-Pipeline demo reports the following error:


```
(npu-gpu-pipeline) C:\Users\m1860\tmp\RyzenAI-SW\demo\NPU-GPU-Pipeline>python pipeline.py -i test/test_img2img.mp4 --npu --provider_config vaip_config.json --igpu
C:\Users\m1860\AppData\Roaming\Python\Python310\site-packages\onnxscript\converter.py:823: FutureWarning: 'onnxscript.values.Op.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.
  param_schemas = callee.param_schemas()
C:\Users\m1860\AppData\Roaming\Python\Python310\site-packages\onnxscript\converter.py:823: FutureWarning: 'onnxscript.values.OnnxFunction.param_schemas' is deprecated in version 0.1 and will be removed in the future. Please use '.op_signature' instead.
  param_schemas = callee.param_schemas()
The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
Number of frames =  486
WARNING: Logging before InitGoogleLogging() is written to STDERR
I20250214 09:33:19.950776  9828 vitisai_compile_model.cpp:1046] Vitis AI EP Load ONNX Model Success
I20250214 09:33:19.950913  9828 vitisai_compile_model.cpp:1047] Graph Input Node Name/Shape (1)
I20250214 09:33:19.950913  9828 vitisai_compile_model.cpp:1051]          DetectionModel::input_0 : [-1x640x640x3]
I20250214 09:33:19.950913  9828 vitisai_compile_model.cpp:1057] Graph Output Node Name/Shape (3)
I20250214 09:33:19.950913  9828 vitisai_compile_model.cpp:1061]          3359_transpose : [-1x80x80x144]
I20250214 09:33:19.950913  9828 vitisai_compile_model.cpp:1061]          3560_transpose : [-1x40x40x144]
I20250214 09:33:19.950913  9828 vitisai_compile_model.cpp:1061]          3761_transpose : [-1x20x20x144]
[Vitis AI EP] No. of Operators :   CPU     4    NPU  1293
[Vitis AI EP] No. of Subgraphs :   NPU     1 Actually running on NPU     1
2025-02-14 09:33:20.3007945 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-02-14 09:33:20.3064642 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
[Vitis AI EP] No. of Operators :   CPU     2    NPU    77
[Vitis AI EP] No. of Subgraphs :   NPU     1 Actually running on NPU     1
2025-02-14 09:33:21.4460936 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2025-02-14 09:33:21.4503412 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
F20250214 09:33:21.634864  9828 custom_op.cpp:552] -- Error: Failed to create GE handle: Failed to create context virtual (0xc01e0009): There was an error while creating context
: invalid argument
2025-02-14 09:33:21.6693482 [E:onnxruntime:, inference_session.cc:2106 onnxruntime::InferenceSession::Initialize::<lambda_666f9e10d0809ac0097d66ff1b8ff07f>::operator ()] Exception during initialization: private: static void __cdecl google::protobuf::FieldDescriptor::TypeOnceInit(class google::protobuf::FieldDescriptor const * __ptr64)
public: virtual unsigned char * __ptr64 __cdecl google::protobuf::internal::ZeroFieldsBase::_InternalSerialize(unsigned char * __ptr64,class google::protobuf::io::EpsCopyOutputStream * __ptr64)const __ptr64
__CxxFrameHandler4
(unknown)
RtlCaptureContext2
public: void __cdecl vaip_core::DpuSubgraphEntryProto::unsafe_arena_set_allocated_try_fuse(class vaip_core::TryFuseProto * __ptr64) __ptr64
public: void __cdecl vaip_core::DpuSubgraphEntryProto::unsafe_arena_set_allocated_try_fuse(class vaip_core::TryFuseProto * __ptr64) __ptr64
(unknown)
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
public: void __cdecl pybind11::error_already_set::discard_as_unraisable(class pybind11::object) __ptr64
PyCFunction_GetFlags
_PyObject_MakeTpCall
PyMethod_Self
_PyOS_URandomNonblock
PyEval_GetFuncDesc
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyFunction_Vectorcall
_PyOS_URandomNonblock
PyEval_GetFuncDesc
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyFunction_Vectorcall

Traceback (most recent call last):
  File "C:\Users\m1860\tmp\RyzenAI-SW\demo\NPU-GPU-Pipeline\pipeline.py", line 487, in <module>
    ort_session_rcan = onnxruntime.InferenceSession(
  File "C:\Users\m1860\.conda\envs\npu-gpu-pipeline\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "C:\Users\m1860\.conda\envs\npu-gpu-pipeline\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py", line 491, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: private: static void __cdecl google::protobuf::FieldDescriptor::TypeOnceInit(class google::protobuf::FieldDescriptor const * __ptr64)
public: virtual unsigned char * __ptr64 __cdecl google::protobuf::internal::ZeroFieldsBase::_InternalSerialize(unsigned char * __ptr64,class google::protobuf::io::EpsCopyOutputStream * __ptr64)const __ptr64
__CxxFrameHandler4
(unknown)
RtlCaptureContext2
public: void __cdecl vaip_core::DpuSubgraphEntryProto::unsafe_arena_set_allocated_try_fuse(class vaip_core::TryFuseProto * __ptr64) __ptr64
public: void __cdecl vaip_core::DpuSubgraphEntryProto::unsafe_arena_set_allocated_try_fuse(class vaip_core::TryFuseProto * __ptr64) __ptr64
(unknown)
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
PyInit_onnxruntime_pybind11_state
public: void __cdecl pybind11::error_already_set::discard_as_unraisable(class pybind11::object) __ptr64
PyCFunction_GetFlags
_PyObject_MakeTpCall
PyMethod_Self
_PyOS_URandomNonblock
PyEval_GetFuncDesc
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyFunction_Vectorcall
_PyOS_URandomNonblock
PyEval_GetFuncDesc
_PyEval_EvalFrameDefault
_PyEval_EvalFrameDefault
_PyFunction_Vectorcall
```
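
To isolate whether the crash comes from session creation itself (the traceback points at the onnxruntime.InferenceSession call in pipeline.py, line 487), a stripped-down repro can be attempted. This is only a sketch: the model path is hypothetical, and the config_file provider option key is an assumption based on the demo's --provider_config vaip_config.json argument, not copied from the script.

```python
import onnxruntime as ort

# Hypothetical model path for illustration; pipeline.py loads its own model here.
model_path = "model.onnx"

session = ort.InferenceSession(
    model_path,
    providers=["VitisAIExecutionProvider"],
    # Assumed option key for passing the VAIP configuration to the Vitis AI EP.
    provider_options=[{"config_file": "vaip_config.json"}],
)
print(session.get_providers())
```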
