Preetha/workload type rebased #446

Closed

preetha-intel

Description

Add support for workload type as session_option and runtime_option

preetha-intel and others added 12 commits September 6, 2024 20:58
* Implements blob compatibility check for NPU

* OVEP catches the NPU driver exception and returns a failure status

* NPU to CPU fallback is disabled when inferencing with blob

* Update NPU device exception handling approach

* Changes failure status code to exception (std::runtime_error)

* Capture all NPU related errors

* Throw minimal error message with error type and error code for Release
  builds

* Fix lint issues

* Address review comments

* Address review comments

---------

Co-authored-by: Srirammaswamy <srirammaswamy.s@intel.com>
…PU (#441)

* Prototype shared memory allocator on Windows using OV-EP

* Partially working allocator.

Crashing on tensor destruction. Might have UMD exceptions. Needs further
debug. Unknown if values are correct.

* Hard code onnx perf to use RT NPU allocator for inputs

* Fix allocation lookups coming from different level zero contexts

* Page align OV allocation

* Allocate input as WC

* Only set tensors when they have changed.

* Revert "Allocate input as WC"

This reverts commit d43219f.

* Hard code onnx perf to use RT NPU for outputs

* Revert "Hard code onnx perf to use RT NPU for outputs"

This reverts commit c1f3b3e.

* Hard code onnx perf to use RT NPU for outputs fixed

* Fix onnx_perf_test app crash on tensor destroy

* refactor: remove redundant ort_shape_to_ovshape lambda function

* allocate buffer in NPU-visible region from perf test application

* remove redundant code

* add command line parameter in perf test for using remote tensors

* remove redundant code

* remove redundant statements

* fix crash during inference

* remove redundant code

* enable backward compatibility of remote tensor feature

* Revert "enable backward compatibility of remote tensor feature"

This reverts commit 1791b90.

* enable backward compatibility of remote tensor feature in OVEP

---------

Co-authored-by: Javier E. Martinez <javier.e.martinez@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Disable driver caching for NPU when epctx is enabled for OV versions greater than 2024.3
* fix debug build issue and lint issues

* change naming for OVEP NPU specific macro

* fix unit tests and lint issues
@preetha-intel preetha-intel marked this pull request as draft October 1, 2024 13:51
@preetha-intel
Author

Outdated.
Refer to microsoft#22282 for reimplementation


4 participants