Allow zero-element tensors to get set #737
Merged
Description
This PR fixes an issue where zero-element tensors (typically when ort_ptr is nullptr) are not properly converted to OpenVINO tensors and set on the infer request.
Motivation and Context
For ORT GenAI use cases, the application may pass zero-element tensors as input for inference. For example, for the Whisper pipeline through ORT GenAI, the first decode inference will set a 'past' KVCache tensor of shape [1, 8, 0, 64]. As this has 0 total elements, the ort_ptr associated with it is typically also nullptr.
Previously, the condition if (cached_binding.ort_ptr != ort_ptr) { needed to be true in order to create an ov::Tensor and set it on the infer request. But for a zero-element tensor, both of these pointers are NULL, so the tensor does not get set, which causes various issues.
This PR adds an else that specifically checks for this 'zero-element' tensor case.
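The control flow described above can be sketched roughly as follows. This is a simplified illustration, not the actual OVEP code: CachedBindingSketch, ElementCount, and BindInputSketch are hypothetical stand-ins, the real ov::Tensor creation and infer-request call are replaced by a flag and a counter, and the exact condition used in the PR's else branch is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the cached binding state (not the real OVEP type).
struct CachedBindingSketch {
    const void* ort_ptr = nullptr;  // buffer pointer seen on the previous call
    bool tensor_set = false;        // whether a tensor is currently bound
    int set_count = 0;              // number of times a tensor was (re)bound
};

// Total number of elements for a shape; any zero dimension gives zero.
inline std::size_t ElementCount(const std::vector<std::int64_t>& shape) {
    std::size_t n = 1;
    for (std::int64_t d : shape) n *= static_cast<std::size_t>(d);
    return n;
}

// Returns true when a tensor was (re)bound on this call.
inline bool BindInputSketch(CachedBindingSketch& cached,
                            const void* ort_ptr,
                            const std::vector<std::int64_t>& shape) {
    if (cached.ort_ptr != ort_ptr) {
        // Original path: the buffer changed, so wrap it and set it
        // (the real code would create an ov::Tensor here).
        cached.ort_ptr = ort_ptr;
        cached.tensor_set = true;
        ++cached.set_count;
        return true;
    } else if (ort_ptr == nullptr && ElementCount(shape) == 0) {
        // Assumed shape of the new branch: both pointers are null, so the
        // comparison above never fires, but the zero-element tensor must
        // still be set on the request.
        cached.tensor_set = true;
        ++cached.set_count;
        return true;
    }
    return false;  // same non-null buffer as last time: nothing to do
}
```

With only the first branch, a fresh binding called with a null pointer and shape [1, 8, 0, 64] would return false and leave the input unset, which matches the failure described here.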
With this PR, I am able to run onnxruntime-genai whisper pipelines with OVEP.
Note: I didn't want to overcomplicate this PR, but I do think that this function should be revisited and refactored a bit. Comparing only the ptr is quite dangerous. For example, the application could create a differently shaped Ort tensor from the same buffer location, so I think we should at least compare both the ptr and the shape.
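As a rough sketch of what such a follow-up check might look like (not part of this PR; BindingKeySketch and NeedsRebind are hypothetical names introduced only for illustration), the cached state could hold the shape alongside the pointer and trigger a rebind when either differs:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical cache key: buffer pointer plus the shape it was bound with.
struct BindingKeySketch {
    const void* ptr = nullptr;
    std::vector<std::int64_t> shape;
};

// A rebind is needed when either the buffer or the shape has changed.
inline bool NeedsRebind(const BindingKeySketch& cached,
                        const void* ptr,
                        const std::vector<std::int64_t>& shape) {
    return cached.ptr != ptr || cached.shape != shape;
}
```

Under this scheme, reusing the same buffer with a different shape would be detected, whereas a pointer-only comparison would treat it as unchanged.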