Initial Float16 and BFloat16 onnx type support #31
Add initial support for ONNX native `Float16` and `BFloat16` types. Issue: #30
The main advantage of these types will be in native code during inference; we do little work with tensors outside inference. The schedulers do the most tensor work, but it's light, and I would not expect much gain on CPU from using these types.
So this PR uses the NodeMetaData from the model to create native memory buffers of the correct type, so all inference runs in the model's native type. Once inference is complete we cast back to `float`.
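A minimal sketch of the idea, not the PR's actual code: the helper name `CreateInput` and its parameters are hypothetical, and it assumes a recent Microsoft.ML.OnnxRuntime where `NodeMetadata` exposes `ElementDataType` and the `Float16`/`BFloat16` structs provide explicit conversions from `float`.

```csharp
using System;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

public static class InputHelper
{
    // Build an input tensor whose element type matches the model's NodeMetadata,
    // converting the managed float data only when the model wants a half type.
    public static NamedOnnxValue CreateInput(string name, NodeMetadata metadata, float[] data, int[] dimensions)
    {
        return metadata.ElementDataType switch
        {
            // Model expects Float16: narrow each element before binding the buffer.
            TensorElementType.Float16 => NamedOnnxValue.CreateFromTensor(name,
                new DenseTensor<Float16>(Array.ConvertAll(data, v => (Float16)v), dimensions)),

            // Model expects BFloat16: same idea with the BFloat16 struct.
            TensorElementType.BFloat16 => NamedOnnxValue.CreateFromTensor(name,
                new DenseTensor<BFloat16>(Array.ConvertAll(data, v => (BFloat16)v), dimensions)),

            // Anything else: keep the plain float tensor, no conversion needed.
            _ => NamedOnnxValue.CreateFromTensor(name, new DenseTensor<float>(data, dimensions))
        };
    }
}
```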
In most cases we don't pass the output data into another session; TextEncoder and VaeEncoder/Decoder just use the output and dispose it. The scheduler steps, however, could benefit from staying in the native type, but that would require making the schedulers generic, which would be quite a task, so measuring performance first may be a good option before getting too deep.
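For the cast back, something along these lines (again a hypothetical helper, `CastToFloat`, assuming the explicit `(float)` conversion operator on `Float16` in recent ONNX Runtime releases) would let the managed schedulers keep working purely in `float`:

```csharp
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

public static class OutputHelper
{
    // Widen a Float16 output tensor back to float once inference is complete,
    // so everything downstream (schedulers, image decode) stays float-based.
    public static DenseTensor<float> CastToFloat(DenseTensor<Float16> output)
    {
        var source = output.Buffer.Span;        // raw Float16 elements from the session output
        var target = new float[source.Length];  // managed float copy

        for (int i = 0; i < source.Length; i++)
            target[i] = (float)source[i];       // per-element widening cast

        return new DenseTensor<float>(target, output.Dimensions);
    }
}
```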
TL;DR: Type support on native, cast back to `float` in managed.