
Initial Float16 and BFloat16 onnx type support #31

Merged — 1 commit merged into master from OnnxNativeTypes on Nov 16, 2023

Conversation

@saddam213 (Member) commented Nov 16, 2023

Add initial support for ONNX native Float16 and BFloat16 types
Issue: #30

The main advantage of these types will be in native code during inference. We do little work with tensors outside of inference; the schedulers do the most, but it is light work, so I would not expect much gain on the CPU side from using these types.

So this PR uses the model's NodeMetadata to create native memory buffers of the correct element type, so all inference runs in the model's native type. Once inference is complete we cast the results back to float.

In most cases we don't pass the output data into another session; the TextEncoder and VaeEncoder/Decoder just use the output and dispose of it. The scheduler steps, however, could benefit from staying in the native type, but that would require making the schedulers generic, which would be quite a task, so measuring performance first may be a good option before getting too deep.

TL;DR: Native type support during inference, cast back to float on the managed side.
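
For illustration, here is a rough sketch (not the actual PR code) of the idea: check the session's NodeMetadata for the element type, build the buffer in that native type, and convert back to float once inference completes. The class and helper names below are made up for this example, and the explicit Float16/float casts assume a recent Microsoft.ML.OnnxRuntime version.

```csharp
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

public static class NativeTypeSketch
{
    // Hypothetical helper: allocate the input tensor in the element type the model expects.
    public static NamedOnnxValue CreateInputTensor(string name, NodeMetadata metadata, float[] data, int[] dimensions)
    {
        if (metadata.ElementType == typeof(Float16))
        {
            // Model expects Float16: convert the managed float data up front.
            // The explicit (Float16)float cast assumes a recent OnnxRuntime release;
            // older releases only expose Float16 as a raw ushort wrapper.
            var half = data.Select(v => (Float16)v).ToArray();
            return NamedOnnxValue.CreateFromTensor(name, new DenseTensor<Float16>(half, dimensions));
        }

        // Default: the model expects float32, no conversion needed.
        return NamedOnnxValue.CreateFromTensor(name, new DenseTensor<float>(data, dimensions));
    }

    // Hypothetical helper: cast an output back to float for the managed pipeline (schedulers etc.).
    public static float[] ToFloatArray(NamedOnnxValue output, NodeMetadata metadata)
    {
        if (metadata.ElementType == typeof(Float16))
            return output.AsEnumerable<Float16>().Select(v => (float)v).ToArray();

        return output.AsEnumerable<float>().ToArray();
    }
}
```

BFloat16 would be handled the same way with an extra branch on the metadata element type.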

@saddam213 (Member, Author) commented:

Tested:
StableDiffusion 1.5
LCM-Dreamshaper-V7
LCM-Dreamshaper-V7 f16
Photon

@Amin456789

Nice!
Posted some more new fp16 Olive-optimized models here if you need them for testing:
#14


@saddam213 saddam213 closed this Nov 16, 2023
@saddam213 saddam213 reopened this Nov 16, 2023
@saddam213 saddam213 merged commit c1aa906 into master Nov 16, 2023
@saddam213 saddam213 deleted the OnnxNativeTypes branch November 16, 2023 19:32