[OpenAI] Audio Outputs #1136
Codecov Report
Attention: Patch coverage is

    @@            Coverage Diff             @@
    ##             main    #1136      +/-   ##
    ==========================================
    + Coverage   43.44%   48.46%   +5.01%
    ==========================================
      Files          72       78       +6
      Lines        5849     5586     -263
    ==========================================
    + Hits         2541     2707     +166
    + Misses       3308     2879     -429
Apologies if I missed it in the PR, but @hudson-ai do you have a small code snippet I can run to play with the feature?

    from guidance import *

    lm = models.OpenAI("gpt-4o-audio-preview")

    with system():
        lm += "Talk like a pirate."

    with user():
        lm += "What is the capital of France?"

    with assistant():
        lm += gen_audio()

@Harsha-Nori give this a shot! Not without bugs at the moment 😜
This PR adds support for `gen_audio` (potentially not the final public API) for OpenAI audio-enabled models.
Note that this involves a pretty significant refactor of the initial implementation of "State" objects introduced a PR or two ago, and it centralizes handling of different input/output formats in `Client` classes. They may need to be renamed now that they do more heavy lifting. But in all, I think this is far more flexible and easier to extend.
@nopdive this also paved the way to re-introduce token probabilities in a pretty full capacity -- will continue working on that in the coming days.
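For readers who want a mental model of "centralizing format handling in a client", here is a rough, self-contained sketch. It is not the code in this PR: `TextChunk`, `AudioChunk`, `State`, `ChatClient`, and the payload shape are all hypothetical names made up for illustration. The idea it tries to show is that the state only records what was said, in whatever modality, while a single client class decides how each kind of content is serialized.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


# Hypothetical content types; NOT the classes used in this PR.
@dataclass
class TextChunk:
    text: str


@dataclass
class AudioChunk:
    data: bytes  # raw audio bytes


Chunk = Union[TextChunk, AudioChunk]


@dataclass
class State:
    """Format-agnostic record of the conversation so far (hypothetical)."""

    messages: List[Tuple[str, Chunk]] = field(default_factory=list)

    def add(self, role: str, chunk: Chunk) -> None:
        self.messages.append((role, chunk))


class ChatClient:
    """Hypothetical client: the only place that knows the payload format."""

    def render(self, state: State) -> list:
        payload = []
        for role, chunk in state.messages:
            if isinstance(chunk, TextChunk):
                content = {"type": "text", "text": chunk.text}
            elif isinstance(chunk, AudioChunk):
                # Made-up representation; a real client would encode the audio.
                content = {"type": "audio", "num_bytes": len(chunk.data)}
            else:
                raise TypeError(f"unsupported chunk type: {type(chunk).__name__}")
            payload.append({"role": role, "content": [content]})
        return payload


state = State()
state.add("user", TextChunk("What is the capital of France?"))
state.add("assistant", AudioChunk(b"\x00\x01\x02"))
payload = ChatClient().render(state)
```

Under this layout, supporting a new modality would mean adding one chunk type and one branch in the client, without touching the state object.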