[OpenAI] Audio Outputs #1136

Open
wants to merge 17 commits into
base: main

hudson-ai
Collaborator

This PR adds support for gen_audio (potentially not the final public API) for OpenAI audio-enabled models.

Note that this involves a fairly significant refactor of the initial "State" object implementation introduced a PR or two ago, and it centralizes handling of the different input/output formats in the Client classes. Those classes may need to be renamed now that they do more of the heavy lifting. Overall, I think this is far more flexible and easier to extend.
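
As a rough illustration of what "centralizing format handling in a client" can look like, here is a minimal sketch. Every name in it (`TextNode`, `AudioNode`, `State`, `Client`, `AudioCapableClient`) is a hypothetical stand-in chosen for this example, not one of the actual guidance classes touched by this PR. The idea is that each client owns one handler per content type, so supporting a new format means adding a handler in one place.

```python
from dataclasses import dataclass, field


# Hypothetical node types; stand-ins for illustration only.
@dataclass
class TextNode:
    text: str


@dataclass
class AudioNode:
    data: bytes


@dataclass
class State:
    """Hypothetical state: an ordered list of content parts."""
    parts: list = field(default_factory=list)


class Client:
    """Hypothetical base client: dispatches each node to a per-type handler."""

    def run(self, state: State, node) -> State:
        # Look up a handler named after the node's class, e.g. _handle_TextNode.
        handler = getattr(self, f"_handle_{type(node).__name__}", None)
        if handler is None:
            raise NotImplementedError(
                f"{type(self).__name__} does not support {type(node).__name__}"
            )
        return handler(state, node)

    def _handle_TextNode(self, state: State, node: TextNode) -> State:
        state.parts.append({"type": "text", "text": node.text})
        return state


class AudioCapableClient(Client):
    """Hypothetical client for a backend that also accepts audio."""

    def _handle_AudioNode(self, state: State, node: AudioNode) -> State:
        state.parts.append({"type": "audio", "num_bytes": len(node.data)})
        return state
```

With this shape, a backend that lacks audio support fails loudly with `NotImplementedError` instead of silently dropping the content.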

@nopdive this also paved the way to re-introduce token probabilities in a pretty full capacity -- will continue working on that in the coming days.

@codecov-commenter

codecov-commenter commented Feb 28, 2025

Codecov Report

Attention: Patch coverage is 53.74332% with 173 lines in your changes missing coverage. Please review.

Project coverage is 48.46%. Comparing base (9fe8b26) to head (8f7b94d).
Report is 5 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| guidance/models/_openai.py | 31.69% | 125 Missing ⚠️ |
| guidance/models/_engine/_client.py | 58.00% | 21 Missing ⚠️ |
| guidance/models/_base/_client.py | 75.67% | 9 Missing ⚠️ |
| guidance/_ast.py | 84.37% | 5 Missing ⚠️ |
| guidance/library/_image.py | 28.57% | 5 Missing ⚠️ |
| guidance/models/_base/_model.py | 91.17% | 3 Missing ⚠️ |
| guidance/models/_base/_state.py | 87.50% | 2 Missing ⚠️ |
| guidance/models/transformers/_model.py | 60.00% | 2 Missing ⚠️ |
| guidance/library/_audio.py | 66.66% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1136      +/-   ##
==========================================
+ Coverage   43.44%   48.46%   +5.01%     
==========================================
  Files          72       78       +6     
  Lines        5849     5586     -263     
==========================================
+ Hits         2541     2707     +166     
+ Misses       3308     2879     -429     

@Harsha-Nori
Member

Apologies if I missed it in the PR, but @hudson-ai do you have a small code snippet I can run to play with the feature?

@hudson-ai
Collaborator Author

hudson-ai commented Feb 28, 2025

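# Requires an OpenAI API key (e.g. via the OPENAI_API_KEY environment variable).
from guidance import assistant, gen_audio, models, system, user

# Load an audio-enabled OpenAI model.
lm = models.OpenAI("gpt-4o-audio-preview")
with system():
    lm += "Talk like a pirate."
with user():
    lm += "What is the capital of France?"
with assistant():
    # gen_audio is added by this PR and may not be the final public API.
    lm += gen_audio()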

@Harsha-Nori give this a shot! Not without bugs at the moment 😜
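
This thread does not show how the PR exposes the generated audio on the model object, so the following is only a sketch of one plausible follow-up step. It assumes you have somehow obtained the audio as a base64-encoded string, which is how OpenAI's chat completions API returns audio output. The helper name `save_base64_audio` is hypothetical and is not part of guidance.

```python
import base64
from pathlib import Path


def save_base64_audio(b64_data: str, path: str) -> int:
    """Decode base64-encoded audio and write it to `path`.

    Returns the number of raw bytes written. Hypothetical helper, not part
    of guidance; how to obtain `b64_data` from the model is an open question.
    """
    # validate=True rejects characters outside the base64 alphabet
    # instead of silently discarding them.
    raw = base64.b64decode(b64_data, validate=True)
    Path(path).write_bytes(raw)
    return len(raw)
```

The resulting file can then be opened in any audio player that supports the format the model was asked to produce.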

3 participants