[OpenAI] Audio Outputs #1136

hudson-ai · 2025-02-28T05:28:37Z

This PR adds support for gen_audio (potentially not the final public API) for OpenAI audio-enabled models.

Note that this involves a pretty significant refactor of the initial implementation of "State" objects introduced a PR or two ago, and it centralizes handling of different input/output formats in the Client classes. The Client classes may need to be renamed now that they do more of the heavy lifting. Overall, I think this is far more flexible and easier to extend.
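To make the division of labor concrete, here is a minimal sketch of the idea, not the actual guidance classes: `TextChunk`, `AudioChunk`, `State`, and `Client` below are hypothetical stand-ins, and only the `is_input` field name is borrowed from a commit in this PR. The state is reduced to a plain record, while the client owns all format handling, mutates the state directly, and yields data only so a visualizer can consume it.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Union


@dataclass
class TextChunk:
    """Hypothetical text fragment; is_input marks prompt text vs. model output."""
    text: str
    is_input: bool = False


@dataclass
class AudioChunk:
    """Hypothetical audio fragment holding raw bytes."""
    data: bytes
    is_input: bool = False


Chunk = Union[TextChunk, AudioChunk]


@dataclass
class State:
    """Hypothetical state: an ordered record of chunks with no format logic."""
    chunks: List[Chunk] = field(default_factory=list)


class Client:
    """Hypothetical client: owns format handling and mutates the state directly."""

    def __init__(self) -> None:
        self.state = State()

    def run(self, chunks: Iterable[Chunk]) -> Iterator[Chunk]:
        # Record each chunk in the state, then yield it purely for display.
        for chunk in chunks:
            self.state.chunks.append(chunk)
            yield chunk

    def text(self) -> str:
        # Text rendering lives in the client, so the base state need not provide it.
        return "".join(
            c.text for c in self.state.chunks if isinstance(c, TextChunk)
        )
```

With this split, supporting a new modality means adding a chunk type and teaching the client about it, without touching the state.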

@nopdive this also paved the way to re-introduce token probabilities in a pretty full capacity -- will continue working on that in the coming days.


codecov-commenter · 2025-02-28T05:36:56Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 53.74332% with 173 lines in your changes missing coverage. Please review.

Project coverage is 48.46%. Comparing base (9fe8b26) to head (8f7b94d).
Report is 5 commits behind head on main.

Files with missing lines	Patch %	Lines
guidance/models/_openai.py	31.69%	125 Missing ⚠️
guidance/models/_engine/_client.py	58.00%	21 Missing ⚠️
guidance/models/_base/_client.py	75.67%	9 Missing ⚠️
guidance/_ast.py	84.37%	5 Missing ⚠️
guidance/library/_image.py	28.57%	5 Missing ⚠️
guidance/models/_base/_model.py	91.17%	3 Missing ⚠️
guidance/models/_base/_state.py	87.50%	2 Missing ⚠️
guidance/models/transformers/_model.py	60.00%	2 Missing ⚠️
guidance/library/_audio.py	66.66%	1 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1136      +/-   ##
==========================================
+ Coverage   43.44%   48.46%   +5.01%     
==========================================
  Files          72       78       +6     
  Lines        5849     5586     -263     
==========================================
+ Hits         2541     2707     +166     
+ Misses       3308     2879     -429


Harsha-Nori · 2025-02-28T06:31:45Z

Apologies if I missed it in the PR, but @hudson-ai do you have a small code snippet I can run to play with the feature?

hudson-ai · 2025-02-28T06:41:16Z

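from guidance import assistant, gen_audio, models, system, user

# Requires an audio-enabled OpenAI model
lm = models.OpenAI("gpt-4o-audio-preview")
with system():
    lm += "Talk like a pirate."
with user():
    lm += "What is the capital of France?"
with assistant():
    # gen_audio is added by this PR; the name may change before release
    lm += gen_audio()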

@Harsha-Nori give this a shot! Not without bugs at the moment 😜
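For anyone who wants to listen to the result outside a notebook, here is a rough sketch of packing streamed audio into a WAV container using only the standard library. It does not reflect how this PR implements its "wav output" commit. It assumes the audio arrives as base64-encoded 16-bit PCM chunks, and the `pcm16_chunks_to_wav` helper and the 24 kHz mono defaults are illustrative assumptions.

```python
import base64
import io
import wave
from typing import Iterable


def pcm16_chunks_to_wav(
    b64_chunks: Iterable[str],
    sample_rate: int = 24000,  # assumed sample rate; adjust to match the source
    channels: int = 1,         # assumed mono
) -> bytes:
    """Decode base64 PCM16 chunks and wrap them in a WAV container."""
    # Decode each chunk on its own, then concatenate the raw samples.
    pcm = b"".join(base64.b64decode(chunk) for chunk in b64_chunks)

    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples are 2 bytes wide
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()
```

The returned bytes can be written to a `.wav` file or handed to any player that accepts WAV data.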


hudson-ai added 12 commits February 27, 2025 15:05

have client directly manipulate state and only yield data for vis

6c818fb

move way more responsibility out of state and into client -- gives more flexibility

9b02bfa

base State no longer provides text

f74bb93

allow remote image urls

c63f8fb

closer to correct handling of input traces

400c1a0

return type

eedcced

pydantic for OpenAI

e6823fd

use stream context manager

b6ab4aa

OpenAI audio support (mostly)

bb61a7f

input -> is_input

f13763e

comment on EngineClient's missing probs

e090eae

wav output

ff1c516

hudson-ai requested review from nking-1, Harsha-Nori and nopdive February 28, 2025 05:28

hudson-ai added 5 commits February 28, 2025 08:54

fix local image handling

5b3f6b8

fix shared mutable active blocks

2f15852

fix llamacpp chat template test

2024007

Merge branch 'main' into state_refactor

e94227e

test mis-spec, triggered failure when producing special tokens disallowed

8f7b94d
