-
Notifications
You must be signed in to change notification settings - Fork 2k
Review API design of org.springframework.ai.image (#326) #4736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…nseFormat (spring-projects#326) Signed-off-by: Kuntal Maity <kuntal.1461@gmail.com>
…nseFormat (spring-projects#326) Signed-off-by: Kuntal Maity <kuntal.1461@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I would typically be in favor of such strongly typed changes, I wonder if this is really desirable.
Do all providers really agree on values? For example, you changed one test value from base64 to b64_json. Was it intentional? Or was it an oversight exactly because a strong type gives you confidence?
Moreover, do all providers really support (and will they always) support all values of the enum?
| .width(1920) | ||
| .style("sketch") | ||
| .responseFormat("base64") | ||
| .responseFormat(ImageResponseFormat.B64_JSON) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like a change of semantics here
| @Override | ||
| public String getResponseFormat() { | ||
| return this.responseFormat; | ||
| return (this.responseFormat != null) ? this.responseFormat.getValue() : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming we carry through with this change (see my overall comment on the benefits of this PR), could we try to push the null check inside the enum (for example having ImageResponseFormat.NULL a value, and setting all options to that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ericbottard I wouldn’t introduce a ImageResponseFormat.NULL sentinel. Today, if the caller never sets a response format we just return null, and Jackson simply omits response_format from the payload—which is exactly what the provider expects. If we replaced that with an enum constant whose getValue() returns "null", we’d end up serializing response_format: "null" unless every caller remembered to special-case the sentinel, which is both error-prone and not part of the provider’s spec. Keeping the null check where we materialize the request is much clearer; if we want to tighten the API surface, a better direction would be to expose an Optional rather than distort the enum domain.
Can you confirm if this approach matches what you’d expect?
|
Hi @ericbottard Thanks for calling that out—yes, the base64 → b64_json change was intentional. OpenAI (and Azure’s wrapper) return the literal b64_json token, so the enum would never round‑trip if we left the old string. The enum isn’t asserting that every provider supports every value; each integration only ever sets the members its API understands (e.g. Stability AI uses the PNG/JSON entries, OpenAI uses URL/b64_json). I do share the concern about future formats—if someone ships a new token we’ll need to extend the enum quickly—but the stronger typing gives us an earlier signal when that happens. OpenAI’s image API docs list the response_format choices explicitly: url or b64_json. You can see it here (scroll to the request parameters table): https://platform.openai.com/docs/api-reference/images/create Azure OpenAI mirrors OpenAI’s parameters. The CreateImage endpoint documents response_format taking url or b64_json as well: https://learn.microsoft.com/azure/ai-services/openai/reference#image-generation-create. |
Summary
This PR contributes to issue #326 by improving the API design of org.springframework.ai.image:
Introduces a new ImageResponseFormat enum for strongly typed response formats.
Aligns ImageOptions across providers (OpenAI, Azure OpenAI, Stability, ZhiPuAi).
Updates tests accordingly.
Enhances documentation to reflect the new type-safe format property.
Ensures observation and response metadata consistency.
This aligns image models with improved usability and type safety.
Testing