Skip to content

Python: Add file handling support to BinaryContent for OpenAI Responses API #12258

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ltwlf
Copy link
Contributor

@ltwlf ltwlf commented May 23, 2025

Summary

Enhances BinaryContent to support file handling for OpenAI Responses API, enabling file uploads through the responses agent while maintaining a provider-agnostic design.

Changes

BinaryContent Enhancements

  • Add can_read property: Indicates whether content has readable data available
  • Add from_file() class method: Creates BinaryContent instances from file paths with automatic base64 encoding
  • Fix Unicode handling: Prevents decode errors when processing binary files (PDFs, images, etc.)

OpenAI Responses Agent Integration

  • Add BinaryContent support: Pattern matching case for file handling in responses_agent_thread_actions.py
  • Correct OpenAI API format: Uses proper filename and file_data structure with data URI format
  • UUID-based filenames: Generates appropriate filenames with mime-type extensions
  • Provider-specific mapping: File format conversion happens only in OpenAI agent code

Testing

  • Test coverage: New tests for can_read, from_file(), and binary data handling
  • Unicode error prevention: Specific test for binary PDF-like content
  • Base64 encoding verification: Ensures proper data format for API compatibility

Design Principles

  • Provider-agnostic: BinaryContent remains completely generic with no OpenAI-specific dependencies
  • Clean separation: OpenAI format mapping isolated to OpenAI agent files
  • No FileContent class: Enhances existing BinaryContent instead of introducing new types
  • Follows existing patterns: Similar approach to ImageContent and TextContent handling

Usage Example

response = await self.agent.get_response(
      messages=ChatMessageContent(
          content="Analyse PDF",
          role=AuthorRole("user"),
          items=[BinaryContent.from_file(file_path=c":/test.pdf")],
      )
  )

@markwallace-microsoft markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label May 23, 2025
@ltwlf ltwlf marked this pull request as ready for review May 23, 2025 15:03
@ltwlf ltwlf requested a review from a team as a code owner May 23, 2025 15:03
@ltwlf ltwlf changed the title Python: Add FileContent class with comprehensive file handling support Python: Add FileContent class for OpenAI Responses API file handling May 23, 2025
@eavanvalkenburg
Copy link
Member

Haven't looked yet, but why can't we use BinaryContent for this?

@ltwlf
Copy link
Contributor Author

ltwlf commented May 23, 2025

Binary is meant to be abstract (mentioned in code comments). ImageContent and FileContent derive from it. It also needed mapping to match open ai api schema spec for pdf file as base64. FileContent adapts the existing ImageContent implementation.

@ltwlf ltwlf force-pushed the feature/file-content branch 2 times, most recently from 2a0a1f7 to f53e50c Compare May 24, 2025 13:51
@moonbox3
Copy link
Contributor

moonbox3 commented May 26, 2025

Hi @ltwlf, can you please check your Ruff settings? There are a lot of extra line adds and such in this PR, that aren't necessary -- most of the initial changes in I see responses_agent_thread_actions for example. It looks like you have some settings overriding our pyproject.toml Ruff settings that configure the line-length as 120?

The current code has the following following 99 chars (allowed because < 120):

from openai.types.responses.response_content_part_added_event import ResponseContentPartAddedEvent

and you're trying to commit the change that brings the first import line to 71 chars:

from openai.types.responses.response_content_part_added_event import (
    ResponseContentPartAddedEvent,
)

@ltwlf ltwlf force-pushed the feature/file-content branch 9 times, most recently from 991b27a to 3da6b35 Compare May 26, 2025 07:31
@ltwlf
Copy link
Contributor Author

ltwlf commented May 26, 2025

@moonbox3 thanks for the feedback and you are right about the unnecessary formatting changes.

I tracked down the issue - the root .vscode/settings.json was using autopep8 instead of ruff, which caused imports to be formatted differently than the project's pyproject.toml configuration.

I've fixed the VSCode settings and reverted all the unrelated formatting changes. The PR now only contains the minimal changes needed for FileContent functionality, and future contributions won't have this formatting issue.

@eavanvalkenburg
Copy link
Member

@ltwlf we had a chat internally, we will eventually align the design of this with the dotnet Microsoft.Extensions.AI design, which only has BinaryContent without subclasses, so we would prefer to update BinaryContent instead of introducing something new that we then have to remove again, could you adapt this PR to do that (or create a new one)? Should be mostly adapting some of the docstrings in binarycontent and then using that instead of the filecontent for OpenAI Responses! Thanks for your efforts though, much appreciated!

@ltwlf ltwlf force-pushed the feature/file-content branch 2 times, most recently from 73a0e0d to 9558303 Compare May 26, 2025 07:48
@ltwlf
Copy link
Contributor Author

ltwlf commented May 26, 2025

@eavanvalkenburg thanks for the feedback! To clarify, are you suggesting that the binary type should handle OpenAI file data by default, with ImageContent inheriting from binary and overriding methods specifically for image handling?

@ltwlf ltwlf force-pushed the feature/file-content branch 2 times, most recently from 196c926 to bc8e221 Compare May 26, 2025 08:46
@eavanvalkenburg
Copy link
Member

@ltwlf exactly, eventually we might remove the ImageContent altogether, but that is breaking so we won't any time soon.

Copy link
Member

@eavanvalkenburg eavanvalkenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of small comments, overall it looks good. Do fix the linting settings because a lot of changes come from that which is annoying!

@ltwlf ltwlf force-pushed the feature/file-content branch from bc8e221 to 2efa0f5 Compare May 26, 2025 08:54
@ltwlf ltwlf marked this pull request as draft May 26, 2025 08:57
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented May 26, 2025

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
agents/open_ai
   responses_agent_thread_actions.py36412665%169, 186, 197, 208–209, 376–378, 392, 397, 404, 422–423, 427, 471–473, 476, 494–495, 504–506, 510, 514–515, 519–521, 532–534, 556, 558–559, 566–567, 569–570, 572, 574–575, 577–578, 661–663, 675–676, 678–682, 684–690, 692–701, 703–704, 706–709, 716–718, 723–724, 726–731, 733, 737–738, 740, 744, 748, 753–754, 759, 762–763, 768, 770, 829–832, 873–874, 878–880, 882, 907–908, 910–914, 916, 926–927, 935, 939, 946, 952, 1016
contents
   binary_content.py1331588%99, 120, 128, 134, 147, 152–153, 204–205, 235–240
TOTAL27052459983% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
3624 18 💤 0 ❌ 0 🔥 1m 50s ⏱️

@ltwlf ltwlf force-pushed the feature/file-content branch 3 times, most recently from 8d1f9f6 to fbc1c12 Compare May 26, 2025 09:36
Copy link
Member

@eavanvalkenburg eavanvalkenburg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for the effort!

@ltwlf ltwlf force-pushed the feature/file-content branch 2 times, most recently from fb0c094 to 11ae8bf Compare May 26, 2025 09:44
@ltwlf ltwlf changed the title Python: Add FileContent class for OpenAI Responses API file handling Python: Add file handling support to BinaryContent for OpenAI Responses API May 26, 2025
@ltwlf ltwlf marked this pull request as ready for review May 26, 2025 09:51
@ltwlf
Copy link
Contributor Author

ltwlf commented May 26, 2025

@eavanvalkenburg now it is ready to review :)

@ltwlf ltwlf force-pushed the feature/file-content branch from 11ae8bf to 8f2363f Compare May 26, 2025 09:56
@ltwlf ltwlf force-pushed the feature/file-content branch 2 times, most recently from bf8ef4b to c13deb9 Compare May 27, 2025 08:32
@ltwlf ltwlf force-pushed the feature/file-content branch from c13deb9 to 1eed7a9 Compare May 27, 2025 15:59
@ltwlf ltwlf requested a review from moonbox3 May 27, 2025 22:33
@ltwlf ltwlf force-pushed the feature/file-content branch 2 times, most recently from e55e7b4 to 8e1af11 Compare May 28, 2025 14:34
@ltwlf
Copy link
Contributor Author

ltwlf commented May 28, 2025

I fixed some lint issues. The pre-commit hooks in my project weren’t working, so I reinstalled them.

@ltwlf ltwlf force-pushed the feature/file-content branch from 8e1af11 to 39f5a77 Compare May 29, 2025 06:08
@ltwlf
Copy link
Contributor Author

ltwlf commented May 30, 2025

@moonbox3 – I’ve added the examples you requested. Could you take another look when you have a moment? I’m a bit concerned we might run into merge conflicts if this PR stays open much longer. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation python Pull requests for the Python Semantic Kernel
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants