Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Dec 16, 2025

  • Create MediaTypeMap polyfill in LegacySupport directory with System.Net.Mime namespace
  • Add static LoadFromAsync methods to DataContent class:
    • Stream overload that accepts a Stream (infers name and mime type from FileStream's path if applicable)
    • String path overload that infers name and mime type from the file path
    • Pre-initialize MemoryStream capacity when source stream is seekable for performance
  • Add instance SaveToAsync method to DataContent class:
    • Handles directory paths by inferring filename from Name property or using random name
    • Always infers extension from media type if path has none
    • Returns the actual path where data was saved
    • Fix path traversal security issue by using Path.GetFileName(Name)
    • Use FileMode.CreateNew to prevent accidentally overwriting existing files
    • Document IOException when file already exists
  • Remove [Experimental] attribute and add methods to JSON API baseline as stable
  • Use GetBuffer() instead of ToArray() for MemoryStream efficiency
  • Add WriteAllBytesAsync(ReadOnlyMemory<byte>) polyfill to remove #ifdef from SaveToAsync
  • Change return types from Task<T> to ValueTask<T> for all new public async methods
  • Add C# 14 extension member polyfills for File.ReadAllBytesAsync and File.WriteAllBytesAsync in LegacySupport directory
  • Remove name parameter from LoadFromAsync(Stream) - Name can be assigned after the call
  • Add comprehensive unit tests for the new functionality
    • Add test for SaveToAsync with existing file to verify IOException is thrown and original content is preserved
    • Remove redundant tests per review feedback
    • Fix test names to accurately describe what they test
    • Use content.Name instead of hardcoding filename in assertions
    • Add tests for relative path scenarios (filename only, relative directory, empty string path)
  • Fix net462 build: Add InjectFilePolyfillsOnLegacy to test project
  • Add test for SaveToAsync with empty string path
  • Simplify extension inference by removing unnecessary null check
  • Update MarkItDownMcpReader.cs to use DataContent.LoadFromAsync with optional media type parameter
  • Add Microsoft.Extensions.AI.Abstractions project reference to MarkItDown project for LoadFromAsync access
  • Sync MediaTypeMap with latest .NET runtime source (added missing .cgm/image/cgm mapping)
  • Run code review and security checks
  • Switch components to use MediaTypeMap for extension/MIME type mappings:
    • ImageGeneratorExtensions.cs - Replace _extensionToMimeType dictionary with MediaTypeMap.GetMediaType
    • OpenAIClientExtensions.cs - Replace ImageUriToMediaType with MediaTypeMap.GetMediaType
    • OpenAIImageGenerator.cs - Replace _mimeTypeToExtension dictionary with MediaTypeMap.GetExtension
    • IngestionDocumentReader.cs - Remove custom lookup, only use MediaTypeMap.GetMediaType
Original prompt

This section details on the original issue you should resolve

<issue_title>We should have helper methods / extensions for DataContent</issue_title>
<issue_description>At a minimum we should have Load and Save.

Load should read from a path and preserve the file name loaded and try to decode a mime-type from that.

Save should save to a path, with an optional filename, using Name if a directory is provided in the path.

Perhaps we could consider a filename without extension to be provided and we infer the extension from the mime-type and return the saved path.

We should also have Stream based helpers.</issue_description>

<agent_instructions>Add static LoadFromAsync and instance SaveToAsync methods to DataContent. There should be a string path overload for loading that infers the name and mime type from the path (it should add a polyfill to common for https://github.com/dotnet/runtime/blob/main/src/libraries/System.Net.Mail/src/System/Net/Mime/MediaTypeMap.cs and use that to infer the mime type). There should also be a stream based one that should accept a Stream. If it's actually a FileStream it should default to inferring name and mime type from the filestream's path. It should fall back to octet stream if unknown. SaveAsync should accept a required path to save to; it should accept an optional bool parameter defaulting to true for inferExtension that will add an extension to the provided path if it doesn't have one, based on the mime type and MediaTypemap. It should return the path to which the data was saved. </agent_instructions>

Comments on the Issue (you are @copilot in this section)


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

Microsoft Reviewers: Open in CodeFlow

Copilot AI changed the title [WIP] Add helper methods for DataContent with Load and Save Add LoadFromAsync and SaveToAsync helper methods to DataContent Dec 16, 2025
Copilot AI requested a review from stephentoub December 16, 2025 16:20
Copy link
Member

@stephentoub stephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Copilot AI commented Dec 16, 2025

@copilot

All review comments addressed in 840f61c. Changes include: MediaTypeMap moved to LegacySupport with full mappings, methods are now stable (not experimental), DefaultMediaType moved to DataContent as private const, using GetBuffer() for MemoryStream, proper null handling for Name, SaveToAsync handles directory paths and always infers extensions, and NET9_0_OR_GREATER conditional for ReadOnlyMemory overload.

Copilot AI requested a review from stephentoub December 16, 2025 17:15
@stephentoub stephentoub marked this pull request as ready for review December 16, 2025 17:16
@stephentoub stephentoub requested review from a team as code owners December 16, 2025 17:16
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds helper methods to DataContent for loading from and saving to file system paths and streams, with automatic media type (MIME type) inference based on file extensions.

Key Changes

  • Adds static LoadFromAsync methods (path and stream overloads) that infer media types and names from file extensions
  • Adds instance SaveToAsync method that handles directory paths and extension inference
  • Implements MediaTypeMap polyfill for .NET 9 and earlier to enable file extension-to-MIME type mapping
  • Marks the new APIs as stable (non-experimental) in the JSON baseline

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/Libraries/Microsoft.Extensions.AI.Abstractions/Contents/DataContent.cs Implements the new LoadFromAsync and SaveToAsync methods with media type inference logic
src/LegacySupport/MediaTypeMap/MediaTypeMap.cs Adds polyfill implementation of media type mapping for pre-.NET 10 frameworks
src/LegacySupport/MediaTypeMap/README.md Documents the MediaTypeMap polyfill purpose and origin
eng/MSBuild/LegacySupport.props Configures MSBuild to inject MediaTypeMap polyfill for legacy target frameworks
src/Libraries/Microsoft.Extensions.AI.Abstractions/Microsoft.Extensions.AI.Abstractions.csproj Enables MediaTypeMap injection for the abstractions project
src/Libraries/Microsoft.Extensions.AI.Abstractions/Microsoft.Extensions.AI.Abstractions.json Adds the new methods to the stable API baseline
test/Libraries/Microsoft.Extensions.AI.Abstractions.Tests/Contents/DataContentTests.cs Adds comprehensive unit tests covering all new functionality and edge cases

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the rest of the tests for accuracy / redundancy.

@stephentoub
Copy link
Member

@copilot please address all the tests feedback

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Jan 23, 2026

@copilot please address all the tests feedback

Fixed in 5267589. Addressed all test feedback:

  • Renamed SaveToAsync_WithEmptyPath_UsesCurrentDirectory to SaveToAsync_WithJustFilename_SavesInCurrentDirectory and made it actually test saving with just a filename
  • Removed redundant tests (LoadFromAsync_AbsolutePath_LoadsCorrectly, LoadFromAsync_RelativePath_LoadsCorrectly, SaveToAsync_AbsolutePath_SavesCorrectly, SaveToAsync_ToSubdirectory_SavesCorrectly, SaveToAsync_ToNestedSubdirectory_SavesCorrectly, SaveToAsync_EmptyPath_WithoutName_GeneratesRandomName)
  • Renamed LoadFromAsync_RelativePathWithSubdirectory_LoadsCorrectly to LoadFromAsync_ExtractsFilenameFromPath with accurate description
  • Used content.Name instead of hardcoding filename in SaveToAsync_ExistingDirectory_WithName_UsesNameAsFilename

@stephentoub stephentoub requested a review from ericstj January 23, 2026 20:45
@stephentoub stephentoub enabled auto-merge (squash) January 23, 2026 21:20
@ericstj
Copy link
Member

ericstj commented Jan 23, 2026

Looking now

Copy link
Member

@ericstj ericstj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is good, just had a suggesting to improve test coverage.

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
auto-merge was automatically disabled January 23, 2026 23:04

Head branch was pushed to by a user without write access

@stephentoub
Copy link
Member

@copilot is there a test for saving when there's no file name or directory specified (and it should use current directory with name from content)?

Copy link
Contributor Author

Copilot AI commented Jan 24, 2026

@copilot is there a test for saving when there's no file name or directory specified (and it should use current directory with name from content)?

Yes, SaveToAsync_WithEmptyPath_UsesCurrentDirectoryAndContentName (lines 621-646) tests this case. It passes string.Empty to SaveToAsync and verifies that the file is saved to the current directory using content.Name as the filename.

@stephentoub stephentoub merged commit d9ddaaa into main Jan 24, 2026
7 checks passed
@stephentoub stephentoub deleted the copilot/add-helper-methods-data-content branch January 24, 2026 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-ai Microsoft.Extensions.AI libraries

Projects

None yet

Development

Successfully merging this pull request may close these issues.

We should have helper methods / extensions for DataContent

5 participants