Extensions.AI: DataUriParser does not honour RFC 2397 default behaviour when media type omitted #7246

@Millmer

Description

The Microsoft.Extensions.AI.Abstractions package includes a DataUriParser, which DataContent uses to parse data URIs carried in ChatMessages. The parser is documented as a minimal data URI parser based on RFC 2397 (see the comment at the top of the class).

However, it does not conform to RFC 2397 when the media type is omitted. According to the RFC, omitting the media type should default to text/plain;charset=US-ASCII. Instead, the parser throws an exception when the media type is missing.

This prevents valid RFC-compliant data URIs from being parsed successfully, causing errors.

RFC Reference

RFC 2397 states:

If <mediatype> is omitted, it defaults to text/plain;charset=US-ASCII.

Example of valid URI per RFC:

data:;base64,77u/QWER...

The link to this very RFC doc is included in the comment inside the DataUriParser code itself:
https://datatracker.ietf.org/doc/html/rfc2397
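
To make the defaulting rule concrete, here is a minimal sketch of how a parser could apply it. `ParseDataUri` is a hypothetical helper written for illustration, not the library's DataUriParser, and it only splits the URI without decoding the payload. It also covers the RFC's shorthand where text/plain is omitted but parameters such as charset are still given.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of Microsoft.Extensions.AI): splits a data URI
// into its media type, base64 flag, and raw payload, applying RFC 2397 defaults.
static (string MediaType, bool IsBase64, string Payload) ParseDataUri(string uri)
{
    const string Scheme = "data:";
    if (!uri.StartsWith(Scheme, StringComparison.OrdinalIgnoreCase))
        throw new FormatException("Not a data URI.");

    // Everything between "data:" and the first ',' is the header.
    int comma = uri.IndexOf(',');
    if (comma < 0)
        throw new FormatException("Data URI has no ',' separator.");

    string header = uri.Substring(Scheme.Length, comma - Scheme.Length);
    string payload = uri.Substring(comma + 1);

    // A trailing ";base64" marks the payload encoding and is not part of the media type.
    const string Base64Suffix = ";base64";
    bool isBase64 = header.EndsWith(Base64Suffix, StringComparison.OrdinalIgnoreCase);
    if (isBase64)
        header = header.Substring(0, header.Length - Base64Suffix.Length);

    // RFC 2397: an omitted media type defaults to text/plain;charset=US-ASCII;
    // if only parameters are present (header starts with ';'), text/plain is implied.
    string mediaType =
        header.Length == 0 ? "text/plain;charset=US-ASCII"
        : header[0] == ';' ? "text/plain" + header
        : header;

    return (mediaType, isBase64, payload);
}

var parsed = ParseDataUri("data:;base64,SGVsbG8=");
string text = Encoding.ASCII.GetString(Convert.FromBase64String(parsed.Payload));
Console.WriteLine($"{parsed.MediaType} -> {text}");
// prints: text/plain;charset=US-ASCII -> Hello
```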

Reproduction Steps

using Microsoft.Extensions.AI;

var uri = new Uri("data:;base64,SGVsbG8=");

var content = new DataContent(uri);

Expected behavior

The URI parses successfully, using the default media type text/plain;charset=US-ASCII when none is present between the data: prefix and the ;base64 token.

Actual behavior

Instead, we are receiving an error: uri did not contain a media type, and mediaType was not provided. (Parameter 'mediaType')

The exception is thrown at DataContent.cs#L99.

Regression?

No response

Known Workarounds

Before passing the URI to the DataContent class, check whether the media type is missing between the data: prefix and the ;base64 token, and if so insert text/plain as per the RFC.
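
A minimal sketch of that workaround, assuming the URI is available as a string before DataContent sees it. `EnsureMediaType` is a hypothetical helper name, not an API of Microsoft.Extensions.AI.

```csharp
using System;

// Hypothetical helper (not part of Microsoft.Extensions.AI): if a data URI has an
// empty media type, insert a default one right after the "data:" scheme.
static string EnsureMediaType(string dataUri, string defaultMediaType = "text/plain")
{
    const string Scheme = "data:";
    if (!dataUri.StartsWith(Scheme, StringComparison.OrdinalIgnoreCase))
        return dataUri; // not a data URI, leave untouched

    // The media type is empty when the scheme is followed directly by ';' or ','.
    if (dataUri.Length > Scheme.Length && (dataUri[Scheme.Length] is ';' or ','))
        return dataUri.Insert(Scheme.Length, defaultMediaType);

    return dataUri;
}

Console.WriteLine(EnsureMediaType("data:;base64,SGVsbG8="));
// prints: data:text/plain;base64,SGVsbG8=
```

The result can then be passed on, for example as `new DataContent(new Uri(EnsureMediaType(raw)))`, where `raw` is the incoming URI string. URIs that already name a media type are returned unchanged.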

Configuration

.NET v10
C# 14
Microsoft.Extensions.AI v10.2.0

Other information

We discovered this because our frontend uses Dart's built-in data URI support, which appears to follow the RFC and therefore omits the text/plain media type when handling .txt files. We would prefer not to patch in text/plain ourselves, and expect DataUriParser to conform to RFC 2397 as its documentation says.

Metadata

Labels

area-ai: Microsoft.Extensions.AI libraries
bug: This issue describes a behavior which is not expected - a bug.
