Description
The Microsoft.Extensions.AI.Abstractions package includes a DataUriParser, which DataContent uses to parse data URIs in ChatMessages. A comment at the top of the class documents the parser as a minimal data URI parser based on RFC 2397.
However, it does not conform to RFC 2397 when the media type is omitted. According to the RFC, omitting the media type should default to text/plain;charset=US-ASCII. Instead, the parser throws an exception when the media type is missing.
This prevents valid RFC-compliant data URIs from being parsed successfully, causing errors.
RFC Reference
RFC 2397 states:
If <mediatype> is omitted, it defaults to text/plain;charset=US-ASCII.
Example of valid URI per RFC:
data:;base64,77u/QWER...
The link to this RFC is included in a comment inside the DataUriParser code itself:
https://datatracker.ietf.org/doc/html/rfc2397
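The defaulting rule quoted above can be sketched as follows. This is an illustrative helper only, not the library's DataUriParser; the class name `Rfc2397Sketch` and method `GetMediaType` are hypothetical, and the sketch only inspects the header portion before the first comma.

```csharp
using System;

// Hypothetical sketch of the RFC 2397 media type default; not part of Microsoft.Extensions.AI.
public static class Rfc2397Sketch
{
    public const string DefaultMediaType = "text/plain;charset=US-ASCII";

    public static string GetMediaType(string dataUri)
    {
        const string scheme = "data:";
        const string base64Marker = ";base64";

        if (!dataUri.StartsWith(scheme, StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Not a data URI.", nameof(dataUri));

        int comma = dataUri.IndexOf(',');
        if (comma < 0)
            throw new FormatException("A data URI must contain a comma.");

        // Header layout: [mediatype][;base64]
        string header = dataUri.Substring(scheme.Length, comma - scheme.Length);
        if (header.EndsWith(base64Marker, StringComparison.OrdinalIgnoreCase))
            header = header.Substring(0, header.Length - base64Marker.Length);

        // Nothing left: the media type was omitted entirely, so apply the RFC default.
        if (header.Length == 0)
            return DefaultMediaType;

        // Only parameters given (e.g. ";charset=utf-8"): the RFC lets "text/plain" be omitted.
        if (header[0] == ';')
            return "text/plain" + header;

        return header;
    }
}
```

Under these assumptions, `Rfc2397Sketch.GetMediaType("data:;base64,SGVsbG8=")` yields the default rather than an exception.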
Reproduction Steps
using Microsoft.Extensions.AI;
var uri = new Uri("data:;base64,SGVsbG8=");
var content = new DataContent(uri);
Expected behavior
The URI is parsed successfully, using the default media type text/plain;charset=US-ASCII when none is present between data: and ;base64.
Actual behavior
Instead, we are receiving an error: uri did not contain a media type, and mediaType was not provided. (Parameter 'mediaType')
The exception is thrown at DataContent.cs#L99.
Regression?
No response
Known Workarounds
Before passing the URI to the DataContent class, check whether the media type is missing between data: and ;base64 and, if so, insert text/plain, the RFC default.
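A minimal sketch of that workaround, operating on the URI string before it reaches DataContent. The class `DataUriWorkaround` and method `EnsureMediaType` are hypothetical names introduced here for illustration, not library APIs, and malformed inputs are passed through unchanged so the parser can report them.

```csharp
using System;

// Hypothetical helper; not part of Microsoft.Extensions.AI.
public static class DataUriWorkaround
{
    public static string EnsureMediaType(string dataUri, string defaultMediaType = "text/plain")
    {
        const string scheme = "data:";

        // Leave non-data URIs untouched.
        if (!dataUri.StartsWith(scheme, StringComparison.OrdinalIgnoreCase))
            return dataUri;

        int comma = dataUri.IndexOf(',');
        if (comma < 0)
            return dataUri; // malformed; let the real parser report the problem

        // Header layout: [mediatype][;base64]
        string header = dataUri.Substring(scheme.Length, comma - scheme.Length);

        // The media type is omitted when the header is empty or starts with a parameter.
        if (header.Length == 0 || header[0] == ';')
            return scheme + defaultMediaType + header + dataUri.Substring(comma);

        return dataUri;
    }
}
```

With this in place, the reproduction would become `new DataContent(new Uri(DataUriWorkaround.EnsureMediaType("data:;base64,SGVsbG8=")))`, which passes `data:text/plain;base64,SGVsbG8=` to the constructor.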
Configuration
.NET v10
C# 14
Microsoft.Extensions.AI v10.2.0
Other information
We discovered this because our frontend uses Dart's built-in data URI parser, which appears to conform to the RFC and therefore omits the text/plain media type when encoding .txt files. We would prefer not to patch in text/plain ourselves, and we expect DataUriParser to conform to RFC 2397 as its documentation states.