-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: System.Buffers.Text.Base64Url #66841
Comments
Tagging subscribers to this area: @dotnet/area-system-buffers Issue DetailsBackground and motivationAlong with the well-known and oft-loved base64 encoding, IETF RFC 4648 § 5 also defines "Base 64 Encoding with URL and Filename Safe Alphabet," or base64url encoding. Base64Url is identical to the more common Base64 except that two characters are replaced so that every character is suitable for use in an URL or a filename on any platform. (
My proposal is to add API Proposalnamespace System.Buffers.Text
{
public static class Base64Url
{
public static unsafe OperationStatus EncodeToUtf8(ReadOnlySpan<byte> bytes, Span<byte> utf8, out int bytesConsumed, out int bytesWritten, bool isFinalBlock = true);
public static unsafe OperationStatus EncodeToUtf8InPlace(Span<byte> buffer, int dataLength, out int bytesWritten);
public static int GetMaxEncodedToUtf8Length(int length);
}
} Note that the above is identical to the design of API Usagestring inputString = "Hi?>>> This is a test string";
byte[] input = System.Text.Encoding.ASCII.GetBytes(inputString);
byte[] output;
byte[] inputBack;
// Base64 version (exists today)
output = new byte[System.Buffers.Text.Base64.GetMaxEncodedToUtf8Length(input.Length)];
System.Buffers.Text.Base64.EncodeToUtf8(input, output, out int _, out int _);
inputBack = new byte[System.Buffers.Text.Base64.GetMaxDecodedFromUtf8Length(output.Length)];
System.Buffers.Text.Base64.DecodeFromUtf8(output, inputBack, out int _, out int _);
// Base64Url version
output = new byte[System.Buffers.Text.Base64Url.GetMaxEncodedToUtf8Length(input.Length)];
System.Buffers.Text.Base64Url.EncodeToUtf8(input, output, out int _, out int _);
inputBack = new byte[System.Buffers.Text.Base64Url.GetMaxDecodedFromUtf8Length(output.Length)];
System.Buffers.Text.Base64Url.DecodeFromUtf8(output, inputBack, out int _, out int _); Alternative DesignsThe main alternative design here would be to extend the existing In addition, this proposed design does not include an interface for RisksRisk for this change seems extremely low. I chose to extract most of the code from
|
Here's my proposed implementation, along with tests and such. |
Your proposed implementation only works for lengths multiple of 4, correct? so you don't need to handle the padding, which is usually stripped from base64url after encoding and calculated and added after decoding. |
This duplicates #1658
You can also use https://www.nuget.org/packages/gfoidl.Base64/ which has vectorized paths for base64url too. |
Sorry, I don't follow. Is that a comment about the original algorithm? I'm not proposing changing the existing algorithm in |
Oh, wow, I don't know how I didn't see that... punctuation differences, I guess. I went with this implementation (a separate class) because it seemed easier to understand and more likely to maintain the original code's performance characteristics. |
@gfoidl — Honest question: Is there something productive that I can do today to help with getting this in? It seems like the design I'm proposing here isn't the one that people are leaning toward, so I should probably close this issue. I could attempt to port the work you've done in your NuGet package to the .NET codebase, but I fear that would be more annoying to you than helpful. The one thing that I can think of that might be most helpful is if I took my existing implementation, refactored it to use an "alphabet" parameter instead of a new class name per the #1658 design, and then made a PR for that. That way an implementation would make it in for .NET 7, in case the other work slips. But I get the sense that there are still some unresolved questions about padding and so perhaps even doing that would not be helpful. (Sorry, I've never contributed to .NET before, and I'm still unfamiliar with how things work around here. 🙂) |
No. Just wait for an area-owner to take over. Have a look at API Review Process to see how things work here in regards to new APIs.
Please wait for the API to be reviewed and potentially approved, then a PR can be created.
If I were you, then I'd close this issue here, and comment on #1658 with your ideas on the API shape to avoid duplicated and diverging thoughts -- which will make API review harder as it's split accross issues. There one should focus only on the API shape.
No need for "sorry". It's good you ask question, so we try to bring some light on it. |
Thanks Günther! Closing this one in favor of that one. |
Background and motivation
Along with the well-known and oft-loved base64 encoding, IETF RFC 4648 § 5 also defines "Base 64 Encoding with URL and Filename Safe Alphabet," or base64url encoding. Base64Url is identical to the more common Base64 except that two characters are replaced so that every character is suitable for use in an URL or a filename on any platform. (
+
is replaced with-
, and/
is replaced with_
.)System.Buffers.Text.Base64
includes methods for encoding binary data into base64 and decoding from base64, including with AVX2 and SSSE3 CPU optimizations, but no base64url. The alternatives are to run the text throughstring.Replace
first, which is silly, or reimplement the entire encoding and decoding algorithm, which is whatMicrosoft.AspNetCore.WebUtilities.WebEncoders.Base64UrlEncode
,Microsoft.IdentityModel.Tokens.Base64UrlEncoder
, and countless other implementations have done.My proposal is to add
System.Buffers.Text.Base64Url
as a high-performance companion toSystem.Buffers.Text.Base64
, using the same optimized algorithms, but with alternate encoding and decoding tables.API Proposal
Note that the above is identical to the design of
System.Buffers.Text.Base64
.API Usage
Alternative Designs
The main alternative design here would be to extend the existing
System.Buffers.Text.Base64
methods with an extra optional parameter that lets the caller decide between Base64 and Base64Url encoding, but that seems unnecessarily cumbersome.In addition, this proposed design does not include an interface for
Base64
andBase64Url
to implement, though one could be added since their signatures are identical.Risks
Risk for this change seems extremely low. I chose to extract most of the code from
Base64
into a new private classBase64Internal
and thenBase64
andBase64Url
simply forward calls intoBase64Internal
. The performance impact should simply be one function call on the stack. The algorithms used in the existing implementation are unchanged. After compilation, the binary size should be only slightly larger: just a couple of class definitions, and six new encoding and decoding tables totaling 416 bytes.The text was updated successfully, but these errors were encountered: