Use RecyclableMemoryStream #16949

Open
wants to merge 16 commits into
base: main

Conversation

MikeAlhayek
Member

Fix #16946

@sebastienros I am curious to see if RecyclableMemoryStream will have an impact on the project. If you have time to do a comparison, that would be awesome.

Related Work Items: #169
@hishamco
Member

hishamco commented Nov 4, 2024

I was waiting for Seb's reply on the issue so that the author could propose the PR, but it seems you did it quickly :)

Member

@sebastienros sebastienros left a comment

I am very familiar with this API and I don't think we have many actual use cases for it, at least not as a replacement where we already use MemoryStream. I can explain at Tuesday's meeting.

Member

@sebastienros sebastienros left a comment

Some comments but LGTM

@@ -0,0 +1,11 @@
using Microsoft.IO;
Member

Do we have an actual infrastructure package that is not "abstractions", or a common project where we put helpers?

Member Author

There is an Infrastructure project, but not all of these projects depend on Infrastructure; they depend on `Infrastructure.Abstractions`. I am also not happy to add this to the abstractions project. I am open to a better placement.

Contributor

github-actions bot commented Nov 7, 2024

This pull request has merge conflicts. Please resolve those before requesting a review.

@@ -57,37 +59,24 @@ internal static bool IsCompressed(byte[] data)
return false;
}

- internal static byte[] Compress(byte[] data)
+ internal static ReadOnlySpan<byte> Compress(byte[] data)
Member

This won't work.

The memory is taken from the pool on GetStream() and written into by CopyTo(). And you correctly get the contiguous buffer with GetBuffer().Slice().

However, by returning this slice to the caller, you also dispose the stream. At that point the pooled buffer is returned to the pool for another client to use. If you are the only one on the machine, you won't see an issue. But if another request rents memory from the pool, it will write into the same place you are reading the compressed data from.

The solution here is to pass the stream to this method and have the caller get it from MemoryStreamFactory: Compress(byte[] data, Stream destinationStream)
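
A minimal sketch of that shape, for illustration only. It is not the code from this PR: the class name `GzipSketch` is hypothetical, and a plain `MemoryStream` stands in for the pooled stream the caller would get from `MemoryStreamFactory`. The point is that the caller owns the destination stream, so the compressed bytes are read while the stream (and any pooled buffer behind it) is still alive.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipSketch
{
    // Writes the gzip-compressed form of `data` into a stream owned by the caller.
    // Nothing backed by a pooled buffer is returned, so nothing can outlive the pool rental.
    public static void Compress(byte[] data, Stream destinationStream)
    {
        // leaveOpen: true lets disposing the GZipStream flush the gzip footer
        // without closing the caller's stream.
        using var gzip = new GZipStream(destinationStream, CompressionMode.Compress, leaveOpen: true);
        gzip.Write(data, 0, data.Length);
    }

    // Reads the compressed bytes back from `source` starting at its current position.
    public static byte[] Decompress(Stream source)
    {
        using var gzip = new GZipStream(source, CompressionMode.Decompress, leaveOpen: true);
        using var output = new MemoryStream();
        gzip.CopyTo(output);
        return output.ToArray();
    }

    public static void Main()
    {
        var payload = Encoding.UTF8.GetBytes("hello hello hello");

        // Assumption: in the PR this would be a pooled stream from MemoryStreamFactory.
        using var destination = new MemoryStream();
        Compress(payload, destination);

        // The compressed data is consumed here, before `destination` is disposed.
        destination.Position = 0;
        Console.WriteLine(Encoding.UTF8.GetString(Decompress(destination)));
    }
}
```

With a pooled stream, disposing `destination` at the end of the caller's scope is what returns the buffer to the pool, which is why the read has to happen before that point.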

Member Author

I updated the code, please check it out again

Comment on lines 14 to 22
var neededLength = GetByteArrayLengthFromBase64(encoded);
Span<byte> bytes = new byte[neededLength];
if (!Convert.TryFromBase64String(encoded, bytes, out var bytesWritten))
{
throw new FormatException("Invalid Base64 string.");
}
return Encoding.UTF8.GetString(bytes);
Member

Suggested change
- var neededLength = GetByteArrayLengthFromBase64(encoded);
- Span<byte> bytes = new byte[neededLength];
- if (!Convert.TryFromBase64String(encoded, bytes, out var bytesWritten))
- {
-     throw new FormatException("Invalid Base64 string.");
- }
- return Encoding.UTF8.GetString(bytes);
+ var neededLength = GetByteArrayLengthFromBase64(encoded);
+ using var memoryStream = MemoryStreamFactory.GetStream(neededLength);
+ var span = memoryStream.GetSpan(neededLength);
+ if (!Convert.TryFromBase64String(encoded, span, out var bytesWritten))
+ {
+     throw new FormatException("Invalid Base64 string.");
+ }
+ return Encoding.UTF8.GetString(span.Slice(0, bytesWritten));
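
To illustrate why the suggestion slices by `bytesWritten`, here is a small sketch that uses only the base class library. The class name `Base64Sketch` is hypothetical, and a plain array stands in for the pooled span. The length formula is an upper bound: for padded input the decoder writes fewer bytes than the buffer holds, so decoding the whole buffer would append NUL characters to the string.

```csharp
using System;
using System.Text;

public static class Base64Sketch
{
    // Decodes a Base64 string to UTF-8 text, using only the bytes actually written.
    public static string Decode(string encoded)
    {
        // Upper bound on the decoded size; padding ('=') makes the real size smaller.
        Span<byte> buffer = new byte[(encoded.Length * 3) / 4];

        if (!Convert.TryFromBase64String(encoded, buffer, out var bytesWritten))
        {
            throw new FormatException("Invalid Base64 string.");
        }

        // Slice to bytesWritten so unused trailing bytes are not decoded as '\0'.
        return Encoding.UTF8.GetString(buffer.Slice(0, bytesWritten));
    }

    // Same decoding without the slice, kept only to show the difference.
    public static string DecodeWithoutSlice(string encoded)
    {
        Span<byte> buffer = new byte[(encoded.Length * 3) / 4];
        if (!Convert.TryFromBase64String(encoded, buffer, out _))
        {
            throw new FormatException("Invalid Base64 string.");
        }
        return Encoding.UTF8.GetString(buffer);
    }

    public static void Main()
    {
        // "aGk=" is the Base64 form of "hi": 4 characters, a 3-byte buffer, 2 bytes written.
        Console.WriteLine(Decode("aGk=").Length);             // 2
        Console.WriteLine(DecodeWithoutSlice("aGk=").Length); // 3 (includes a trailing '\0')
    }
}
```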

Member Author

@sebastienros look at the helpers that were added. They now do the same logic as this. Do we need to pass neededLength to both GetStream() and GetSpan()? Maybe only to GetSpan(), right?

@@ -11,7 +11,15 @@ public class CommonGeneratorMethods : IGlobalMethodProvider
Name = "base64",
Method = serviceProvider => (Func<string, string>)(encoded =>
{
- return Encoding.UTF8.GetString(Convert.FromBase64String(encoded));
+ var neededLength = GetByteArrayLengthFromBase64(encoded);
+ Span<byte> bytes = new byte[neededLength];
Member

This would be the same as the code on the left. But the code on the left is one line.

using var gzip = new GZipStream(new MemoryStream(bytes), CompressionMode.Decompress);
var neededLength = GetByteArrayLengthFromBase64(encoded);
Span<byte> bytes = new byte[neededLength];
Member

Update with the GetSpan example

Member Author

I created a reusable helper for this logic.

throw new FormatException("Invalid Base64 string.");
}
using var stream = MemoryStreamFactory.GetStream();
Member

And when using GetSpan this becomes unnecessary because the conversion is done directly on the stream.

// The formula to calculate the number of bytes from the base64 string length is:
private static int GetByteArrayLengthFromBase64(string base64String)
{
return (base64String.Length * 3) / 4;
Member

NB: it's important to do the multiplication first
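
A short sketch of why the order matters under C# integer arithmetic. The class and method names are hypothetical, chosen only for this illustration. Integer division truncates, so dividing before multiplying discards the remainder early (or collapses to zero when the constant ratio is evaluated first).

```csharp
using System;

public static class Base64LengthSketch
{
    // Multiplication first: the form used in the PR.
    public static int MultiplyFirst(int length) => (length * 3) / 4;

    // Division first: truncates length / 4 before scaling.
    public static int DivideFirst(int length) => (length / 4) * 3;

    // Ratio first: 3 / 4 is integer division and evaluates to 0.
    public static int RatioFirst(int length) => length * (3 / 4);

    public static void Main()
    {
        const int length = 6;
        Console.WriteLine(MultiplyFirst(length)); // 4
        Console.WriteLine(DivideFirst(length));   // 3
        Console.WriteLine(RatioFirst(length));    // 0
    }
}
```

For lengths that are a multiple of 4 (padded Base64), `MultiplyFirst` and `DivideFirst` agree; they differ only for other lengths, while `RatioFirst` is always 0.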


public static class Str
{
public static ReadOnlySpan<byte> FromBase64String(string base64)
Member

Can't do that. I explained earlier.


- return output.ToArray();
+ return (int)gZip.Length;
Member

I don't think it's necessary or useful. The stream has a Length property.
We talked about needing to know how much was written only for the GetSpan case.


Successfully merging this pull request may close these issues.

Use Microsoft.IO.RecyclableMemoryStream
3 participants