-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: Add Stream
wrappers for memory and text-based types
#82801
Comments
Tagging subscribers to this area: @dotnet/area-system-memory Issue DetailsBackground and motivationThis is meant to bundle #46663, #27156, and #22838. There's the usual need of wrapping memory and text-based types in a Stream and given that there are multiple implementations spread across different libraries, it would be ideal to have a standardized API for it directly on .NET. A few examples of the current options available: Nerdbank.Streams also provides AsStream() extensions for the following: While I'm not certain of providing support for all the types listed above, I think a good starting point would be the following based on the evidence shown in the issues I bundled. API Proposalnamespace System.IO
{
public static class StreamTextExtensions
{
public static Stream AsStream(this ReadOnlyMemory<char> instance, Encoding encoding) { }
public static Stream AsStream(this string instance, Encoding encoding) { }
}
public static class StreamMemoryExtensions
{
public static Stream AsStream(this Memory<byte> instance) { }
public static Stream AsStream(this ReadOnlyMemory<byte> instance) { }
public static Stream AsStream(this ReadOnlySequence<byte> instance){ }
}
} API UsageExample 1: static HttpClient client = new HttpClient();
static void SendString(string str)
{
var request = new HttpRequestMessage(HttpMethod.Post, "contoso.com");
request.Content = new StreamContent(str.AsStream());
client.Send(request);
} Example 2: static unsafe void DoStreamOverSpan(ReadOnlySpan<byte> span)
{
fixed (byte* ptr = &MemoryMarshal.GetReference(span))
{
using (MemoryManager<byte> manager = new PointerMemoryManager<byte>(ptr, span.Length))
{
using Stream s = ((ReadOnlyMemory<byte>)manager.Memory).AsStream();
// Do something with it...
}
}
} Alternative DesignsWe can think of providing explicit types instead of simply returning Stream, but I like this way more since the Stream types would be internal and could make it more flexible and maintainable for future improvements e.g: we could start returning a RisksNo response
|
Tagging subscribers to this area: @dotnet/area-system-io Issue DetailsBackground and motivationThis is meant to bundle #46663, #27156, and #22838. There's the usual need of wrapping memory and text-based types in a Stream and given that there are multiple implementations spread across different libraries, it would be ideal to have a standardized API for it directly on .NET. A few examples of the current options available: Nerdbank.Streams also provides AsStream() extensions for the following: While I'm not certain of providing support for all the types listed above, I think a good starting point would be the following based on the evidence shown in the issues I bundled. API Proposalnamespace System.IO
{
public static class StreamTextExtensions
{
public static Stream AsStream(this ReadOnlyMemory<char> instance, Encoding encoding) { }
public static Stream AsStream(this string instance, Encoding encoding) { }
}
public static class StreamMemoryExtensions
{
public static Stream AsStream(this Memory<byte> instance) { }
public static Stream AsStream(this ReadOnlyMemory<byte> instance) { }
public static Stream AsStream(this ReadOnlySequence<byte> instance){ }
}
} API UsageExample 1: static HttpClient client = new HttpClient();
static void SendString(string str)
{
var request = new HttpRequestMessage(HttpMethod.Post, "contoso.com");
request.Content = new StreamContent(str.AsStream());
client.Send(request);
} Example 2: static unsafe void DoStreamOverSpan(ReadOnlySpan<byte> span)
{
fixed (byte* ptr = &MemoryMarshal.GetReference(span))
{
using (MemoryManager<byte> manager = new PointerMemoryManager<byte>(ptr, span.Length))
{
using Stream s = ((ReadOnlyMemory<byte>)manager.Memory).AsStream();
// Do something with it...
}
}
} Alternative DesignsWe can think of providing explicit types instead of simply returning Stream, but I like this way more since the Stream types would be internal and could make it more flexible and maintainable for future improvements e.g: we could start returning a RisksNo response
|
I don't think that making a stream from a string (or byte buffer) is common enough to warrant an extension method. I'd rather see it as static non-extension methods on Stream, such as public partial class Stream
{
public static Stream FromText(string text, Encoding encoding);
public static Stream FromText(ReadOnlyMemory<char> text, Encoding encoding);
public static Stream FromData(byte[] data);
public static Stream FromData(ReadOnlyMemory<byte> data);
public static Stream FromData(ReadOnlySequence<byte> data);
} |
I don't know how common is too common but the top StackOverflow result has 1.2k upvotes in the top answer and is quite suboptimal. public static Stream GenerateStreamFromString(string s)
{
var stream = new MemoryStream();
var writer = new StreamWriter(stream);
writer.Write(s);
writer.Flush();
stream.Position = 0;
return stream;
} |
When it comes to converting text to a Stream, I don't know that your anti-sample is all that bad. You'll have to encode the text either way. I suppose you could encode it in a streaming fashion instead of all up front, but that would add a great deal of complexity, and for a string (which shouldn't be too large in the common case) the complexity may make it slower. |
@AArnott that's fair. It's possible that encoding on the go is not more optimal in all scenarios, especially on small strings as you say. |
Agree. And the extension methods are under such a common namespace |
Stream
wrappers for memory and text-based types
Since most common cases will be with utf8 strings and since we ha e utf8 literals I think another |
How does that differ from MemoryStream if you already have a byte[], or the proposed FromMemory if you have a ROM? I'm not sure how to interpret the string / char-based input, as those aren't from u8 literals and aren't utf8. |
Maybe I wasn't clear enough, I am not talking about the FromMemory methods, you are right, that it is easy to just use those after getting the bytes for a utf8 string. I am referring to the two FromText methods and I meant having a FromUtf8Text method overloads that don't require an encoding. This would than assume to encode to utf8. |
You mean so you can write: Stream s = Stream.FromUtf8Text(str); instead of: Stream s = Stream.FromText(text, Encoding.UTF8); ? |
Yeah, I understand how ridiculous this sound, since it is a very simple Encoding.UTF8 what I am asking to leave out, and this is my thinking when I read about the BinaryData Proposal, but in my experience this has helped many coming from other languages. Maybe it should be just an overload instead of it's own method and also be called FromText and just assume if encoding isn't specified it is utf8 or defaulting encoding to utf8. |
I'm not quite certain about the naming. Especially since the difference between What about |
Maybe FromBytes? I think Read has a different meaning, to me it would imply that I am reading data or a text from the stream, but the From is used in the bcl for creating new instances, so it makes sense to stay with that pattern. I think it would make sense to have an extension coming from STJ on stream that creates a stream from an object and that would than be called FromData, although FromObject sounds more like it... |
I would suggest making the If the encoding is not specified, the methods would return the most performant and trivial stream wrapper (as the title suggest) instead of a more complex transcoder: public partial class Stream
{
public static Stream FromText(string text); // a StreamReader for the result can use Encoding.Unicode
public static Stream FromText(ReadOnlyMemory<char> text); // Encoding.Unicode
public static Stream FromText(byte[] text); // Encoding.Utf8
public static Stream FromText(ReadOnlyMemory<byte> text); // Encoding.Utf8
[...] // and the originally proposed overloads with the encoding parameter
} Feel free to disagree - I also see that in the SO topic (mentioned by @jozkee) my answer is far less popular than the accepted one, even though it's the only one, which is an actual wrapper. Of course, transcoders also can be implemented without reading the whole source in advance but even then they obviously have some overhead both in performance and allocation. |
I think there's no problem of extension method |
@jozkee, note this proposed API isn't possible unless |
😍 |
@svick the returned stream from Strem.FromData(byte[]) should be capable of writing, so conveying "Read" on the name is not possible.
@koszeggy our current stance is to force users to reason about encoding, see #46663 (comment). I will also add that it is easy to relax it in the future (by adding an overload or making the param nullable as you suggest), but is hard to go the opposite way.
@stephentoub ah, I haven't thought of that. This wouldn't be a problem for extension methods. So, how do we wanna go about this? What are the trade-offs of moving ReadOnlySequence to System.Private.CoreLib? I didn't find any GH issues discussing this in the past. I do know that we have moved other types before (e.g: FileStream). |
Is there a reason not to do this? |
I am not even sure what a null Encoding would mean. if it is in case of null defaulted to Unicode, shouldn't it instead than just be defaulted to Unicode? I understand, that this is out of scope for now, I am just wondering, if it would make sense at all to have it nullable and if what the semantic of a nullable encoding would mean here. as for FWIW: |
@davidfowl only consistency I think. I interpreted @stephentoub's comment on #27156 (comment) as an ask to bring a consistent solution between |
It's not extension methods vs not, it's "on Stream" or "on a sibling type". Extension methods would be forced to be on a new type; but it could just as easily be the same proposal, but as non-extension methods on a new type, which lives in a higher ring. public class StreamFactory
{
public static Stream FromText(string text, Encoding encoding);
public static Stream FromText(ReadOnlyMemory<char> text, Encoding encoding);
public static Stream FromData(byte[] data);
public static Stream FromData(ReadOnlyMemory<byte> data);
public static Stream FromData(ReadOnlySequence<byte> data);
} (The "From" name seems wrong there, but it could be massaged...) |
@bartonjs yes, I got that, I just thought extensions still had a chance over factories. I will update the proposal shortly. |
Then what is the difference between Also, what makes |
@svick that's my bad, I didn't notice @bartonjs proposal included I'm not sure how useful @bartonjs, objections? |
There's always the question of when we stop, but we've been leaving
Guidelines say not to have both a ReadOnlyMemory and Memory overload in the same method group (unless it's an extension method). |
But in this case, Memory and ReadOnlyMemory convey different things, the former will return a Stream that is writable and readable and the latter a readable only. Can't this be an exception to the rule? |
Well, when two overloads have different semantics the guidelines suggest they shouldn't be overloads, but different names (different method groups). I had thought this was only about read-only streams. If writing is desired, then perhaps it becomes something more like public class StreamFactory
{
public static Stream CreateReadOnly(string text, Encoding encoding);
public static Stream CreateReadOnly(ReadOnlyMemory<char> text, Encoding encoding);
public static Stream CreateReadOnly(byte[] data);
public static Stream CreateReadOnly(ReadOnlyMemory<byte> data);
public static Stream CreateReadOnly(ReadOnlySequence<byte> data);
public static Stream CreateWriteOnly(Memory<char> buffer, Encoding encoding);
public static Stream CreateWriteOnly(byte[] buffer);
public static Stream CreateWriteOnly(Memory<byte> buffer);
// Would these need knobs to control the read-pos and write-pos differently?
public static Stream CreateReadWrite(Memory<char> buffer, Encoding encoding);
public static Stream CreateReadWrite(byte[] buffer);
public static Stream CreateReadWrite(Memory<byte> buffer);
} |
I don't think so, all streams just rely in one Position. If you start Reading 10 bytes and then you Write 10 bytes, you end up on Position 20. The same should happen here. |
There were two main approaches discussed in review. For the first, having The second was to see if MemoryStream can be changed (without significant penalty) to work off of Memory/ReadOnlyMemory. If it can, then we'd move The second seemed to have a bit more favor, but requires further investigation, so classifying this as needs-work. // It will be in a new assembly with a dependency on System.Memory.dll and System.Runtime.dll
namespace System.IO
{
public partial class StreamFactory
{
public static Stream FromText(string text, Encoding encoding);
public static Stream FromText(Memory<char> text, Encoding encoding);
public static Stream FromText(ReadOnlyMemory<char> text, Encoding encoding);
public static Stream FromData(Memory<byte> data);
public static Stream FromData(ReadOnlyMemory<byte> data);
public static Stream FromData(ReadOnlySequence<byte> data);
}
} |
We're moving this out to Future and it will be considered again for .NET 9. |
Background and motivation
This is meant to bundle #46663, #27156, and #22838.
There's the usual need of wrapping memory and text-based types in a Stream and given that there are multiple implementations spread across different libraries, it would be ideal to have a standardized API for it directly on .NET.
A few examples of the current options available:
CommunityToolkit.HighPerformance provides
AsStream()
extension methods for the following:Memory<byte>
ReadOnlyMemory<byte>
IMemoryOwner<byte>
IBufferWriter<byte>
Nerdbank.Streams also provides
AsStream()
extensions for the following:IDuplexPipe
PipeReader/PipeWriter
WebSocket
ReadonlySequence<byte>
IBufferWriter<byte>
While I'm not certain of providing support for all the types listed above, I think a good starting point would be the following based on the evidence shown in the issues I bundled.
API Proposal
Originally proposed by @bartonjs on #82801 (comment).
API Usage
Example 1:
Example 2:
Alternative Design 1
From #82801 (comment).
Alternative Design 2
We can think of providing explicit types, e.g:
StringStream
, instead of simply returningStream
, but I like this way more since the Stream types would be internal and could make it more flexible and maintainable for future improvements e.g: we could start returning aMemoryStream
instead of an non-visible one if MemoryStream reaches the point where it finally gets aReadOnlyMemory<byte>
ctor.Alternative Design 3 (original proposal)
Follows prior-art of libraries listed above, we could provide
AsStream()
extensions.Based on comments in this thread pointing out that such extensions don't address a common-enough scenario I've moved this proposal to the alternative section.
Risks
No response
The text was updated successfully, but these errors were encountered: