proto: add streaming APIs for Unmarshal/Marshal #507
Comments
Hi,
It is certainly possible to make the In order to achieve better performance, |
@dsnet: I'm curious about your reasoning here - if I understand you correctly, the performance you're talking about concerns small (kilobytes) protobuf messages where the number of I'm thinking that for applications that handle larger protobuf messages, API support for If we could stream data from the file in smaller chunks, we would only need at most Perhaps large protobuf messages on file aren't a very common use case, and that's why this API doesn't exist/isn't asked for? Another option is that I have misunderstood something, and that's why it doesn't exist 🙂 |
And right after I wrote the above I found https://developers.google.com/protocol-buffers/docs/techniques (read the section "Large data sets"), which states:
I guess |
An unserialized proto message uses memory proportional to the encoded wire data. Thus, even if the input was an
Let's suppose you have a stream of proto messages of type
message FooList {
	repeated Foo foos = 1; // field name "foos" is illustrative
}
When encoding, this would be written as:
var ms []*foopb.Foo = ...
var b bytes.Buffer // or some other io.Writer
for _, m := range ms {
	b.WriteByte(1<<3 | 2) // write the tag: field 1, wire type 2 (length-delimited)
	b2, err := proto.Marshal(m) // Marshal returns the encoded bytes and an error
	if err != nil {
		return err
	}
	var b3 [binary.MaxVarintLen64]byte
	b.Write(b3[:binary.PutUvarint(b3[:], uint64(len(b2)))]) // write the length prefix
	b.Write(b2) // write the message data itself
}
Encoding in this form has the benefit that See https://developers.google.com/protocol-buffers/docs/encoding. |
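For the reading side, here is a minimal sketch of how a stream written in that tag + varint length + payload form could be consumed one record at a time from an io.Reader, so that only a single message needs to be buffered at once. It uses only the standard library. The names writeFrame, readFrames, encodeAll, decodeAll and the maxMessageSize cap are hypothetical helpers for illustration, not part of the proto package; in real use the handler would call proto.Unmarshal on each payload.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// fooTag is the tag byte for field 1 with wire type 2 (length-delimited).
const fooTag = 1<<3 | 2

// maxMessageSize is an assumed safety cap so that a corrupt length prefix
// cannot trigger a huge allocation.
const maxMessageSize = 64 << 20

// writeFrame writes one tag + varint length + payload record to w.
func writeFrame(w io.Writer, payload []byte) error {
	var hdr [1 + binary.MaxVarintLen64]byte
	hdr[0] = fooTag
	n := 1 + binary.PutUvarint(hdr[1:], uint64(len(payload)))
	if _, err := w.Write(hdr[:n]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// noEOF turns io.EOF into io.ErrUnexpectedEOF for reads inside a record.
func noEOF(err error) error {
	if err == io.EOF {
		return io.ErrUnexpectedEOF
	}
	return err
}

// readFrames reads records until EOF and passes each payload to handle.
// The payload slice is reused and is only valid until handle returns.
func readFrames(r io.Reader, handle func(payload []byte) error) error {
	br := bufio.NewReader(r)
	var buf []byte
	for {
		tag, err := br.ReadByte()
		if err == io.EOF {
			return nil // clean end of stream between records
		}
		if err != nil {
			return err
		}
		if tag != fooTag {
			return fmt.Errorf("unexpected tag byte %#x", tag)
		}
		size, err := binary.ReadUvarint(br)
		if err != nil {
			return noEOF(err)
		}
		if size > maxMessageSize {
			return errors.New("message exceeds size limit")
		}
		if uint64(cap(buf)) < size {
			buf = make([]byte, size)
		}
		buf = buf[:size]
		if _, err := io.ReadFull(br, buf); err != nil {
			return noEOF(err)
		}
		if err := handle(buf); err != nil { // e.g. proto.Unmarshal(buf, msg)
			return err
		}
	}
}

// encodeAll frames each string payload into one byte stream.
func encodeAll(payloads ...string) []byte {
	var b bytes.Buffer
	for _, p := range payloads {
		writeFrame(&b, []byte(p)) // writes to a bytes.Buffer do not fail
	}
	return b.Bytes()
}

// decodeAll collects every payload in data as a string (copying each one).
func decodeAll(data []byte) ([]string, error) {
	var out []string
	err := readFrames(bytes.NewReader(data), func(p []byte) error {
		out = append(out, string(p))
		return nil
	})
	return out, err
}

func main() {
	msgs, err := decodeAll(encodeAll("first", "", "third"))
	fmt.Println(len(msgs), msgs, err)
}
```

The payloads here are plain strings standing in for marshaled messages. Note that the decoded messages still occupy memory; the sketch only avoids holding the entire encoded stream in one buffer.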
@dsnet: Thanks for taking the time to answer! Regarding memory allocation, I understand protobuf needs to read everything into memory (after all, it deserializes into structs). I was only reasoning about the intermediate memory used when deserializing a huge message - we first need to read the full message into
This would mean we wouldn't need to first allocate a Thanks for the example regarding |
what about reading a stream and knowing when to stop for the unmarshalling to work? |
The protobuf binary encoding doesn't include any end-of-record marker, so any framing needs to be added externally. That is, there's no way for the proto decoder to know when to stop. You need to tell it. |
Alright, that's what I figured. Does gRPC prepend a length field or something? |
I'm not personally familiar with the gRPC protocol, but probably. |
indeed. Thanks! |
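To make the idea of an external length prefix concrete: gRPC's HTTP/2 wire format puts a 5-byte header in front of each message, namely a 1-byte compressed flag followed by a 4-byte big-endian length. The following is a rough standard-library sketch of that style of framing, not the gRPC implementation itself. The names writeMessage, readMessage, frame and unframe are hypothetical, and compression is simply rejected.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
	"math"
)

// headerLen is 1 byte for the compressed flag plus 4 bytes for the length.
const headerLen = 5

// writeMessage writes one uncompressed, length-prefixed message to w.
func writeMessage(w io.Writer, payload []byte) error {
	if uint64(len(payload)) > math.MaxUint32 {
		return errors.New("payload too large for a 4-byte length")
	}
	var hdr [headerLen]byte
	hdr[0] = 0 // compressed flag: 0 means not compressed
	binary.BigEndian.PutUint32(hdr[1:], uint32(len(payload)))
	if _, err := w.Write(hdr[:]); err != nil {
		return err
	}
	_, err := w.Write(payload)
	return err
}

// readMessage reads one message from r. It returns io.EOF only when the
// stream ends cleanly before a new header starts.
func readMessage(r io.Reader, maxSize uint32) ([]byte, error) {
	var hdr [headerLen]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	if hdr[0] != 0 {
		return nil, errors.New("compressed messages are not handled in this sketch")
	}
	size := binary.BigEndian.Uint32(hdr[1:])
	if size > maxSize {
		return nil, fmt.Errorf("message of %d bytes exceeds limit of %d", size, maxSize)
	}
	payload := make([]byte, size)
	if _, err := io.ReadFull(r, payload); err != nil {
		if err == io.EOF {
			err = io.ErrUnexpectedEOF // header promised more bytes
		}
		return nil, err
	}
	return payload, nil // this is what would be handed to proto.Unmarshal
}

// frame returns payload wrapped in a single frame.
func frame(payload []byte) []byte {
	var b bytes.Buffer
	writeMessage(&b, payload) // cannot fail for small payloads on a bytes.Buffer
	return b.Bytes()
}

// unframe reads a single frame back out of data.
func unframe(data []byte, maxSize uint32) ([]byte, error) {
	return readMessage(bytes.NewReader(data), maxSize)
}

func main() {
	f := frame([]byte("hi"))
	payload, err := unframe(f, 1024)
	fmt.Println(len(f), string(payload), err)
}
```

Because the length is known before the payload is read, the reader can allocate exactly once per message and knows precisely where each message stops.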
I think this should be reopened. It's a fairly straightforward use-case and would improve usability substantially. |
See #912. This is a thing I'd like to do, but getting the design right is tricky and it's a substantial amount of work. Edit: is tricky, getting the design right is tricky. |
Ah, yeah, if it's still under consideration that's good enough for me. This
was just the bug that showed up from a search.
…On Mon, Mar 16, 2020, 13:51 Damien Neil ***@***.***> wrote:
I think this should be reopened. It's a fairly straightforward use-case
and would improve usability substantially.
See #912.
This is a thing I'd like to do, but getting the design right isn't tricky
and it's a substantial amount of work.
|
Hi, I'm not a Go expert, so please excuse me if there is another way to accomplish my objective.
I'm mainly looking for a way to use the Unmarshal method with an io.Reader as input, specifically the io.ReadCloser wrapped in the Response type of the net/http library.
My problem is that the current Unmarshal method of the proto library only accepts bytes, so I have to use ioutil.ReadAll(...) to get them, which allocates unnecessary memory; this happens because there is no way to know the body content length for sure. Even if I get the content length from net/http, the API warns me about a potential runtime error.
I already know that there are HTTP libraries based on slices that avoid the memory allocation, like fasthttp, but they don't implement the net/http standard, so I can't use them.
Is there a way for the proto library to implement Unmarshal/Marshal methods that take an io.Reader instead of bytes and have the same, or better, performance than the encoding/json or encoding/xml Unmarshal functions?
Thanks for your patience and time.
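As a possible workaround with the current bytes-only API, here is a minimal sketch that reads a body into one buffer while using the reported content length only as a sizing hint. In net/http, Response.ContentLength is -1 when the length is unknown, so the hint is never trusted as exact. The names readAllHint and readString and the maxSize parameter are hypothetical; the resulting slice is what would be passed to proto.Unmarshal.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// readAllHint reads r to EOF into a single buffer.
// sizeHint would typically be http.Response.ContentLength (-1 if unknown);
// it only pre-sizes the buffer. maxSize bounds how much is read in total.
func readAllHint(r io.Reader, sizeHint, maxSize int64) ([]byte, error) {
	var buf bytes.Buffer
	if sizeHint > 0 && sizeHint <= maxSize {
		// The extra bytes.MinRead leaves room for the final read that
		// observes EOF, so a correct hint needs no second allocation.
		buf.Grow(int(sizeHint) + bytes.MinRead)
	}
	// Read one byte past the limit so oversized bodies can be detected.
	n, err := buf.ReadFrom(io.LimitReader(r, maxSize+1))
	if err != nil {
		return nil, err
	}
	if n > maxSize {
		return nil, fmt.Errorf("body exceeds limit of %d bytes", maxSize)
	}
	return buf.Bytes(), nil
}

// readString is a small helper that runs readAllHint over an in-memory body.
func readString(body string, sizeHint, maxSize int64) (string, error) {
	data, err := readAllHint(strings.NewReader(body), sizeHint, maxSize)
	return string(data), err
}

func main() {
	// With a real response this would be:
	//   data, err := readAllHint(resp.Body, resp.ContentLength, limit)
	// followed by proto.Unmarshal(data, msg).
	s, err := readString("serialized message bytes", 24, 1<<20)
	fmt.Println(len(s), err)
}
```

This does not remove the need to hold the full encoded message in memory; it only avoids the repeated buffer growth that happens when the final size is unknown up front.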