-
-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use net.Buffers in Go 1.8 for encoding. #70
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking at this! Can you also add yourself to the AUTHORS+CONTRIBUTORS files?
mem.go
Outdated
if err := e.write(e.hdrbuf); err != nil { | ||
return err | ||
} | ||
bufs := make([][]byte, 1+nsegs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this list of bufs could be reused between calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mem_18.go
Outdated
"zombiezen.com/go/capnproto2/internal/packed" | ||
) | ||
|
||
func (e *Encoder) write(bufs net.Buffers) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keep the function parameter type the same between 1.8 and non-1.8.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mem_other.go
Outdated
|
||
func (e *Encoder) write(bufs [][]byte) error { | ||
for _, b := range bufs { | ||
if e.packed { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull out the e.packed
logic into a separate (non-tagged) function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mem.go
Outdated
if err := e.write(e.hdrbuf); err != nil { | ||
return err | ||
} | ||
bufs := make([][]byte, 1+nsegs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GH is not letting me add a comment farther up... while you're visiting this code, it seems like traversing the segments twice (using segmentSizes
) isn't really necessary. Mind rolling that all into one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mem.go
Outdated
if uint64(cap(e.hdrbuf)) < hdrSize { | ||
e.hdrbuf = make([]byte, hdrSize) | ||
if int64(cap(e.bufs)) < 1+nsegs { | ||
e.bufs = make([][]byte, 1+nsegs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's let append
deal with the resizing. Instead of these two ifs, let's have:
e.sizes = e.sizes[:0]
e.bufs = append(e.bufs[:0], nil) // first element is placeholder for header
(although see below, because I don't think we need the sizes
buffer.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
@@ -571,40 +573,51 @@ func (e *Encoder) Encode(m *Message) error { | |||
if nsegs == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove the TODO above, I don't think it will ever happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
} | ||
for i := int64(0); i < nsegs; i++ { | ||
s, err := m.Segment(SegmentID(i)) | ||
if err != nil { | ||
return err | ||
} | ||
if err := e.write(s.data); err != nil { | ||
return err | ||
n := len(s.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: make this n := Size(len(s.data))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would risk an overflow that the int64(n) > int64(maxSize)
is designed to avoid. Size is a uin32
but len(x)
returns an int
which can easily be larger than the maximum value of a uint32
.
mem.go
Outdated
} | ||
e.sizes[i] = Size(n) | ||
e.bufs[1+i] = s.data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of the above comment, e.bufs = append(e.bufs, s.data)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
mem.go
Outdated
} else { | ||
e.hdrbuf = e.hdrbuf[:hdrSize] | ||
} | ||
marshalStreamHeader(e.hdrbuf, e.sizes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, I think I want to nuke marshalStreamHeader
. It's adding an allocation for not much clarity. Let's inline it here (and define a new helper, appendUint32
):
if uint64(cap(e.hdrbuf)) < hdrSize {
e.hdrbuf = make([]byte, 0, hdrSize)
}
e.hdrbuf = appendUint32(e.hdrbuf[:0], maxSeg)
for i, buf := range e.bufs {
e.hdrbuf = appendUint32(e.hdrbuf, uint32(Size(len(buf))/wordSize))
}
if len(e.hdrbuf) % int(wordSize) != 0 {
e.hdrbuf = appendUint32(e.hdrbuf, 0)
}
Interestingly, I think that the current Encoder
has a padding bug in this regard. Can you add a test where a 3-segment message is encoded then a 2-segment message is encoded, and then assert that the 2-segment message's header properly contains zero-padding (instead of garbage)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been able to reproduce that bug and have added a failing test case for it. I'll inline marshalStreamHeader
and fix the bug now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've only touched the (*Encoder).Encode
part so far and I've left the use of marshalStreamHeader
in (*Message).Marshal
. So the bug still exists in (*Message).Marshal
.
I'm not sure how you want me to handle that, but I'm thinking it would be best if (*Message).Marshal
and (*Message).MarshalPacked
were just made wrappers around (*Encoder).Encode
. That would mean the encoding logic only exists once and would reduce duplication. I think I'll go ahead and add a commit for that and you can decide whether you like it or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does the bug exist in Marshal? The issue is that the buffer gets reused and there's no buffer to reuse for Marshal.
IIRC, having different codepaths does have a big performance impact (i.e. Marshal is faster when it doesn't go through a Writer). I would rather keep the separation unless you can prove that this doesn't significantly alter performance.
I realize this is a bit bigger of a change than you probably wanted to do. Let me know if you want me to pick up this part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I backed that change out.
I think I misread you're desire to 'nuke' marshalStreamHeader
as a request for me to eliminate it. I'll leave removing marshalStreamHeader
up to you as a separate matter.
import "net" | ||
|
||
func (e *Encoder) write(bufs [][]byte) error { | ||
_, err := (*net.Buffers)(&bufs).WriteTo(e.w) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for pointer indirection, just convert to net.Buffers
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried that before but (*net.Buffers).WriteTo
is a pointer receiver.
# zombiezen.com/go/capnproto2
./mem_18.go:8: cannot call pointer method on net.Buffers(bufs)
./mem_18.go:8: cannot take the address of net.Buffers(bufs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How bizarre. Opened golang/go#19680
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a result of how net.Buffers
is implemented so it's very intentional. net.Buffers
needs to work with multiple calls to Read
and WriteTo
(for things like io.Copy
), to do this it needs to be able to modify the slice as it consumes more data - see src/net/net.go.
marshalStreamHeader fails to zero the padding bytes after the last segment size. This is a problem because e.hdrbuf is reused between calls to Encode. This leaves garbage in the header in certain circumstances. This will be fixed in a future commit. (See the discussion of #70 for more details of the bug).
This also fixes TestStreamHeaderPadding introduced in 3f1f84d. This still leaves marshalStreamHeader to be removed from (*Message).Marshal.
This removes all uses of marshalStreamHeader and (*Message).segmentSizes, both of which are removed.
mem_test.go
Outdated
@@ -708,6 +708,47 @@ func TestDecoder_MaxMessageSize(t *testing.T) { | |||
} | |||
} | |||
|
|||
func TestStreamHeaderPadding(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment noting that this is a regression check for header buffer padding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
This pull request covers issue #68.
On go1.7 and below it behaves as before. On go1.8 with the packed encoder it also behaves as before. On go1.8 with the regular encoder it uses
net.Buffers
which will usewritev
in a single system call where possible.It adds go1.8 to travis to test the new code path.