Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use net.Buffers in Go 1.8 for encoding. #70

Merged
merged 14 commits into from
Mar 29, 2017
Merged

Use net.Buffers in Go 1.8 for encoding. #70

merged 14 commits into from
Mar 29, 2017

Conversation

tmthrgd
Copy link
Contributor

@tmthrgd tmthrgd commented Mar 17, 2017

This pull request covers issue #68.

On go1.7 and below it behaves as before. On go1.8 with the packed encoder it also behaves as before. On go1.8 with the regular encoder it uses net.Buffers which will use writev in a single system call where possible.

It adds go1.8 to travis to test the new code path.

Copy link
Contributor

@zombiezen zombiezen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this! Can you also add yourself to the AUTHORS+CONTRIBUTORS files?

mem.go Outdated
if err := e.write(e.hdrbuf); err != nil {
return err
}
bufs := make([][]byte, 1+nsegs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this list of bufs could be reused between calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mem_18.go Outdated
"zombiezen.com/go/capnproto2/internal/packed"
)

func (e *Encoder) write(bufs net.Buffers) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep the function parameter type the same between 1.8 and non-1.8.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mem_other.go Outdated

func (e *Encoder) write(bufs [][]byte) error {
for _, b := range bufs {
if e.packed {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull out the e.packed logic into a separate (non-tagged) function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mem.go Outdated
if err := e.write(e.hdrbuf); err != nil {
return err
}
bufs := make([][]byte, 1+nsegs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH is not letting me add a comment farther up... while you're visiting this code, it seems like traversing the segments twice (using segmentSizes) isn't really necessary. Mind rolling that all into one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mem.go Outdated
if uint64(cap(e.hdrbuf)) < hdrSize {
e.hdrbuf = make([]byte, hdrSize)
if int64(cap(e.bufs)) < 1+nsegs {
e.bufs = make([][]byte, 1+nsegs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's let append deal with the resizing. Instead of these two ifs, let's have:

e.sizes = e.sizes[:0]
e.bufs = append(e.bufs[:0], nil) // first element is placeholder for header

(although see below, because I don't think we need the sizes buffer.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -571,40 +573,51 @@ func (e *Encoder) Encode(m *Message) error {
if nsegs == 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can remove the TODO above, I don't think it will ever happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

}
for i := int64(0); i < nsegs; i++ {
s, err := m.Segment(SegmentID(i))
if err != nil {
return err
}
if err := e.write(s.data); err != nil {
return err
n := len(s.data)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: make this n := Size(len(s.data))

Copy link
Contributor Author

@tmthrgd tmthrgd Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would risk an overflow that the int64(n) > int64(maxSize) is designed to avoid. Size is a uin32 but len(x) returns an int which can easily be larger than the maximum value of a uint32.

mem.go Outdated
}
e.sizes[i] = Size(n)
e.bufs[1+i] = s.data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As part of the above comment, e.bufs = append(e.bufs, s.data)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

mem.go Outdated
} else {
e.hdrbuf = e.hdrbuf[:hdrSize]
}
marshalStreamHeader(e.hdrbuf, e.sizes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I think I want to nuke marshalStreamHeader. It's adding an allocation for not much clarity. Let's inline it here (and define a new helper, appendUint32):

if uint64(cap(e.hdrbuf)) < hdrSize {
  e.hdrbuf = make([]byte, 0, hdrSize)
}
e.hdrbuf = appendUint32(e.hdrbuf[:0], maxSeg)
for i, buf := range e.bufs {
 e.hdrbuf = appendUint32(e.hdrbuf, uint32(Size(len(buf))/wordSize))
}
if len(e.hdrbuf) % int(wordSize) != 0 {
  e.hdrbuf = appendUint32(e.hdrbuf, 0)
}

Interestingly, I think that the current Encoder has a padding bug in this regard. Can you add a test where a 3-segment message is encoded then a 2-segment message is encoded, and then assert that the 2-segment message's header properly contains zero-padding (instead of garbage)?

Copy link
Contributor Author

@tmthrgd tmthrgd Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been able to reproduce that bug and have added a failing test case for it. I'll inline marshalStreamHeader and fix the bug now.

Copy link
Contributor Author

@tmthrgd tmthrgd Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only touched the (*Encoder).Encode part so far and I've left the use of marshalStreamHeader in (*Message).Marshal. So the bug still exists in (*Message).Marshal.

I'm not sure how you want me to handle that, but I'm thinking it would be best if (*Message).Marshal and (*Message).MarshalPacked were just made wrappers around (*Encoder).Encode. That would mean the encoding logic only exists once and would reduce duplication. I think I'll go ahead and add a commit for that and you can decide whether you like it or not.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the bug exist in Marshal? The issue is that the buffer gets reused and there's no buffer to reuse for Marshal.

IIRC, having different codepaths does have a big performance impact (i.e. Marshal is faster when it doesn't go through a Writer). I would rather keep the separation unless you can prove that this doesn't significantly alter performance.

I realize this is a bit bigger of a change than you probably wanted to do. Let me know if you want me to pick up this part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I backed that change out.

I think I misread you're desire to 'nuke' marshalStreamHeader as a request for me to eliminate it. I'll leave removing marshalStreamHeader up to you as a separate matter.

import "net"

func (e *Encoder) write(bufs [][]byte) error {
_, err := (*net.Buffers)(&bufs).WriteTo(e.w)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for pointer indirection, just convert to net.Buffers.

Copy link
Contributor Author

@tmthrgd tmthrgd Mar 20, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that before but (*net.Buffers).WriteTo is a pointer receiver.

# zombiezen.com/go/capnproto2
./mem_18.go:8: cannot call pointer method on net.Buffers(bufs)
./mem_18.go:8: cannot take the address of net.Buffers(bufs)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How bizarre. Opened golang/go#19680

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a result of how net.Buffers is implemented so it's very intentional. net.Buffers needs to work with multiple calls to Read and WriteTo (for things like io.Copy), to do this it needs to be able to modify the slice as it consumes more data - see src/net/net.go.

marshalStreamHeader fails to zero the padding bytes after
the last segment size. This is a problem because e.hdrbuf
is reused between calls to Encode. This leaves garbage in
the header in certain circumstances.

This will be fixed in a future commit.

(See the discussion of #70 for more details of the bug).
This also fixes TestStreamHeaderPadding introduced in 3f1f84d.

This still leaves marshalStreamHeader to be removed from
(*Message).Marshal.
This removes all uses of marshalStreamHeader and
(*Message).segmentSizes, both of which are removed.
mem_test.go Outdated
@@ -708,6 +708,47 @@ func TestDecoder_MaxMessageSize(t *testing.T) {
}
}

func TestStreamHeaderPadding(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment noting that this is a regression check for header buffer padding.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@zombiezen zombiezen merged commit 69ac692 into capnproto:master Mar 29, 2017
@tmthrgd tmthrgd deleted the net.Buffers branch March 29, 2017 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants