Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leave overflow logic to go-msgio #1910

Closed
wants to merge 1 commit into from

Conversation

tilgovi
Copy link

@tilgovi tilgovi commented Oct 29, 2015

It seems as though go-msgio now implements the message length limit
(currently 8MB), so the logic in secio reader can be simplified.

This is my first go code. I could be wildly wrong about many things.

I tried to run make test and got failures, but there are no tests for this module, it seems, so I don't know if these are failures that I've caused.

@GitCop
Copy link

GitCop commented Oct 29, 2015

There were the following issues with your Pull Request

  • Commit: e698aa2
    • Invalid signoff. Commit message must end with
      License: MIT
      Signed-off-by: .* <.*>

Guidelines and a script are available to help. Your feedback on GitCop is welcome on this issue.


This message was auto-generated by https://gitcop.com

@jbenet jbenet added the backlog label Oct 29, 2015
@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

I'm able to run the tests locally. Things are failing. I'll see what's up.

@jbenet
Copy link
Member

jbenet commented Oct 29, 2015

I believe we still need to have a buffer here, for the ETM behavior to work correctly. (also will want to be able to tune it here, separately from having to change go-msgio)

@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

There still is a buffer here and there can be tuning added, but right now what's here is just redundant. The exact same check and error is thrown by go-msgio.

@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

Unless I misunderstand, but it looks like one message from the msgio is one payload+mac, and the buffer here is just because the reader might request less than the full payload and, regardless of the chunking and chunk digests, it seems like the reader is meant to receive a continuous stream.

@tilgovi tilgovi force-pushed the cleanup-secio-reader-overflow branch from e698aa2 to d6474c3 Compare October 29, 2015 22:35
@GitCop
Copy link

GitCop commented Oct 29, 2015

There were the following issues with your Pull Request

  • Commit: d6474c3
    • Invalid signoff. Commit message must end with
      License: MIT
      Signed-off-by: .* <.*>

Guidelines and a script are available to help. Your feedback on GitCop is welcome on this issue.


This message was auto-generated by https://gitcop.com

@tilgovi tilgovi force-pushed the cleanup-secio-reader-overflow branch from d6474c3 to 1c7e286 Compare October 29, 2015 22:38
func (r *etmReader) Read(buf []byte) (int, error) {
r.Lock()
defer r.Unlock()

// first, check if we have anything in the buffer
copied := r.drainBuf(buf)
buf = buf[copied:]
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was meant to deal with the case where buf is smaller than r.buf but I think it should read buf = buf[:copied] (all the bytes up to copied, not all the bytes after).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no i believe this is right as is -- (this seems to be way more complicated than it should be, though).

  • buf here is the user passed in buf which will receive the data
  • r.drainBuf(buf) will copy into buf all the stuff in r.buf that fits into buf.
  • it also updates r.buf to be r.buf[n:] (i.e. already copied buf[0:n] into buf)
  • copied is the number of bytes copied into buf. this may be a number smaller than len(buf), and if so, we want to set buf to be buf[copied:] so that further copies into buf only fill up the remaining (unused) bytes.
  • below, we copy more into buf, and in that case we only write into the original buf[copied:], because that's what hasn't been written to.

all of this is useless anyway because exit if copied > 0, which seems to have been added later, as a short circuit for simplicity.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, wow this code. @jbenet needs to write cleaner crap.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for the explanation. This is me being a go newbie, despite being an experience C programmer. I was thinking that somehow buf was manipulating a reference and buf = buf[copied:] was changing the result that the caller sees.

@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

All fixed. Even if you don't want to remove this overflow logic, please see my comment in the changes: I think there is a bug.

@tilgovi tilgovi force-pushed the cleanup-secio-reader-overflow branch from 1c7e286 to 6c95889 Compare October 29, 2015 23:22
@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

Well, this is ready to be either closed or merged, I think. Your call :).

@jbenet
Copy link
Member

jbenet commented Oct 29, 2015

I really like the simplicity, and this seems right to me. (tests seem to pass too, so should be right).

the only thing i wonder about is whether the buffer should be held here or left to go-msgio. msgio may change again, not clear. and we may want to make this buffer configurable, not sure.

@whyrusleeping is going to be a much better person to CR this than me.

@jbenet
Copy link
Member

jbenet commented Oct 29, 2015

Oh, now i recall. the thing about using go-msgio Read() instead of ReadMsg() is that with Read() we keep one buffer allocated at all times (the etmReader's), per encrypted channel. instead, with ReadMsg() we allocate one buffer per new message (minus the pool thing, but im not convinced that thing works very well). the gc pressure is larger and thus may be lower perf. (right @whyrusleeping ?) and this is one of the few places where that actually matters

@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

Oh, I see. Since the caller is passing in a buffer, we can ask msgio to fill it with the io.ReadFull call. Otherwise, r.msg.ReadMsg() is going to allocate a buffer internally.

@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

The msgio.reader uses a buffer pool, though. :-/

@tilgovi
Copy link
Author

tilgovi commented Oct 29, 2015

Although, I never release the buffer back to the pool.

@jbenet if there's still desire to make all of this more streaming somehow, I'm happy to hack at that instead.

@whyrusleeping
Copy link
Member

wow, so i've never looked at this etm code before.

At first glance, this looks like the right idea to me. although one thing to note: you will need to keep track of the original buffer handed to you from msgio, and call r.ReleaseMsg on it when its empty, memory pressure here is important (and we apparently havent been releasing the messages here at all 👎 )

I'll play with this a little more in the morning after you make the ReleaseMsg change.

@jbenet
Copy link
Member

jbenet commented Oct 30, 2015

We haven't needed to release the messages because we used Read with a single buffer, no more allocations.


Sent from Mailbox

On Fri, Oct 30, 2015 at 3:49 AM, Jeromy Johnson notifications@github.com
wrote:

wow, so i've never looked at this etm code before.
At first glance, this looks like the right idea to me. although one thing to note: you will need to keep track of the original buffer handed to you from msgio, and call r.ReleaseMsg on it when its empty, memory pressure here is important (and we apparently havent been releasing the messages here at all 👎 )

I'll play with this a little more in the morning after you make the ReleaseMsg change.

Reply to this email directly or view it on GitHub:
#1910 (comment)

@whyrusleeping
Copy link
Member

@jbenet oooooh, right.

@tilgovi
Copy link
Author

tilgovi commented Oct 30, 2015

Thanks, you two. Closing this. Good first exercise, though :).

@tilgovi tilgovi closed this Oct 30, 2015
@jbenet jbenet removed the backlog label Oct 30, 2015
@tilgovi tilgovi deleted the cleanup-secio-reader-overflow branch October 30, 2015 17:54
@whyrusleeping
Copy link
Member

@tilgovi i think theres still something there worth pursuing in this PR. your PR would lower memory pressure (as long as you added the release call), and simplify this part of the code quite a bit.

@tilgovi tilgovi restored the cleanup-secio-reader-overflow branch October 30, 2015 18:06
@tilgovi tilgovi reopened this Oct 30, 2015
@jbenet jbenet added the backlog label Oct 30, 2015
@tilgovi
Copy link
Author

tilgovi commented Oct 30, 2015

Re-opened. Definitely agree it simplifies, but I don't see how it alleviates any memory pressure. As @jbenet pointed out, the user is passing in a buffer here. Although, in the case that the user buffer is too small for the message, we're allocating a new buffer here anyway. That would be better to do in msgio since it's got a buffer pool.

If you think it's helpful, I'll add the release back to the pool. Anything else you would like me to change?

@jbenet
Copy link
Member

jbenet commented Oct 30, 2015

@whyrusleeping i believe the old way is much better for memory. It's a larger constant allocation, but it doesn't cause a bunch of sequential allocs of varying size which will undoubtedly lead to fragmentation. (worse mem and cpu utilization).

Anyway, the simplification and cleanup is worth pursuing

@tilgovi
Copy link
Author

tilgovi commented Oct 30, 2015

I'll just let it bake in my head over the weekend and see if there's anything I think is worthwhile on Monday.

@whyrusleeping
Copy link
Member

@jbenet the memory pools are rather efficient. they work quite well

@tilgovi tilgovi force-pushed the cleanup-secio-reader-overflow branch 3 times, most recently from 51808c7 to 42697e6 Compare November 2, 2015 23:31
@tilgovi
Copy link
Author

tilgovi commented Nov 3, 2015

I've rewritten this but I'm sure I have a bug somewhere because the tests are failing.

The current implementation aims to read directly into the output buffer when it is sufficiently large and to let msgio allocate from its pool otherwise, draining from and releasing that when it's been used up.

This should be the best of both worlds: zero-copy if the output buffer is large enough and buffer pool otherwise.

@tilgovi
Copy link
Author

tilgovi commented Nov 3, 2015

I'll let you know when I work out the kinks if no one spots it before I do.

@tilgovi tilgovi force-pushed the cleanup-secio-reader-overflow branch from 42697e6 to f9684c6 Compare November 4, 2015 21:58
@tilgovi
Copy link
Author

tilgovi commented Nov 4, 2015

There you go.

@jbenet
Copy link
Member

jbenet commented Nov 10, 2015

this LGTM. thanks @tilgovi

@whyrusleeping CR?

@ghost ghost added topic/encryption Topic encryption RFCR labels Dec 22, 2015
@daviddias daviddias removed the backlog label Jan 2, 2016
@whyrusleeping
Copy link
Member

@tilgovi sorry for taking so long on this one, could you rebase it on latest master?

go-msgio implements the message length limit (currently 8MB), so the
secio reader can get away without re-implementing this logic.

Instead, the reader lets go-msgio allocate a buffer when the output
buffer is too small to hold the incoming message. This buffer is kept
and drained into the output buffer(s) of Read() calls and released back
to the msgio instance once it is fully drained.

License: MIT
Signed-off-by: Randall Leeds <randall@bleeds.info>
@tilgovi tilgovi force-pushed the cleanup-secio-reader-overflow branch from f9684c6 to 6eedb6c Compare January 29, 2016 17:52
@tilgovi
Copy link
Author

tilgovi commented Jan 29, 2016

Done. Not all the CI passed, but I can't see how it's related to the change and they may be spurious. I don't know the state of the test suite.

@whyrusleeping
Copy link
Member

Yeah, the test failures arent related. The one problem we have now right now is #2256 all this code is being moved out of the repo into libp2p. I can handle moving this commit over there if you'd like

@tilgovi
Copy link
Author

tilgovi commented Jan 30, 2016

Moved to libp2p/go-libp2p#13

@tilgovi tilgovi closed this Jan 30, 2016
@tilgovi tilgovi deleted the cleanup-secio-reader-overflow branch January 30, 2016 01:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/encryption Topic encryption
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants