Less write lock contention, better read timeout handling #97
for n := range ch {
	fmt.Println("received")
	if n != prevN+1 {
		panic("bad order")
Consider using testing.T methods to fail from this goroutine instead of panicking.
Technically, those shouldn't be used in a goroutine that isn't running the test; panic was good enough.
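For context on the exchange above: the testing package documents that FailNow, Fatal, and the Skip variants must be called only from the goroutine running the test function, while the Error and Log variants may be called from other goroutines. One alternative to panicking is to have the checking goroutine hand its verdict back over a channel and let the test goroutine decide how to fail. The sketch below is illustrative only; `checkOrder` and `runCheck` are hypothetical names, not code from this PR.

```go
package main

import "fmt"

// checkOrder drains ch and remembers the first value that is not exactly one
// greater than its predecessor. It keeps draining after a failure so that the
// sender never blocks on an unread channel.
func checkOrder(ch <-chan int, start int) error {
	var firstErr error
	prev := start
	for n := range ch {
		if n != prev+1 && firstErr == nil {
			firstErr = fmt.Errorf("bad order: got %d after %d", n, prev)
		}
		prev = n
	}
	return firstErr
}

// runCheck (hypothetical helper) runs checkOrder in its own goroutine and
// returns the verdict to the caller through a buffered channel.
func runCheck(values []int) error {
	ch := make(chan int)
	errCh := make(chan error, 1)
	go func() { errCh <- checkOrder(ch, 0) }()
	for _, v := range values {
		ch <- v
	}
	close(ch)
	return <-errCh
}

func main() {
	fmt.Println(runCheck([]int{1, 2, 3}))
	fmt.Println(runCheck([]int{1, 3, 4}))
}
```

In a real test, the test goroutine would then do something like `if err := runCheck(vals); err != nil { t.Fatal(err) }`, which keeps the Fatal call on the goroutine that is allowed to make it.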
// json.NewDecoder(r).Decode would read the whole frame as well, so might as well do it
// with ReadAll which should be much faster
// use a autoResetReader in case the read takes a long time
buf, err := io.ReadAll(c.autoResetReader(r)) // todo buffer pool
Ready to merge without buffer pool?
Yeah, that's not really the point of the PR, and can be done separately
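As a rough idea of what the deferred buffer-pool follow-up could look like, here is a minimal sketch using `sync.Pool` with `bytes.Buffer`. It is an assumption about one possible approach, not the change planned for this repository; `bufPool`, `decodeFrame`, `decodeFrameString`, and the `frame` type are hypothetical names introduced for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
	"sync"
)

// bufPool recycles buffers between frames so that each read does not have to
// grow a fresh slice from scratch.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// frame is a stand-in for a decoded message (hypothetical shape).
type frame struct {
	ID     int    `json:"id"`
	Method string `json:"method"`
}

// decodeFrame reads the whole frame from r into a pooled buffer and decodes
// it into v. The buffer goes back to the pool only after Unmarshal returns,
// so the pooled bytes are never referenced once they can be reused.
func decodeFrame(r io.Reader, v interface{}) error {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()
	defer bufPool.Put(buf)

	if _, err := buf.ReadFrom(r); err != nil {
		return err
	}
	return json.Unmarshal(buf.Bytes(), v)
}

// decodeFrameString is a small convenience wrapper for trying the sketch out.
func decodeFrameString(s string) (frame, error) {
	var f frame
	err := decodeFrame(strings.NewReader(s), &f)
	return f, err
}

func main() {
	f, err := decodeFrameString(`{"id": 7, "method": "Filecoin.ChainHead"}`)
	fmt.Println(f, err)
}
```

The main design constraint is ownership: anything that keeps a reference to `buf.Bytes()` past the `Put` would see its data overwritten by a later frame, so decoding has to finish (or copy) before the buffer is returned.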
}

// got the whole frame, can start reading the next one in background
go c.nextMessage()
The fact that we are starting goroutines only to kill them and start again bothers me a bit, but it shouldn't matter apart from making goroutine numbers annoying to deal with.
func (r *deadlineResetReader) Read(p []byte) (n int, err error) {
	n, err = r.r.Read(p)
	if time.Since(r.lastReset) > onReadDeadlineResetInterval {
		log.Warnw("slow/large read, resetting deadline while reading the frame", "since", time.Since(r.lastReset), "n", n, "err", err, "p", len(p))
Should this be Warn? It isn't actionable by the user and will happen during any request which requires more than 5s to send.
Arguably whenever any RPC takes that long to transfer, something is broken - so the user should either open an issue, or investigate their networking.
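To make the behaviour under discussion concrete, here is a self-contained sketch of a reader that pushes the connection's read deadline forward whenever a single frame has been reading for longer than some interval. It only mirrors the shape of the excerpt above and is not the PR's implementation: `resetReader`, `deadlineSetter`, `fakeConn`, `simulate`, the injected `now` clock, and the 30s timeout are all assumptions made so the example can run without a real connection. `SetReadDeadline(time.Time) error` is the method `net.Conn` exposes.

```go
package main

import (
	"fmt"
	"io"
	"strings"
	"time"
)

// deadlineSetter is the small part of a connection this sketch needs.
type deadlineSetter interface {
	SetReadDeadline(t time.Time) error
}

// resetReader wraps r and extends the read deadline when reading one frame
// takes longer than interval.
type resetReader struct {
	r         io.Reader
	conn      deadlineSetter
	interval  time.Duration
	timeout   time.Duration
	now       func() time.Time // injected clock, so the sketch is testable
	lastReset time.Time
}

func (d *resetReader) Read(p []byte) (int, error) {
	n, err := d.r.Read(p)
	if t := d.now(); t.Sub(d.lastReset) > d.interval {
		// This is the point where the excerpt above logs its warning.
		d.lastReset = t
		if derr := d.conn.SetReadDeadline(t.Add(d.timeout)); derr != nil && err == nil {
			err = derr
		}
	}
	return n, err
}

// fakeConn counts deadline resets instead of touching a socket.
type fakeConn struct{ resets int }

func (f *fakeConn) SetReadDeadline(time.Time) error { f.resets++; return nil }

// simulate performs `reads` one-byte reads against a fake clock that advances
// stepSec seconds per read, and reports how many deadline resets happened.
func simulate(reads, stepSec, intervalSec int) int {
	clock := time.Unix(0, 0)
	conn := &fakeConn{}
	rr := &resetReader{
		r:        strings.NewReader(strings.Repeat("x", reads)),
		conn:     conn,
		interval: time.Duration(intervalSec) * time.Second,
		timeout:  30 * time.Second,
		now: func() time.Time {
			clock = clock.Add(time.Duration(stepSec) * time.Second)
			return clock
		},
		lastReset: clock,
	}
	buf := make([]byte, 1)
	for i := 0; i < reads; i++ {
		if _, err := rr.Read(buf); err != nil {
			break
		}
	}
	return conn.resets
}

func main() {
	fmt.Println("resets:", simulate(6, 2, 5))
}
```

With a 5-second interval, any transfer that takes longer than that triggers at least one reset, which is the case the log-level question above is about.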
Locking looks correct.
This PR aims to improve/fix filecoin-project/lotus#8362