
swarm/network: measure time of messages in priority queue #19250

Merged
18 commits merged on Mar 20, 2019

Conversation

Member

@nonsense nonsense commented Mar 11, 2019

This PR is:

  1. Updating our vendored OpenTracing Go library.
  2. Adding an inputSeed flag to the smoke tests, so that we can reproduce an exact upload to a given Swarm deployment.
  3. Executing trackChunks both on successful and on failed smoke test runs, so that we can compare between them.
  4. Reducing the amount of Debug and Trace logs in some packages.
  5. Adding a timeout to the InfluxDB HTTP client, so that we don't wait indefinitely in case of a blocking InfluxDB metrics report call.
  6. Adding a timer around the f() (generally a p2p.Send() call) in our priority queue.
  7. Making the stream API private.

@nonsense nonsense added this to the 1.9.0 milestone Mar 11, 2019
@nonsense nonsense requested review from janos and skylenet March 11, 2019 13:23
@@ -154,7 +154,7 @@ func (p *Peer) Deliver(ctx context.Context, chunk storage.Chunk, priority uint8,
 	}

 	ctx = context.WithValue(ctx, "stream_send_tag", nil)
-	return p.SendPriority(ctx, msg, priority)
+	return p.Send(ctx, msg)
Contributor


Why change it here?
Either leave it, or change it everywhere by temporarily redefining sendPriority.

Member Author

@nonsense nonsense Mar 12, 2019


I changed it so that we can use tracing to debug this problem. SendPriority doesn't support traces.

Furthermore, I posted measurements of the priority queue in the chat yesterday. They show that messages can stay in the priority queue for tens of seconds, which is really bad:

[Screenshot, 2019-03-11 14:31:58: priority queue measurements]

Bottom line: I am trying to reduce the number of packages that behave weirdly, in order to understand the timeout issue.

Smoke tests have been passing consistently for the past 16 hours with this PR, which means that using Send here is just fine, at least according to our current test suite.

@zelig do we have any actual test that shows that we must use a priority queue for send RPC calls? Generally all our RPC handlers are very fast and the handleIncoming loop is really tight, so the priority queue doesn't make much sense to me.

Note that when the messages stayed for that long in the queue, there were no error messages logged that suggest queue contention.

Contributor


Good job, @nonsense. Could you try changing the syncer streamer priority from top to mid, to understand what was happening?

@nonsense nonsense force-pushed the pq-measurements branch 4 times, most recently from 65adf5a to 3b1b81b Compare March 18, 2019 10:25
@nonsense nonsense requested a review from janos March 20, 2019 13:42
Contributor

@zelig zelig left a comment


one minor comment

err := trackChunks(randomBytes[:])
if err != nil {
log.Error(err.Error())
// trigger debug functionality on randomBytes
Contributor


do we need to trigger this even if err is nil?

Member Author


Yes, so that we can compare successful runs against failed runs.

@nonsense nonsense merged commit baded64 into ethereum:master Mar 20, 2019
@acud acud deleted the pq-measurements branch March 21, 2019 09:51
kiku-jw pushed a commit to kiku-jw/go-ethereum that referenced this pull request Mar 29, 2019
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Dec 10, 2024
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Dec 10, 2024
gzliudan added a commit to gzliudan/XDPoSChain that referenced this pull request Dec 13, 2024
JukLee0ira pushed a commit to JukLee0ira/XDPoSChain that referenced this pull request Dec 16, 2024
JukLee0ira pushed a commit to JukLee0ira/XDPoSChain that referenced this pull request Dec 16, 2024
JukLee0ira pushed a commit to JukLee0ira/XDPoSChain that referenced this pull request Dec 20, 2024
JukLee0ira pushed a commit to JukLee0ira/XDPoSChain that referenced this pull request Dec 20, 2024
JukLee0ira pushed a commit to JukLee0ira/XDPoSChain that referenced this pull request Dec 22, 2024
JukLee0ira pushed a commit to JukLee0ira/XDPoSChain that referenced this pull request Jan 2, 2025
4 participants