Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Instrumentation] Opentracing on P2P Messages #438

Closed
3 tasks done
prestonvanloon opened this issue Aug 26, 2018 · 20 comments
Closed
3 tasks done

[Instrumentation] Opentracing on P2P Messages #438

prestonvanloon opened this issue Aug 26, 2018 · 20 comments
Labels
Enhancement New feature or request
Milestone

Comments

@prestonvanloon
Copy link
Member

prestonvanloon commented Aug 26, 2018

Similar to #437, this task is to add tracing to Prysm, starting with P2P messages.

https://github.com/opentracing/opentracing-go
https://www.jaegertracing.io/

The ideal use case is to trace a P2P request from start to finish to determine where bottlenecks might be. Imagine that a request for a block body took 4 seconds from receiving the request to the reply. A tracing framework would be able to help us understand where the slowdown might be.

Some of the requirements to close this issue:

  • Brief design as a comment in this issue.
  • An easily reusable p2p middleware adapter to trace incoming and outgoing p2p messages
  • Add flag to disable tracing --disable-tracing.

If there are more requirements discovered in the design, please add them to the issue in a comment or editing this description.

@prestonvanloon prestonvanloon added this to the Sapphire milestone Aug 26, 2018
@gitcoinbot
Copy link

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


This issue now has a funding of 250.0 DAI (250.0 USD @ $1.0/DAI) attached to it.

@bakaoh
Copy link

bakaoh commented Sep 5, 2018

Brief design:

@mkosowsk
Copy link

mkosowsk commented Sep 5, 2018

Hi @d1vyank, do you have a brief write-up of your plan of attack for this issue?

@vs77bb
Copy link

vs77bb commented Sep 5, 2018

@bakaoh Are you planning on applying for this one, as well? Please do so here, if so!

@bakaoh
Copy link

bakaoh commented Sep 6, 2018

@vs77bb -- look like @d1vyank already discussed the details while i didn't quite sure i understand it right

@d1vyank
Copy link
Contributor

d1vyank commented Sep 6, 2018

Hi @mkosowsk, I'm still working out the details but here is the approach I've mapped out so far:

  • After a brief discussion with @prestonvanloon on Discord I propose that we use OpenCensus+Jaeger for trace collection in place of OpenTracing+Jaeger. OpenCensus provides support for Go 1.11's execution tracer which gives us better insight in Go's runtime events (GC, scheduling, etc) and how they impact our requests.
  • Implement execution tracing within a node for a breakdown on how a request's processing time is spent (for example, the effect of an expensive crypto operation or DB access). This will involve defining regions and tasks around these operations in the code base.
  • Implement distributed tracing for p2p messages. This trace would show us the flow of messages between nodes within the system — I'm still investigating how to achieve this with libp2p as the transport. These traces could be hard to read for things like floodsub and incomplete as we can only obtain them from peers that have tracing enabled and our Jaeger instance can reach. I will try to use the adapter introduced in Split P2P Topics And Introduce Middleware (Adapters) #421 to implement this.
  • Integrate with Jaeger for visualization, document, and provide examples.

@mkosowsk
Copy link

mkosowsk commented Sep 6, 2018

@d1vyank looks like a solid approach to me! @vs77bb please approve at your earliest convenience :)

@spm32
Copy link

spm32 commented Sep 7, 2018

@mkosowsk Agreed, thanks for the detailed outline @d1vyank, approving you now! :)

@prestonvanloon
Copy link
Member Author

@d1vyank I merged support for go 1.11 in #490 so you should be unblocked :)

@d1vyank
Copy link
Contributor

d1vyank commented Sep 11, 2018

Thank you! Quick update: I've been reading up on the Ethereum 2.0 architecture to understand the types of p2p messages and their flows so that I can trace them.

@d1vyank
Copy link
Contributor

d1vyank commented Sep 14, 2018

Update: I've put up a PR for the first part (execution tracing), reviews appreciated!

The second part (distributed tracing) requires a bit of design. I've been looking into the libp2p spec and couldn't find any provision for tracing related information to be transported at the protocol level. libp2p simply offers data streams between peers without defining what flows through them, unlike gRPC which allows metadata to be attached to request headers. I propose that we define a metadata field in the p2p message protobuf at the application layer to allow us to propagate traces between nodes.

@d1vyank
Copy link
Contributor

d1vyank commented Sep 21, 2018

Update: Execution tracing is merged in. Distributed tracing is blocked on #528. To propagate traces between nodes we need to attach trace information to p2p messages. However, the wire format of these messages has not been defined yet.

I'd be happy to look at this again once this is decided, if I have availability at the time.

@mkosowsk
Copy link

@prestonvanloon do you feel that @d1vyank's efforts warrant a payout of this bounty or is distributed tracing required? (looks like this is currently blocked by #528). Thanks! 👍

@prestonvanloon
Copy link
Member Author

@d1vyank Let's visit that issue at a later date. We're still deciding on the wire protocol for the future.

This is actually a topic for discussion tomorrow: ethereum/eth2.0-pm#8

@mkosowsk, I think this is good to go for the original request. Sorry about the delay!

@mkosowsk
Copy link

@prestonvanloon thanks! @ceresstation please pay out at your earliest convenience :)

@spm32
Copy link

spm32 commented Sep 27, 2018

@d1vyank can you start / submit work on the issue again? It currently shows that you've stopped work. I'll pay you out as soon as that's done :)

@gitcoinbot
Copy link

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Workers have applied to start work.

These users each claimed they can complete the work by 2 months, 1 week from now.
Please review their action plans below:

1) d1vyank has applied to start work (Funders only: approve worker | reject worker).

Restarting work to collect bounty for execution tracing.

Learn more on the Gitcoin Issue Details page.

@gitcoinbot
Copy link

gitcoinbot commented Sep 27, 2018

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work has been started.

These users each claimed they can complete the work by 2 months from now.
Please review their action plans below:

1) d1vyank has been approved to start work.

Restarting work to collect bounty for execution tracing.

Learn more on the Gitcoin Issue Details page.

@spm32
Copy link

spm32 commented Sep 28, 2018

@d1vyank Just need you to submit work and then I can pay you out here :)

@gitcoinbot
Copy link

Issue Status: 1. Open 2. Started 3. Submitted 4. Done


Work for 250.0 DAI (250.0 USD @ $1.0/DAI) has been submitted by:

  1. @d1vyank

@ceresstation please take a look at the submitted work:


Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

7 participants