swarm: add a basic metrics tracer #1973

Merged
merged 6 commits into from
Jan 27, 2023

Conversation

marten-seemann
Contributor

@marten-seemann marten-seemann commented Jan 2, 2023

Fixes #1910.

Due to our experience with OpenCensus (#1955), this now uses Prometheus. There's a benchmark test that confirms that (by using a sync.Pool for the slices containing the labels) we're not allocating at all.
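For readers who want to see the idea in isolation, here is a minimal sketch of pooling label slices with a sync.Pool and measuring allocations with a benchmark. It is not the code from this PR: the bodies of getStringSlice and putStringSlice are guesses based on their names, and observe, recordConn and sink are hypothetical stand-ins for a variadic Prometheus call such as WithLabelValues.

```go
package main

import (
	"fmt"
	"sync"
	"testing"
)

// stringPool holds reusable label slices. Storing *[]string instead of
// []string avoids allocating when the value is boxed into an interface.
var stringPool = sync.Pool{New: func() any {
	s := make([]string, 0, 8)
	return &s
}}

// getStringSlice returns an empty slice from the pool (assumed implementation).
func getStringSlice() *[]string {
	s := stringPool.Get().(*[]string)
	*s = (*s)[:0]
	return s
}

// putStringSlice hands the slice back for reuse (assumed implementation).
func putStringSlice(s *[]string) { stringPool.Put(s) }

// sink keeps the compiler from optimizing the call away.
var sink int

// observe is a hypothetical stand-in for a variadic metrics call.
func observe(labels ...string) { sink += len(labels) }

// recordConn builds the label list in a pooled slice and passes it on.
func recordConn(transport, security string) {
	tags := getStringSlice()
	defer putStringSlice(tags)
	*tags = append(*tags, transport, security)
	observe(*tags...)
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	res := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			recordConn("quic", "tls")
		}
	})
	fmt.Println("allocs/op:", res.AllocsPerOp())
}
```

The pooled slice has to go back via putStringSlice before the next call for this to work, and the garbage collector may occasionally empty the pool, so a rare allocation is still possible in practice.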

@marten-seemann marten-seemann force-pushed the circuitv2-transport-connstate branch 2 times, most recently from 3328610 to 52034ff Compare January 7, 2023 03:03
@p-shahi p-shahi mentioned this pull request Jan 9, 2023
35 tasks
@marten-seemann
Contributor Author

Here are some preliminary results, obtained by running a Kubo node with this branch:
[image]

@marten-seemann marten-seemann changed the base branch from circuitv2-transport-connstate to master January 22, 2023 05:55
@marten-seemann
Contributor Author

I added the Grafana dashboard. There's probably a lot that can be improved on that front; suggestions (and PRs...) are welcome.

What's really annoying here is that Grafana is too stupid to apply consistent colors across panels. The dashboard is soooo much easier to read if, for example, QUIC always has the same color. You can set colors manually via so-called overrides, but those are per panel. So I defined the overrides for all our transports in one panel and then copy-pasted a long block of JSON into the other panels. Apparently that's the only way to do it. 🤢
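One way to avoid hand-copying the override block would be to generate it once and paste the same output into every panel. The sketch below assumes that Grafana panel JSON accepts color overrides as entries under fieldConfig.overrides with a byName matcher and a fixed color property. The transport names and colors are made up for illustration and are not taken from the dashboard in this PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// These types mirror the assumed shape of one Grafana field override.
type colorValue struct {
	Mode       string `json:"mode"`
	FixedColor string `json:"fixedColor"`
}

type property struct {
	ID    string     `json:"id"`
	Value colorValue `json:"value"`
}

type matcher struct {
	ID      string `json:"id"`
	Options string `json:"options"`
}

type override struct {
	Matcher    matcher    `json:"matcher"`
	Properties []property `json:"properties"`
}

// transportOverrides turns (series name, color) pairs into override entries.
// A slice of pairs is used instead of a map to keep the output order stable.
func transportOverrides(colors [][2]string) []override {
	out := make([]override, 0, len(colors))
	for _, c := range colors {
		out = append(out, override{
			Matcher: matcher{ID: "byName", Options: c[0]},
			Properties: []property{{
				ID:    "color",
				Value: colorValue{Mode: "fixed", FixedColor: c[1]},
			}},
		})
	}
	return out
}

func main() {
	// Hypothetical transport-to-color assignment.
	colors := [][2]string{{"quic", "green"}, {"tcp", "blue"}, {"websocket", "orange"}}
	b, err := json.MarshalIndent(transportOverrides(colors), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

This does not remove the per-panel duplication in the dashboard file; it only keeps the color assignment in one place.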

@MarcoPolo
Collaborator

I know it's asking a bit, but is there an open grafana dashboard I can see?

Collaborator

@MarcoPolo MarcoPolo left a comment

This is so cool! I can't believe we haven't had this yet.

Just some nits. I'll approve as soon as they're fixed.

)

func BenchmarkMetricsConnOpen(b *testing.B) {
	b.ReportAllocs()
Collaborator

Can we check that this is indeed some low number of allocs?

Contributor Author

In a unit test? Benchmarks are currently not run on CI at all.

Collaborator

Yup
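Since benchmarks are not run on CI, one option for the check requested above is testing.AllocsPerRun, which can be asserted in an ordinary unit test. The sketch below is only an illustration: checkAllocs, reuse and allocate are hypothetical names, and the functions under test are placeholders rather than the tracer from this PR.

```go
package main

import (
	"fmt"
	"testing"
)

var (
	buf   = make([]string, 0, 8) // preallocated once and reused
	fresh []string               // global, so the slice below escapes to the heap
)

// reuse appends into the preallocated buffer and should not allocate.
func reuse() { buf = append(buf[:0], "quic", "tls") }

// allocate creates a new backing array on every call.
func allocate() { fresh = make([]string, 0, 8) }

// checkAllocs (hypothetical helper) fails if f allocates more than max
// times per run, averaged over 1000 runs.
func checkAllocs(max float64, f func()) error {
	if got := testing.AllocsPerRun(1000, f); got > max {
		return fmt.Errorf("expected at most %v allocs per run, got %v", max, got)
	}
	return nil
}

func main() {
	fmt.Println("reuse:", checkAllocs(0, reuse))
	fmt.Println("allocate:", checkAllocs(0, allocate))
}
```

In a real test the same call would sit inside a func TestXxx(t *testing.T) and report failures through t.Fatalf instead of returning an error.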

if s.metricsTracer != nil {
	connState := connC.ConnState()
	s.metricsTracer.OpenedConnection(network.DirOutbound, connC.RemotePublicKey(), connState)
	s.metricsTracer.CompletedHandshake(time.Since(start), connState)
Collaborator

Do we want to do this on the listen side as well?

Contributor Author

That would require us to put it in the transport. The swarm doesn't know when the handshake of an incoming connection started.

p2p/net/swarm/swarm_metrics.go
tags := getStringSlice()
defer putStringSlice(tags)
*tags = appendConnectionState(*tags, cs)
connHandshakeLatency.WithLabelValues(*tags...).Observe(t.Seconds())
Collaborator

It would be great if we could know if this was an early muxer negotiation.

Collaborator

Could we add this to the ConnectionState struct?
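As a sketch of what that suggestion could look like, the struct below carries a boolean for early muxer negotiation and the label helper emits it as an extra label value. The field names and the body of appendConnectionState are assumptions for illustration. The actual ConnectionState type and label set in go-libp2p may differ.

```go
package main

import "fmt"

// ConnectionState is a hypothetical, simplified connection description.
type ConnectionState struct {
	Transport                 string
	Security                  string
	StreamMultiplexer         string
	UsedEarlyMuxerNegotiation bool // the flag proposed in the review comment
}

// appendConnectionState appends one label value per field, in a fixed order
// that has to match the order of the label names on the metric.
func appendConnectionState(tags []string, cs ConnectionState) []string {
	early := "false"
	if cs.UsedEarlyMuxerNegotiation {
		early = "true"
	}
	return append(tags, cs.Transport, cs.Security, cs.StreamMultiplexer, early)
}

func main() {
	cs := ConnectionState{
		Transport:                 "tcp",
		Security:                  "tls",
		StreamMultiplexer:         "yamux",
		UsedEarlyMuxerNegotiation: true,
	}
	tags := appendConnectionState(make([]string, 0, 8), cs)
	fmt.Println(tags)
}
```

Adding a label value this way also means adding a matching label name when the metric is registered, and it multiplies the number of time series by two.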

@MarcoPolo MarcoPolo merged commit 3919359 into master Jan 27, 2023
@MarcoPolo
Collaborator

Merging because I can't wait for this feature!

Development

Successfully merging this pull request may close these issues.

swarm: minimal set of metrics
2 participants