Desired Mainnet Merge Metrics Tracking #11367

Closed
36 tasks done
rauljordan opened this issue Aug 30, 2022 · 1 comment · Fixed by #11386
rauljordan commented Aug 30, 2022

💎 Issue

Background

As part of our work on improving the Prometheus metrics we use to monitor the merge, there are several more things we want to track at a finer granularity in Prysm. This is the tracking issue for those items.

Global Blockchain Health

P2P Insights

  • Gossipsub scoring by topic and node (gauge)
  • Connected peer score distributions (gauge)
  • Inbound / outbound libp2p bandwidth, measured in bytes/s (gauge)
  • Average peer score per client (gauge) (to be displayed as pie chart)
  • Connected peers by client type (gauge) (to be displayed as pie chart)
  • P2P RPC requests per slot (gauge)
  • Attestation errors per slot (gauge)
  • Subscribed peers per topic (gauge) (to be displayed as pie chart)
  • Non-timeout RPC errors in p2p (counter)
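
As a rough illustration of the two per-client items above (connected peers by client type, and average peer score per client), the aggregation could be sketched as below. The `Peer` and `ClientStats` types and the `statsByClient` helper are hypothetical names made up for this sketch; they are not Prysm types, and the real values would be exported as Prometheus gauges rather than printed.

```go
package main

import (
	"fmt"
	"sort"
)

// Peer is a hypothetical, simplified view of a connected peer.
type Peer struct {
	Client string  // client type, e.g. "prysm" or "lighthouse"
	Score  float64 // peer score as reported by the scorer
}

// ClientStats holds the two per-client values a gauge would expose.
type ClientStats struct {
	Count    int     // connected peers of this client type
	AvgScore float64 // mean score across those peers
}

// statsByClient groups peers by client type and computes, for each
// client, the number of peers and their average score.
func statsByClient(peers []Peer) map[string]ClientStats {
	sums := make(map[string]float64)
	counts := make(map[string]int)
	for _, p := range peers {
		sums[p.Client] += p.Score
		counts[p.Client]++
	}
	out := make(map[string]ClientStats, len(counts))
	for client, n := range counts {
		out[client] = ClientStats{Count: n, AvgScore: sums[client] / float64(n)}
	}
	return out
}

func main() {
	peers := []Peer{
		{Client: "prysm", Score: 2.0},
		{Client: "prysm", Score: 4.0},
		{Client: "lighthouse", Score: 1.0},
	}
	stats := statsByClient(peers)

	// Sort client names so the printed order is stable.
	clients := make([]string, 0, len(stats))
	for c := range stats {
		clients = append(clients, c)
	}
	sort.Strings(clients)
	for _, c := range clients {
		fmt.Printf("%s: peers=%d avg_score=%.2f\n", c, stats[c].Count, stats[c].AvgScore)
	}
}
```

Each map entry corresponds to one pie-chart slice: the client name would become a label value and `Count` or `AvgScore` the gauge value.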

Execution Client Insights

  • Number of valid, invalid, and syncing execution payload statuses (count)

    • new_payload_invalid_node_count
    • new_payload_valid_node_count
    • new_payload_optimistic_node_count
    • forkchoice_updated_valid_node_count
    • forkchoice_updated_optimistic_node_count
    • forkchoiceUpdatedInvalidNodeCount (see "Add new metrics" #11374)
  • Transaction count per slot (gauge) (see "Add new metrics" #11374)

    • txs_per_slot_count
  • Error counts for engine calls (see "Add new metrics" #11374)

    • execution_parse_error_count and more below that line...

| Code | Message | Meaning |
|---|---|---|
| -32700 | Parse error | Invalid JSON was received by the server. |
| -32600 | Invalid Request | The JSON sent is not a valid Request object. |
| -32601 | Method not found | The method does not exist / is not available. |
| -32602 | Invalid params | Invalid method parameter(s). |
| -32603 | Internal error | Internal JSON-RPC error. |
| -32000 | Server error | Generic client error while processing request. |
| -38001 | Unknown payload | Payload does not exist / is not available. |
| -38002 | Invalid forkchoice state | Forkchoice state is invalid / inconsistent. |
| -38003 | Invalid payload attributes | Payload attributes are invalid / inconsistent. |
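
One way to picture "error counts for engine calls" is a lookup from the error codes listed above to one counter per code. The sketch below is only an assumption about how that could look: `counterName` is a hypothetical helper, and apart from `execution_parse_error_count` (named in this issue) the counter names it derives are illustrative, not the names Prysm uses.

```go
package main

import (
	"fmt"
	"strings"
)

// engineErrors maps the error codes listed above to their messages.
var engineErrors = map[int]string{
	-32700: "Parse error",
	-32600: "Invalid Request",
	-32601: "Method not found",
	-32602: "Invalid params",
	-32603: "Internal error",
	-32000: "Server error",
	-38001: "Unknown payload",
	-38002: "Invalid forkchoice state",
	-38003: "Invalid payload attributes",
}

// counterName derives a hypothetical counter name from an error code,
// e.g. -32700 ("Parse error") becomes "execution_parse_error_count".
// It reports false for codes that are not in the table.
func counterName(code int) (string, bool) {
	msg, ok := engineErrors[code]
	if !ok {
		return "", false
	}
	name := strings.ReplaceAll(strings.ToLower(msg), " ", "_")
	return "execution_" + name + "_count", true
}

func main() {
	// Stand-in for real counters: a plain map incremented per error.
	counts := make(map[string]int)
	for _, code := range []int{-32700, -38001, -32700, 12345} {
		name, ok := counterName(code)
		if !ok {
			fmt.Printf("code %d is not in the table, skipping\n", code)
			continue
		}
		counts[name]++
	}
	fmt.Println(counts["execution_parse_error_count"], counts["execution_unknown_payload_count"])
}
```

In a real beacon node each name would back a Prometheus counter that is incremented where the engine call's error is handled; the map here only stands in for that.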

Performance Insights

  • Gossip verification times
    • Verification times and processing rates before re-gossiping, for atts and blocks (histogram)
    • Aggregate attestation gossip verification time (histogram)
    • Aggregate atts processed per minute (histogram)
    • Aggregate atts success rate (histogram)
  • Saving to DB times
    • Block saving to DB performance (histogram)
    • State performance (histogram)
  • Reads from DB times
    • Blocks (histogram)
    • States (histogram)
  • Block processing total time
    • IO (histogram)
    • State transition times (histogram)
    • Gossip verification time (histogram)
  • Attestation processing times (histogram 5m rate of sum / 5m rate of count) (see "Add new metrics" #11374)
  • Forkchoice processing times (histogram) (see "Add new metrics" #11374)
    • Which specific parts should be split out of this? Maybe more granular processing-time histograms?
  • Block assembly times from our validator clients (histogram) (Terence: use the gRPC metric for GetBeaconBlock)
  • Attestation aggregation latencies (histogram) (Terence: use the gRPC metric for SubmitAggregateSelectionProof)
  • State regeneration latencies (histogram)
  • Validator builder response times (histogram) under builder/metric.go
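
The "5m rate of sum / 5m rate of count" noted for attestation processing times is the usual way to read an average latency off a histogram: the window length cancels out, leaving the change in the sum divided by the change in the count. A minimal sketch of that arithmetic, with a hypothetical `Sample` type and `avgOverWindow` helper that are not part of Prysm or the Prometheus client:

```go
package main

import "fmt"

// Sample is a hypothetical snapshot of a histogram's cumulative
// sum (in seconds) and observation count at one point in time.
type Sample struct {
	Sum   float64
	Count uint64
}

// avgOverWindow returns the mean observed value between two samples:
// (rate of sum) / (rate of count) = delta(sum) / delta(count).
// It reports false when no new observations arrived in the window
// (or the counter was reset), since the average is then undefined.
func avgOverWindow(start, end Sample) (float64, bool) {
	if end.Count <= start.Count {
		return 0, false
	}
	return (end.Sum - start.Sum) / float64(end.Count-start.Count), true
}

func main() {
	// Two samples taken five minutes apart (values are made up).
	start := Sample{Sum: 10, Count: 100}
	end := Sample{Sum: 25, Count: 130}

	if avg, ok := avgOverWindow(start, end); ok {
		fmt.Printf("average processing time over the window: %.3fs\n", avg)
	}
}
```

In a dashboard the same ratio would be written as a query over the histogram's `_sum` and `_count` series; the Go version only spells out what that query computes.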

rauljordan self-assigned this Aug 30, 2022
rauljordan added the Merge PRs related to the great milestone the merge label Aug 30, 2022

@terencechain
Member

I'm working on these:

  • Attestation processing times (histogram 5m rate of sum / 5m rate of count)
  • Forkchoice processing times (histogram)
    • Which specific parts should be split out of this? Maybe more granular processing-time histograms?
  • Block assembly times from our validator clients (histogram)
  • Attestation aggregation latencies (histogram)
  • Validator builder response times (histogram)
