@Fletch153 Fletch153 commented Jul 15, 2025

Bridge Status Reporter Service

Overview

Implements a service that polls External Adapter /status endpoints at configurable intervals and emits structured telemetry events via Beholder. Provides operational visibility into bridge health, configuration, and runtime state.

Implementation

Core Service (core/services/nodestatusreporter/bridgestatus/)

  • bridge_status_reporter.go - Service implementation with polling loop, HTTP client, error handling

Protobuf Schema (events/)

Configuration Integration

  • core/config/docs/core.toml - Added [BridgeStatusReporter] configuration section
  • core/config/toml/types.go - Configuration struct and setters
  • core/config/bridge_status_config.go - Configuration interface
  • core/services/chainlink/config_bridge_status.go - Service configuration implementation

Features

Service Behavior

  • Queries bridge registry on startup and at each polling interval
  • Polls bridge URLs with configurable status path (default: /status)
  • Continues operation when individual bridges fail
  • Configurable filtering of bridges without jobs or with HTTP errors
  • Graceful shutdown with context cancellation

Telemetry Data Structure

Emits BridgeStatusEvent containing:

  • Bridge metadata: name, adapter name/version, uptime
  • Runtime information: Node.js version, platform, architecture, hostname
  • Endpoint definitions: names, aliases, supported transports
  • Configuration parameters: name/value pairs, types, defaults
  • Associated job information: external job IDs and names
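To make the shape of that data more concrete, here is a hedged Go sketch that groups the listed fields into plain structs and decodes them from JSON. Every type name, field name, and JSON key below is an illustrative placeholder; the actual protobuf schema lives under events/ and the adapter response format is defined by the External Adapter, so neither should be inferred from this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// All types below are hypothetical and only mirror the bullet list above.
type Adapter struct {
	Name          string  `json:"name"`
	Version       string  `json:"version"`
	UptimeSeconds float64 `json:"uptimeSeconds"`
}

type Runtime struct {
	NodeVersion  string `json:"nodeVersion"`
	Platform     string `json:"platform"`
	Architecture string `json:"architecture"`
	Hostname     string `json:"hostname"`
}

type Endpoint struct {
	Name       string   `json:"name"`
	Aliases    []string `json:"aliases"`
	Transports []string `json:"transports"`
}

type ConfigParam struct {
	Name    string `json:"name"`
	Value   string `json:"value"`
	Type    string `json:"type"`
	Default string `json:"default"`
}

type Job struct {
	ExternalJobID string `json:"externalJobId"`
	Name          string `json:"name"`
}

// StatusSnapshot bundles one bridge's metadata with its associated jobs.
type StatusSnapshot struct {
	BridgeName    string        `json:"bridgeName"`
	Adapter       Adapter       `json:"adapter"`
	Runtime       Runtime       `json:"runtime"`
	Endpoints     []Endpoint    `json:"endpoints"`
	Configuration []ConfigParam `json:"configuration"`
	Jobs          []Job         `json:"jobs"`
}

// ParseSnapshot decodes a JSON document into a StatusSnapshot.
func ParseSnapshot(data []byte) (StatusSnapshot, error) {
	var s StatusSnapshot
	if err := json.Unmarshal(data, &s); err != nil {
		return StatusSnapshot{}, fmt.Errorf("decode snapshot: %w", err)
	}
	return s, nil
}

func main() {
	raw := []byte(`{
		"bridgeName": "example-bridge",
		"adapter": {"name": "example-adapter", "version": "0.0.1", "uptimeSeconds": 120},
		"endpoints": [{"name": "price", "aliases": ["crypto"], "transports": ["http"]}],
		"jobs": [{"externalJobId": "00000000-0000-0000-0000-000000000000", "name": "example-job"}]
	}`)
	snap, err := ParseSnapshot(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(snap.BridgeName, snap.Adapter.Name, len(snap.Endpoints), len(snap.Jobs))
}
```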

Configuration

[BridgeStatusReporter]
Enabled = false                    # Default: false
StatusPath = "/status"            # Default: "/status"
PollingInterval = "5m"            # Default: "5m"
IgnoreInvalidBridges = true       # Default: true
IgnoreJoblessBridges = false      # Default: false

Usage

Node Operators

Configure the service via TOML and ensure that External Adapters implement the status endpoint.

External Adapter Developers

Implement a GET /status endpoint that returns JSON with the required fields. The endpoint is available in adapters using ea-framework-js v2.7.0+.

Monitoring Systems

Consume BridgeStatusEvent messages via the Beholder telemetry pipeline.

@Fletch153 Fletch153 requested review from a team as code owners July 15, 2025 15:21
@github-actions

👋 Fletch153, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

@Fletch153 Fletch153 removed the draft label Jul 15, 2025
@Fletch153 Fletch153 force-pushed the feature/DF-21286/add_additional_telemetry branch 2 times, most recently from 3370cfd to e3ed976 Compare July 16, 2025 21:55
@Fletch153 Fletch153 force-pushed the feature/DF-21286/add_additional_telemetry branch from e3ed976 to 9feaab9 Compare July 16, 2025 21:58
@Fletch153 Fletch153 force-pushed the feature/DF-21286/add_additional_telemetry branch from 4348137 to 31b9c0e Compare July 23, 2025 21:08
@Fletch153 Fletch153 changed the title Initial working impl Added Bridge Status Reporter Service Jul 24, 2025
@Fletch153 Fletch153 force-pushed the feature/DF-21286/add_additional_telemetry branch from 32fa3a8 to 6e746af Compare July 25, 2025 14:37
cll-gg previously approved these changes Jul 28, 2025
@github-actions

Flakeguard Summary

Ran new or updated tests between develop and 89d3650 (feature/DF-21286/add_additional_telemetry).

Found Flaky Tests ❌

4 Results
| Name | Pass Ratio | Panicked? | Timed Out? | Race? | Runs | Successes | Failures | Skips | Package | Package Panicked? | Avg Duration | Code Owners |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TestConfig_Marshal/full | 0% | false | false | false | 3 | 0 | 3 | 0 | github.com/smartcontractkit/chainlink/v2/core/services/chainlink | false | 16.666666ms | Unknown |
| TestService_Close_AlreadyClosed | 33.3333% | false | false | true | 3 | 1 | 2 | 0 | github.com/smartcontractkit/chainlink/v2/core/services/nodestatusreporter/bridgestatus | false | 0s | Unknown |
| TestService_Start_AlreadyStarted | 0% | false | false | true | 3 | 0 | 3 | 0 | github.com/smartcontractkit/chainlink/v2/core/services/nodestatusreporter/bridgestatus | false | 0s | Unknown |
| TestService_Start_Enabled | 33.3333% | false | false | true | 3 | 1 | 2 | 0 | github.com/smartcontractkit/chainlink/v2/core/services/nodestatusreporter/bridgestatus | false | 0s | Unknown |

Artifacts

For detailed logs of the failed tests, please refer to the artifact failed-test-results-with-logs.json.

@Fletch153 Fletch153 force-pushed the feature/DF-21286/add_additional_telemetry branch from 6f6239e to 73be6f5 Compare July 28, 2025 15:26
@github-actions

Flakeguard Summary

Ran new or updated tests between develop and 73be6f5 (feature/DF-21286/add_additional_telemetry).

Found Flaky Tests ❌

1 Results
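| Name | Pass Ratio | Panicked? | Timed Out? | Race? | Runs | Successes | Failures | Skips | Package | Package Panicked? | Avg Duration | Code Owners |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| TestConfig_Marshal/full | 0% | false | false | false | 3 | 0 | 3 | 0 | github.com/smartcontractkit/chainlink/v2/core/services/chainlink | false | 10ms | Unknown |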

Artifacts

For detailed logs of the failed tests, please refer to the artifact failed-test-results-with-logs.json.

@Fletch153 Fletch153 added this pull request to the merge queue Jul 29, 2025
Merged via the queue into develop with commit df8ed63 Jul 29, 2025
212 of 214 checks passed
@Fletch153 Fletch153 deleted the feature/DF-21286/add_additional_telemetry branch July 29, 2025 15:04
patrickhuie19 pushed a commit that referenced this pull request Jul 30, 2025
* Initial working impl

* update naming from metric to status

* Fixes + update tests

* Small fixes

* Migrate JSON beholder msg to protobuf

* Add job identification to polling

* Include external job IDs

* Add job name to beholder output

* Rename ea status to bridge status

* Fixed issue with protobuf marshaling

* Fixed issues with beholder not correctly emitting

* Fixed issue sending nil values

* add go generate d.

* Add README

* Changeset

* Fix build issues

* go gen

* Fixed PR check issues

* Update test fixtures

* Additional test fixes

* PR fixes

* Fix racey tests

* Fix test failure

* Improved configuration resilience
4 participants