[RelayMiner]: add proxy.Ping(...) capability to test connectivity between relay servers and backend URLs #1037

Open
wants to merge 47 commits into
base: main

eddyzags

Summary

This PR adds the capability to test the connectivity between the Relay Servers and the Backend URLs in two ways.

  1. Safeguard at Startup:
    For every suppliers.[].service_config.backend_url referenced in the Relay Miner configuration file, the Relay Proxy verifies whether the network connection between the targeted backend_url and the relayminer process is working. If one or more connections cannot be established, the relay miner will not start.

  2. Configurable Ping HTTP server:
    The Relay Miner process listens for incoming requests and synchronously tests the connectivity of every referenced suppliers.[].service_config.backend_url. If one or more backend URLs are unreachable, the incoming request fails.

Based on the serverConfig.ServerType (for example, HTTP), each server type implements its own logic to test the connectivity.

Issue

Type of change

Select one or more:

  • New feature, functionality or library
  • Bug fix
  • Code health or cleanup
  • Documentation
  • Other (specify)

Testing

Documentation changes (only if making doc changes)

  • make docusaurus_start; only needed if you make doc changes

Local Testing (only if making code changes)

  • Unit Tests: make go_develop_and_test
  • LocalNet E2E Tests: make test_e2e
  • See quickstart guide for instructions

PR Testing (only if making code changes)

  • DevNet E2E Tests: Add the devnet-test-e2e label to the PR.
    • THIS IS VERY EXPENSIVE, so only do it after all the reviews are complete.
    • Optionally run make trigger_ci if you want to re-trigger tests without any code changes
    • If tests fail, try re-running failed tests only using the GitHub UI as shown here

Sanity Checklist

  • I have tested my changes using the available tooling
  • I have commented my code
  • I have performed a self-review of my own code; both comments & source code
  • I create and reference any new tickets, if applicable
  • I have left TODOs throughout the codebase, if applicable

Summary by CodeRabbit


  • New Features

    • Introduced a new configuration section for the ping functionality, allowing users to test backend connectivity within the relay miner's setup.
    • Added methods to handle ping requests, enhancing health check capabilities for relay servers.
  • Bug Fixes

    • Improved error handling during the server startup process if any relay server is unreachable.
  • Tests

    • Added tests for the new ping functionality to ensure operational integrity and reliability of the relay miner.

…ty between relay servers and backend URLs (#1)

* relayer: add RelayServers() method to RelayProxy interface; Add Ping(), ServiceIDs(), Forward() method to RelayServer interface; add RelayServers slice with helper method byServiceID

* relayer: add forward config entry

* relayer: implement ServiceIDs, Forward, and Ping methods for synchronous RPC server

* relayer: add RelayServers implementation for RelayProxy

* relayer: add Ping and Forward options

* relayer: integrate ping option

* relayer: add ServePing and ServeForward method to RelayMiner

* test proxy.Ping() in test + remove forward feature

* add serve ping test

* add doc

@Olshansk added the tooling, community, and nice-to-have labels Jan 22, 2025
@Olshansk added this to the Beta TestNet Iteration milestone Jan 22, 2025
Contributor

@bryanchriswhite bryanchriswhite left a comment

Thanks for picking this back up @eddyzags! 🙌

I have to stop here for today but this is looking great so far! 🚀
The biggest thing I haven't reviewed yet is the test (but I already saw the addition of go-mockdns, and I skimmed the test names 😉) and am looking forward to it.

Contributor

@bryanchriswhite bryanchriswhite Jan 24, 2025

Was this change intentionally persisted, and if so, how is it related to this feature?

I think this change should be reverted. My assumption is that this is the result of an older commit which was never reconciled completely with main:

  1. The yaml files referenced don't exist.
  2. The flags seem to be specifying the same/similar config as what's been removed from the relayminer configs that do exist. 🤔

Author

Sorry, I wasn't clear in my previous comments.

Was this change intentionally persisted, and if so, how is it related to this feature?

Yes, this change was made intentionally so that the Ping safeguard at startup succeeds for the relayminer with the default localnet configuration, and with any custom localnet configuration (link to localnet default configuration in the main branch). In the default localnet configuration, the Ollama Kubernetes deployment is not applied (ollama.enabled=false). However, the relayminer configuration files still reference Ollama suppliers even though the container isn't deployed (link to relayminer-1 configuration for localnet). With the newly introduced Ping safeguard at startup, the relayminer would fail continuously because the Ollama container isn't deployed.

To solve this issue, I found a way to dynamically define the relayminer's configuration based on the localnet configuration by modifying the poktrolld/Tiltfile. Hence, those modifications.

poktrolld users who deploy a relayminer without relying on the localnet will have to make sure that every config.suppliers[*].service_config.backend_url is up, running, and reachable before deploying the relayminer.

The yaml files referenced don't exist.

I disagree, they exist:

The flags seem to be specifying the same/similar config as what's been removed from the relayminer configs that do exist.

I cannot find that. Can you link me to the precise line in my fork that makes you think that please? 🙏🏾

Tiltfile (outdated, resolved)
Contributor

It seems to me that these relayminer configs (1-3) should be reverted.

Author

see comment here #1037

docusaurus/docs/operate/configs/relayminer_config.md (outdated, resolved)
pkg/relayer/proxy/synchronous.go (outdated, resolved)
pkg/relayer/relayminer_test.go (outdated, resolved; 4 threads)
server.Handler = http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
	sendJSONRPCResponse(test.t, w)
})
listener, err := net.Listen("tcp", supplierConfig.ServiceConfig.BackendUrl.Host)

Contributor

Why separate the listener from the server?

Author

By using a custom listener, and thereby decoupling listening from serving, we ensure that the HTTP server is already bound to its port in the test's main goroutine. This guarantees that the HTTP server(s) are ready to accept connections before the test cases run.

Previously, listening and serving were both handled inside a goroutine using the http.ListenAndServe function. This sometimes meant the HTTP server was not yet ready when the test cases began executing, resulting in flaky test failures.

Contributor

@bryanchriswhite bryanchriswhite Jan 28, 2025

Amazing! 👍 #PUC with that explanation, perhaps condensed, if possible.

@eddyzags
Author

Thanks for reviewing @bryanchriswhite ! Waiting for the rest of the review 🚀

Labels
community, nice-to-have, tooling
Projects
Status: 👀 In review

3 participants