Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(server): Use global config from file if provided #2458

Merged
merged 41 commits into from
Sep 12, 2023
Merged

Conversation

TBS1996
Copy link
Contributor

@TBS1996 TBS1996 commented Sep 4, 2023

Provides a way to load global config statically from a file.

  • If the relay mode is in managed mode, this file will be ignored as we want in that case to fetch from upstream.
  • If it's not in managed mode, it will attempt the load the file, if it exists it will serve it, if it doesn't exist, it will serve the default config. If it exists but we fail to open/parse it, an error will be logged and we will serve the default.

The location of the static config will be in the same folder as config.yml and credentials.json.

@@ -1250,6 +1252,7 @@ impl ConfigObject for ConfigValues {
pub struct Config {
values: ConfigValues,
credentials: Option<Credentials>,
global_config: Option<Arc<GlobalConfig>>,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't 100% sure if i should put it here or not. My reasoning was that if we fail to parse the file, it's better to fail early than later, and it just seems idiomatic within the codebase that we don't do these kind of risky IO-stuff in the services.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a precedent where we defer the deserialization of a static config file to a service:

/// Get filename for static project config.
pub fn project_configs_path(&self) -> PathBuf {
self.path.join("projects")
}

That service reloads the config every 10 seconds though, which I think is overkill. In short, I think your way of doing it is better because it fails early. I'll leave final review to @jan-auer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I concur with this: We can load the global configs directly from within the actual service like we do it for project configs. That keeps the responsibility of the relay-config crate lower. The overall implementation in this PR does look good, so it would just have to be moved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alright, I moved it now to the spawn handler, where I'll log an error if the file exists yet it fails to load it.

@TBS1996 TBS1996 marked this pull request as ready for review September 4, 2023 06:33
@TBS1996 TBS1996 requested a review from a team September 4, 2023 06:33
@TBS1996 TBS1996 self-assigned this Sep 4, 2023
Comment on lines 160 to 163
relay_log::info!("global config service starting");
// This request will trigger the request intervals when internal_rx receives the
// result from upstream.
Self::request_global_config(upstream.clone(), internal_tx.clone());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not make requests during initialization, as this can block relay startup. We should only do requests once the service is actually started, see my comment below in the spawn_handler.

relay-server/src/actors/global_config.rs Show resolved Hide resolved
Comment on lines 300 to 301
// Start timer for making new global config request.
self.schedule_fetch();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved the timer reset and scheduling away from the functions to make it more explicit, I think it seems a lot more clear here, lmk if any disagreements

Comment on lines 283 to 285
(RelayMode::Proxy | RelayMode::Static | RelayMode::Capture, _, None) => {
relay_log::info!(
"serving default global configs due to lacking static global config file"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a big fan of the implicit nature here, Id prefer we create the watch here, but then we'd have to pass it into handle_message and handle_upstream_result.

Copy link
Contributor

@iker-barriocanal iker-barriocanal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd merge this as-is, but I'm biased as I've also worked on this so I'll leave the approval to the team.

Comment on lines 267 to 269
relay_log::info!(
"serving default global configs due to lacking credentials"
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doing nothing means serving default config, because we initiate the watch with default in new();

Comment on lines 1468 to 1472
/// Updates the credentials without creating the file, for testing purposes.
pub fn update_credentials(&mut self) {
let creds = Credentials::generate();
self.credentials = Some(creds);
}
Copy link
Contributor Author

@TBS1996 TBS1996 Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to conditionally compile such functions only on testing? I tried the cfg(test) feature, but when i run make test-rust-all it fails to find the function, despite that it should compile in testing mode. Strangely, running the tests directly from vscode works.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also made some attempts playing around with features and didn't manage. I'm open to updating this if somebody knows a better way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The#[cfg(test)] does not propagate to the external crates, and only work within the current one.

It's possible to achieve this, if really really necessary with features, by e.g. adding testing feature to the crate and hiding all the functionality behind that feature. Then in the crate which has to use this testing functionality you will need to add maybe dev dependency and enable the feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now addressed by two subsequent changes:

  • At first, we introduced a test feature like @olksdr described above
  • Later, we updated the existing regenerate_credentials method and added a flag to opt out of saving to a file.

CHANGELOG.md Outdated Show resolved Hide resolved
relay-dynamic-config/src/global.rs Outdated Show resolved Hide resolved
relay-dynamic-config/src/global.rs Show resolved Hide resolved
relay-dynamic-config/src/global.rs Outdated Show resolved Hide resolved
relay-dynamic-config/src/global.rs Outdated Show resolved Hide resolved
Comment on lines 265 to 266
RelayMode::Managed => {
if self.config.has_credentials() {
Copy link
Member

@jan-auer jan-auer Sep 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a thought for simplifying this, what if we simply match for managed mode and catch the rest collectively?

if config.relay_mode() == RelayMode::Managed && config.has_credentials() {
    // ...
} else {
    // ...
}

The case that managed mode is enabled but there are no credentials is caught during startup, so that would be unreachable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I think being explicit provides more benefits, mostly ease of understanding and being more defensive to changes.

  • It's harder to understand the case where relay mode is managed and there aren't credentials, besides that it requires prior knowledge of how relay startup works. We could add a comment, but I don't think that's a good alternative to avoid an if branch.
  • Changes to relay startup don't impact the global config service negatively.
  • The exhaustive match on relay modes ensures that we must consider the global config service if adding new modes to relay.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a concern of the global config service to understand whether or not credentials are needed. It only needs to know that if it's in managed mode, it should make requests. Whether or not all preconditions for upstream are met to make requests is outside of its scope.

The issue with adding this here is that it duplicates validation logic or invariants across the codebase, which makes them hard to maintain or change. It creates tighter-than-necessary coupling.

tokio::time::pause();

let (upstream, handle) = mock_service("upstream", (), |(), _| {
panic!();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of #[should_panic], which is intended to be used for expected panics, we should rather make sure the upstream service returns a default config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, you suggest returning a response as if it were the actual upstream service. To actually return a global config we need to do an async operation in the closure, currently unstable in Rust. I think the panic makes the test easy to understand, and removes the complexity of dealing with mock_service. Maybe I'm missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm here with @jan-auer . The thing is with unit tests, it can also panic for completely different reason when something changes in the code, and therefore the tests will be testing completely wrong thing.

It's better to make sure the behaviour is correct, but checking the status of some operation, like in this case that resulting config is default

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TBS1996 and I spent some time on this today. In order to move forward with this PR, let's keep the should_panic for now. He'll link a follow-up issue here to clean this up.

We'll implement some utilities to make mocking upstream requests easier. The boilerplate needed to achieve this now for a single test is too complex.

Comment on lines 1468 to 1472
/// Updates the credentials without creating the file, for testing purposes.
pub fn update_credentials(&mut self) {
let creds = Credentials::generate();
self.credentials = Some(creds);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The#[cfg(test)] does not propagate to the external crates, and only work within the current one.

It's possible to achieve this, if really really necessary with features, by e.g. adding testing feature to the crate and hiding all the functionality behind that feature. Then in the crate which has to use this testing functionality you will need to add maybe dev dependency and enable the feature.

relay-dynamic-config/src/global.rs Outdated Show resolved Hide resolved
relay-dynamic-config/src/global.rs Show resolved Hide resolved
tokio::time::pause();

let (upstream, handle) = mock_service("upstream", (), |(), _| {
panic!();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm here with @jan-auer . The thing is with unit tests, it can also panic for completely different reason when something changes in the code, and therefore the tests will be testing completely wrong thing.

It's better to make sure the behaviour is correct, but checking the status of some operation, like in this case that resulting config is default

match GlobalConfig::load(self.config.path()) {
Ok(Some(from_file)) => {
relay_log::info!("serving static global config loaded from file");
self.global_config_watch.send(Arc::new(from_file)).ok();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the docs on subscribe

All messages sent before this call to subscribe are initially marked as seen by the new Receiver.

Can it happen, that e.g. EnvelopeProcessorService will never gets update over the watch, since it will subscribe to the watch after this static config is sent? And then we actually never send any other update?

Or maybe I'm missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Every subscriber must use the initial value that the subscription has before awaiting changed(). Essentially, this would be the flow:

let subscription = global_config.send(Subscribe)?;
self.config = subscription.borrow().clone();

while let Ok(()) = subscription.changed().await {
  self.config = subscription.borrow().clone();
}

or

let subscription = global_config.send(Subscribe)?;

loop {
  self.config = subscription.borrow().clone();
  subscription.changed().await?;
}


loop {
tokio::select! {
biased;

() = &mut self.fetch_handle => self.update_global_config(),
() = &mut self.fetch_handle => self.request_global_config(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the static mode we actually never get this fetch_handle trigger? Is it right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's right. The fetch handle is only scheduled when handling a request, and a request is only made in managed modes; thus, in static mode, this fetch handle is never resolved.

@TBS1996 TBS1996 merged commit 5cb3b09 into master Sep 12, 2023
@TBS1996 TBS1996 deleted the tor/static_config branch September 12, 2023 11:31
jan-auer added a commit that referenced this pull request Sep 13, 2023
* master:
  fix(server): Immediatly set global config to watch contents. (#2505)
  feat(server): Use global config from file if provided (#2458)
  ref(metrics): Remove deprecated `Metric` type and improve docs (#2503)
  ref(projconfig): Unsupport v4 in the endpoint (#2500)
  docs(dashboard): Add instructions for required WASM target (#2502)
  release: 0.8.30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants