Oximeter creates and destroys qorb pools while waiting for schema migration #7179

Open
jgallagher opened this issue Nov 27, 2024 · 0 comments

When Oximeter starts up and ClickHouse is on an old schema version, it enters an infinite retry loop, waiting for an operator to run the schema migration:

    let agent = backoff::retry_notify(
        backoff::retry_policy_internal_service(),
        make_agent,
        log_client_failure,
    )
    .await
    .expect("Expected an infinite retry loop initializing the timeseries database");

However, in every iteration of this loop, we create a new qorb pool and then drop it when returning the error indicating the version isn't what we expect:

    let client = Client::new_with_pool(native_resolver, &log);
    match client.check_db_is_at_expected_version().await {
        Ok(_) => {}
        Err(oximeter_db::Error::DatabaseVersionMismatch {
            found: 0,
            ..
        }) => {
            debug!(log, "oximeter database does not exist, creating");
            client
                .initialize_db_with_version(
                    replicated,
                    oximeter_db::OXIMETER_VERSION,
                )
                .await?;
        }
        Err(e) => return Err(Error::from(e)),

In combination with qorb versions prior to the fix in oxidecomputer/qorb#78, this results in 8 leaked TCP connections for every retry attempt.
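One possible direction, sketched here only as an illustration, is to construct the pool once and retry just the version check against it. Everything in this sketch is hypothetical: `Pool`, `check_version`, and `wait_for_schema` are stand-ins using only the standard library, not the qorb or oximeter APIs, and the `versions` slice simulates the schema version observed on each attempt.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a connection pool (not qorb's real API).
struct Pool;

impl Pool {
    /// Records each construction in `created` so pool churn is observable.
    fn new(created: &Cell<usize>) -> Pool {
        created.set(created.get() + 1);
        Pool
    }
}

/// Hypothetical version check: fails while the observed schema version
/// differs from the expected one.
fn check_version(_pool: &Pool, found: u64, expected: u64) -> Result<(), String> {
    if found == expected {
        Ok(())
    } else {
        Err(format!("expected version {expected}, found {found}"))
    }
}

/// Builds the pool once, then retries only the version check.
/// Returns the pool and the number of attempts it took, or `None` if the
/// simulated observations run out before the versions match.
fn wait_for_schema(
    versions: &[u64],
    expected: u64,
    created: &Cell<usize>,
) -> Option<(Pool, usize)> {
    // Constructed once, outside the retry loop, so a failed attempt
    // does not create and drop a fresh pool.
    let pool = Pool::new(created);
    for (attempt, &found) in versions.iter().enumerate() {
        match check_version(&pool, found, expected) {
            Ok(()) => return Some((pool, attempt + 1)),
            // Real code would back off here before the next attempt.
            Err(e) => eprintln!("attempt {} failed: {}", attempt + 1, e),
        }
    }
    None
}
```

With this shape, the number of pools created stays at one regardless of how many attempts fail before the migration is run.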

@askfongjojo askfongjojo added this to the 12 milestone Nov 27, 2024