Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow creating config for attached tenant #3446

Merged
merged 15 commits into from
Jan 27, 2023

Conversation

koivunej
Copy link
Member

@koivunej koivunej commented Jan 25, 2023

This might be a bit controversial change but I thought to write the code takes about as much as writing a proper issue.

Currently attach doesn't write a tenant config, because we don't back it up in the first place. The current implementation of Tenant::persist_tenant_config does not allow changing tenant's configuration through the http api which will fail because the file wasn't created on attach and OpenOptions::truncate(true).write(true).create_new(false) is used.

I think this patch allows for least controversial middle ground which enables changing tenant configuration even for attached tenants (not just created tenants).

I also removed the conditional fsync on the directory because it's now a hazard because the file might be created when !first_save.

See also:

Does not address:

Found while trying to reproduce #3387.

@koivunej koivunej requested review from a team as code owners January 25, 2023 15:47
@koivunej koivunej requested review from MMeent and hlinnaka and removed request for a team January 25, 2023 15:47
rationale: we dont currently create tenant config file on attach,
because we don't have the old one persisted. this change is perhaps less
controversial while the config file is not backed up.
the contents are not repeated, they are always "prepended" to the
TenantConf serialization form.
that way we don't have to introduce a TOCTOU and everything will be
simpler.
@koivunej koivunej force-pushed the allow_creating_config_for_attached_tenant branch from c172257 to 6ce381d Compare January 25, 2023 15:51
@koivunej
Copy link
Member Author

Unrelated regress(release):

  • FAILED test_runner/regress/test_tenant_size.py::test_get_tenant_size_with_multiple_branches

Copy link
Contributor

@SomeoneToIgnore SomeoneToIgnore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the config-related changes always somewhat confusing, since the only real source of truth and use cases now is the console (control plane later?).

Without their specs/demands/stories, all these changes are guessing.
The PR is not bad per se, but I cannot say if it makes any sense due to the above.

pageserver/src/tenant.rs Outdated Show resolved Hide resolved
pageserver/src/tenant.rs Outdated Show resolved Hide resolved
test_runner/regress/test_tenant_conf.py Outdated Show resolved Hide resolved
test_runner/regress/test_tenant_conf.py Outdated Show resolved Hide resolved
Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I've noticed that attach doesnt create config files, but didnt realise that config changes can be broken because of that. As an alternative we can always persist default config with attach. And as I see it attach can have tenant knobs as parameters so they're added to the config during the attach itself

I find the config-related changes always somewhat confusing, since the only real source of truth and use cases now is the console (control plane later?).

Without their specs/demands/stories, all these changes are guessing.
The PR is not bad per se, but I cannot say if it makes any sense due to the above.

I think this is not really relevant here. The PR fixes a bug which is observed when you call attach and then try to change config with usual config endpoint. I can confirm that this patch fixes the issue (verified locally)

pageserver/src/tenant.rs Outdated Show resolved Hide resolved
pageserver/src/tenant.rs Outdated Show resolved Hide resolved
precondition that we have a directory to fsync.
cannot use it with the virtualfile.
rationale: context needs only verb + subject -discussion.

copied the context from tenant creation to http api.
@koivunej
Copy link
Member Author

@SomeoneToIgnore and @LizardWizzard please peek again for the fixes, tried to make isolated commits.

Workflow failure is not related.

pageserver/src/tenant/mgr.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@LizardWizzard LizardWizzard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks! Two minor comments

pageserver/src/tenant.rs Outdated Show resolved Hide resolved
test_runner/regress/test_tenant_conf.py Outdated Show resolved Hide resolved
@koivunej koivunej enabled auto-merge (squash) January 27, 2023 12:41
@koivunej koivunej merged commit 0ec84e2 into main Jan 27, 2023
@koivunej koivunej deleted the allow_creating_config_for_attached_tenant branch January 27, 2023 13:35

tenant.update_tenant_config(tenant_conf);
let tenant_config_path = conf.tenant_config_path(tenant_id);
Tenant::persist_tenant_config(&tenant.tenant_id(), &tenant_config_path, tenant_conf, false)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're changing the in-memory config and then fail to persist it, we report an error to the client but still run with the new config. Is this intentional?

Copy link
Member Author

@koivunej koivunej Jan 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That part was not explicitly changed on this PR but yeah it seems quite suboptimal. The update_tenant_config does a merge all of the given values ... So actually the value we end up writing to the disk is not the active tenant config. Raah, I'll open an issue. Well, perhaps investigate a bit more before calling it :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying the added test in this PR to instead of changing gc_horizon twice to in different http PUT requests:

  1. set gc_horizon
  2. set pitr_interval

Will write to disk as last version:

# This file contains a specific per-tenant's config.
#  It is read in case of pageserver restart.

[tenant_config]
pitr_interval = "0s"

But adding logging to Tenant::update_tenant_config shows that the tenant config is first updated to:

2023-01-27T15:27:39.514582Z  INFO tenant_config{tenant=db6238f481f8459ce03705f1fa96ab42}: updated config TenantConfOpt {
    checkpoint_distance: None,
    checkpoint_timeout: None,
    compaction_target_size: None,
    compaction_period: None,
    compaction_threshold: None,
    gc_horizon: Some(
        1000000,
    ),
    gc_period: None,
    image_creation_threshold: None,
    pitr_interval: None,
    walreceiver_connect_timeout: None,
    lagging_wal_timeout: None,
    max_lsn_wal_lag: None,
    trace_read_requests: None,
}

Then by the second request:

2023-01-27T15:27:39.521233Z  INFO tenant_config{tenant=db6238f481f8459ce03705f1fa96ab42}: updated config TenantConfOpt {
    checkpoint_distance: None,
    checkpoint_timeout: None,
    compaction_target_size: None,
    compaction_period: None,
    compaction_threshold: None,
    gc_horizon: Some(
        1000000,
    ),
    gc_period: None,
    image_creation_threshold: None,
    pitr_interval: Some(
        0ns,
    ),
    walreceiver_connect_timeout: None,
    lagging_wal_timeout: None,
    max_lsn_wal_lag: None,
    trace_read_requests: None,
}

Creating a new issue, including your comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created two new issues, linked below.

vadim2404 pushed a commit that referenced this pull request Jan 31, 2023
Currently `attach` doesn't write a tenant config, because we don't back
it up in the first place. The current implementation of
`Tenant::persist_tenant_config` does not allow changing tenant's
configuration through the http api which will fail because the file
wasn't created on attach and
`OpenOptions::truncate(true).write(true).create_new(false)` is used.

I think this patch allows for least controversial middle ground which
*enables* changing tenant configuration even for attached tenants (not
just created tenants).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants