Prevent the bootstrap command from leaving root credentials unrecoverable #461

eric-maynard · 2024-11-21T03:28:11Z

Description

This alters the bootstrap command to require either explicit credentials or a new flag --print-credentials.
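The rule described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `BootstrapGuard`, the `Credentials` record, and the `check` method are hypothetical names, and only the `--print-credentials` flag comes from the description.

```java
import java.util.Optional;

/** Hypothetical sketch of the rule in the description; not the actual Polaris code. */
final class BootstrapGuard {

  /** Minimal stand-in for a client ID / client secret pair. */
  record Credentials(String clientId, String clientSecret) {}

  /**
   * Rejects a bootstrap request that would generate credentials nobody can ever read back.
   *
   * @param explicit credentials supplied by the user, if any
   * @param printCredentials whether --print-credentials was passed
   */
  static void check(Optional<Credentials> explicit, boolean printCredentials) {
    if (explicit.isEmpty() && !printCredentials) {
      throw new IllegalArgumentException(
          "Either supply root credentials or pass --print-credentials; "
              + "otherwise the generated credentials would be unrecoverable.");
    }
  }

  public static void main(String[] args) {
    // Accepted: the user already knows the credentials.
    check(Optional.of(new Credentials("client1", "s3cr3t")), false);
    // Accepted: credentials are generated, but they will be printed.
    check(Optional.empty(), true);
    // Rejected: generated and never shown.
    try {
      check(Optional.empty(), false);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The point of failing early is that the check runs before anything is written to the metastore, so a rejected invocation leaves nothing to clean up.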

Fixes #219

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
Documentation update
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

How Has This Been Tested?

Credentials are now printed during bootstrap when it's enabled:

realm: default-realm root principal credentials: 2b98107557bcce20:f74281319ac8519ef30cbced6563223b

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

eric-maynard · 2024-11-25T19:30:46Z

Hey @dimas-b, do you mind taking a look now that #422 has merged?

I think the integration is easy enough with some slight refactoring to PrincipalSecretsGenerator.

I kept the current behavior of using env variables even when printing is enabled: if that's what the user explicitly configures, we can respect it. In the worst case we are just echoing env variables.

polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

collado-mike · 2024-11-26T17:03:49Z

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

Similar here - why is the metastore aware of whether the secrets were provided by environment variables? What if there are other impls of secrets generators that don't rely on env variables? E.g., we could have one that calls AWS SecretsManager to dynamically generate and store the secrets without any env variables. Should this code throw an exception?
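One way to picture the separation being asked for here is sketched below, under loud assumptions: `SecretsSource`, `Secrets`, `fromLookup`, and the `EXAMPLE_..._CLIENT_ID` variable names are all hypothetical and are not the `PrincipalSecretsGenerator` API. The bootstrap path only sees an interface, so an env-backed source, a random source, or a secrets-manager-backed source would all look the same to it.

```java
import java.security.SecureRandom;
import java.util.HexFormat;
import java.util.function.Function;

/** Hypothetical sketch; none of these names are taken from Polaris. */
final class SecretsSourceSketch {

  record Secrets(String clientId, String clientSecret) {}

  /** The only thing the bootstrap path sees: something that yields secrets for a realm. */
  interface SecretsSource {
    Secrets secretsFor(String realm);
  }

  /** Generates random hex values (16 and 32 hex characters). */
  static SecretsSource random() {
    SecureRandom rng = new SecureRandom();
    return realm -> new Secrets(hex(rng, 8), hex(rng, 16));
  }

  /** Looks values up through a caller-supplied function such as System::getenv. */
  static SecretsSource fromLookup(Function<String, String> lookup) {
    return realm -> {
      // Variable names are made up for this example.
      String id = lookup.apply("EXAMPLE_" + realm + "_CLIENT_ID");
      String secret = lookup.apply("EXAMPLE_" + realm + "_CLIENT_SECRET");
      if (id == null || secret == null) {
        throw new IllegalStateException("No secrets configured for realm " + realm);
      }
      return new Secrets(id, secret);
    };
  }

  private static String hex(SecureRandom rng, int bytes) {
    byte[] buf = new byte[bytes];
    rng.nextBytes(buf);
    return HexFormat.of().formatHex(buf);
  }

  /** Stand-in for the bootstrap path: it never asks where the secrets came from. */
  static Secrets bootstrap(String realm, SecretsSource source) {
    return source.secretsFor(realm);
  }

  public static void main(String[] args) {
    Secrets generated = bootstrap("realm1", random());
    System.out.println("generated client id length: " + generated.clientId().length());
  }
}
```

Whether an unrecoverable-credentials check belongs in the source, the bootstrap path, or the CLI is exactly the open design question in this comment; the sketch only shows that the source of the secrets can stay opaque.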

...ore/src/main/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGenerator.java

...is-core/src/test/java/org/apache/polaris/core/persistence/PrincipalSecretsGeneratorTest.java

...src/test/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGeneratorTest.java

...e/polaris/extension/persistence/impl/eclipselink/PolarisEclipseLinkMetaStoreSessionImpl.java

flyrain · 2024-12-09T05:17:53Z

It seems odd that Polaris determines whether bootstrapping has failed based on a configuration controlling whether credentials are printed. IIUC, #438 removed plain text secrets from the metastore, meaning these secrets cannot be retrieved unless they are printed in the console. Would it be more reasonable to always print the credentials if they are generated by Polaris? This ensures the secrets remain accessible when needed without relying on an external configuration.

eric-maynard · 2024-12-09T06:17:18Z

It seems odd that Polaris determines whether bootstrapping has failed based on a configuration controlling whether credentials are printed.

The issue at hand is that currently credentials are unrecoverable after bootstrapping, which needs to be fixed ASAP.

IIUC, #438 removed plain text secrets from the metastore, meaning these secrets cannot be retrieved unless they are printed in the console. Would it be more reasonable to always print the credentials if they are generated by Polaris? This ensures the secrets remain accessible when needed without relying on an external configuration.

@collado-mike expressed concern about an approach like this some time ago. I think a configuration, or perhaps better a CLI argument to the bootstrap command, is a good compromise in that it allows a secure behavior by default (e.g. no secrets to stdout) but also gives people an "out" in case they want to use polaris-generated credentials with a metastore that doesn't support retrieving credentials.

This last point is also very important to consider: some metastore implementations could allow secrets to be retrieved, in which case it's okay to bootstrap without printing credentials. The problem is that after #438 EclipseLink does not allow this.

dimas-b · 2024-12-09T14:36:20Z

I think it is generally not a good idea to store retrievable secrets in the metastore. If we want that functionality it would probably be preferable to integrate with well-known secret managers (e.g. k8s secrets, cloud-specific secret managers, Vault, etc.).

snazy · 2024-12-09T15:09:20Z

I think it is generally not a good idea to store retrievable secrets in the metastore.

Completely agree with this. I'd even extend this to not putting any credentials into a server log at all: that information is not just "ephemeral on a console window", since logs can easily end up in 3rd-party systems, which would then make those clear-text credentials easily accessible.

dimas-b · 2025-01-23T20:05:38Z

@eric-maynard : GH still shows a conflict on this PR... Would you mind resolving it before the next review round?

eric-maynard · 2025-01-23T20:34:28Z

Bulk-resolved some comments as this is a rewrite. PTAL @dimas-b, @collado-mike, @flyrain

dimas-b

The PR description mentions fixing #450, but it does not look like it does anything about the JDBC schema... Is #450 still related? 🤔 If not, I think it is totally fine, I just wanted to make sure the description is accurate.

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java

dimas-b · 2025-01-24T01:08:38Z

quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java

While I agree that supporting well-formed JSON is valuable, I believe the value is realized mostly in automated use cases. This old option, however, is easier to use for humans, IMHO... Why not keep both options available for users to choose what is convenient for them?

For example: --json for JSON, and --client-id, --client-secret for plain strings?

This old option, however, is easier to use for humans, IMHO...

I really do not think that is the case for the current format.

As for supporting both the old and the new format (and maybe field-specific args like you suggest), I think we're best off just having one way to do the same thing here.

I did not mean the old "format", but the old approach of using plain text for ID/secret :) That's what I suggest: add new options (--client-id, --client-secret) to allow the old way of using plain text values to bootstrap.

... and keep the new --json option too, of course.

But with --client-id etc., how does multi-realm support work?

Since bootstrap is a one-time operation, I feel OK with a "heavier" invocation so long as it is very human-readable.

I'm suggesting to delegate the choice of which way to go to the user.

Option 1:
bootstrap --realm r1 --client-id c1 --client-secret s1
[... do something else ...]
bootstrap --realm r2 --client-id c2 --client-secret s2
etc....

Option 2:
bootstrap --json '{...}' - here all realm IDs and client ID/secret pairs are in the JSON.

Just to clarify: Option 1 only bootstraps one realm at a time, correct? I think I can live with that.

FYI, I'm working on #878... I'm leaning towards a file in the following syntax:

{
  "credentials" : {
    "realm1" : {
      "client-id" : "client1",
      "client-secret" : "secret1"
    },
    "realm2" : {
      "client-id" : "client2",
      "client-secret" : "secret2"
    }
  }
}

YAML will also be supported:

credentials:
  realm1:
    client-id: client1
    client-secret: secret1
  realm2:
    client-id: client2
    client-secret: secret2

One last comment, sorry... I think the admin tool should require the user to enter credentials for each realm to bootstrap. In real life, it makes little sense to bootstrap with random credentials. Wdyt? At least, the file syntax for now requires the credentials to be specified.

bootstrap --json feels a little unclear to me; e.g. should the --print-credentials option also go inside that JSON? It seems to imply that we are bootstrapping in JSON "mode" or providing all options into a JSON. While in fact, the JSON is just a set of credentials.

Since we call it credentials, I think having a second way to do the exact same thing could be confusing, i.e. having both a credentials option and a client-id option.

--print-credentials is just a CLI option. I do not think it should be inside JSON.

As for --json being unclear, how about:

Option 1: bootstrap --realm ... [ --client-id ... --client-secret ...] (as above)
Option 2: bootstrap --realm-json <json>, where <json> is from @adutra's example
Option 3: bootstrap --realm-yaml ... - same as JSON, but in YAML format

quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java

dimas-b · 2025-01-24T01:14:17Z

quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java

This option implies that credentials are generated, right? Could you mention that in the description?

Also, could you add a test case for this to BootstrapCommandTest?

Currently, it also prints user-provided credentials

Right, but I guess the intended use case is to print generated credentials (mostly). It would be nice to inform users about that. It may not be obvious to all users that Polaris will generate client ID/secret if they are not explicitly provided.

It may not be obvious to all users that Polaris will generate client ID/secret if they are not explicitly provided.

Oh, this is a good point. I'll add a note on this, though I'm not sure the print-credentials flag is the right place for that.

Ultimately, I want to keep the behavior of the flags "simple", so having print-credentials just "print credentials" with no qualifiers appeals to me. Also, FWIW, it does not print out exactly the same clientId:clientSecret tuple you pass in. See the note in BootstrapCommandTest.

eric-maynard · 2025-01-27T18:41:27Z

quarkus/admin/src/test/java/org/apache/polaris/admintool/BootstrapCommandTest.java

This looks like:

realm: realm1 root principal credentials: root:c15196133e4348d64c7f478dca57e99e

snazy · 2025-01-27T19:15:51Z

quarkus/admin/src/test/java/org/apache/polaris/admintool/BootstrapCommandTest.java

I don't think users will get the JSON syntax on the command line right in every circumstance. The previous approach at least allowed putting quotes around arguments, if even necessary, without the need to think about JSON syntax.

Automated tooling would have to deal with escaping of special characters.
" and \ and * are possible in secrets - but escaping those for JSON and then for the command line is quite tricky to get right.

The new JSON format exactly matches the format used in the -f argument, so hopefully that makes this less treacherous for users
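The double-escaping concern raised above can be made concrete with a small sketch. It assumes a POSIX shell and the credentials JSON shape quoted earlier in this thread; `EscapingSketch`, `jsonEscape`, and `shellQuote` are hypothetical helpers written for illustration, not part of the admin tool.

```java
/** Hypothetical helpers showing the two escaping layers; not part of Polaris. */
final class EscapingSketch {

  /** Escapes a string for use inside a JSON string literal. */
  static String jsonEscape(String s) {
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      switch (c) {
        case '"' -> out.append("\\\"");
        case '\\' -> out.append("\\\\");
        case '\n' -> out.append("\\n");
        case '\r' -> out.append("\\r");
        case '\t' -> out.append("\\t");
        default -> {
          if (c < 0x20) {
            // Remaining control characters use the four-digit hex escape form.
            out.append(String.format("\\u%04x", (int) c));
          } else {
            out.append(c);
          }
        }
      }
    }
    return out.toString();
  }

  /** Wraps a string in single quotes for a POSIX shell; an embedded ' becomes '\''. */
  static String shellQuote(String s) {
    return "'" + s.replace("'", "'\\''") + "'";
  }

  public static void main(String[] args) {
    // A secret containing a double quote, a backslash, and an asterisk.
    String secret = "p\"a\\ss*";
    String json =
        "{\"credentials\":{\"realm1\":{\"client-id\":\"client1\",\"client-secret\":\""
            + jsonEscape(secret)
            + "\"}}}";
    // Layer 1 (JSON) is applied first, then layer 2 (shell) on the whole document.
    System.out.println(shellQuote(json));
  }
}
```

Inside single quotes the shell leaves `*`, `"`, and `\` alone, so only the JSON layer has to escape the latter two; the hard cases are secrets that themselves contain a single quote, and callers that use double quotes instead.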

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java

quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java

eric-maynard · 2025-02-18T01:58:02Z

@adutra is it correct that after #605 we no longer have the ability to bootstrap to the default realm like before?

That is, realm always must be specified now?

adutra · 2025-02-18T10:31:17Z

@adutra is it correct that after #605 we no longer have the ability to bootstrap to the default realm like before?

That is, realm always must be specified now?

As of today:

1. If using the in-memory metastore, all realms listed in polaris.realm-context.realms are bootstrapped automatically.
   a. The environment variable POLARIS_BOOTSTRAP_CREDENTIALS or the system property polaris.bootstrap.credentials can be used to provide root credentials.
2. If using other metastores, no realm is automatically bootstrapped.
   a. To bootstrap realms in this case, the only option right now is to use the admin tool.
   b. With the admin tool, root credentials can be provided as command arguments, or in a json/yaml file.
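For the in-memory case, looking up the two configuration sources named above might look like the sketch below. Only the names POLARIS_BOOTSTRAP_CREDENTIALS and polaris.bootstrap.credentials come from this comment; the class `BootstrapCredentialsLookup`, the `resolve` method, and in particular the precedence (system property first, then environment variable) are assumptions. The value is returned unparsed because its format is not described here.

```java
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical lookup sketch; the precedence order is an assumption, not documented behavior. */
final class BootstrapCredentialsLookup {

  static final String SYSTEM_PROPERTY = "polaris.bootstrap.credentials";
  static final String ENV_VARIABLE = "POLARIS_BOOTSTRAP_CREDENTIALS";

  /**
   * Returns the raw configured value, if any.
   *
   * @param sysProps lookup for system properties, e.g. System::getProperty
   * @param env lookup for environment variables, e.g. System::getenv
   */
  static Optional<String> resolve(Function<String, String> sysProps, Function<String, String> env) {
    // Assumed precedence: the system property wins over the environment variable.
    String fromProperty = sysProps.apply(SYSTEM_PROPERTY);
    if (fromProperty != null && !fromProperty.isBlank()) {
      return Optional.of(fromProperty);
    }
    String fromEnv = env.apply(ENV_VARIABLE);
    if (fromEnv != null && !fromEnv.isBlank()) {
      return Optional.of(fromEnv);
    }
    // Nothing configured: the caller has to generate credentials or fail.
    return Optional.empty();
  }

  public static void main(String[] args) {
    Optional<String> raw = resolve(System::getProperty, System::getenv);
    System.out.println(raw.isPresent() ? "root credentials configured" : "no root credentials configured");
  }
}
```

Passing the lookups in as functions keeps the sketch testable without touching the real process environment.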

eric-maynard · 2025-02-18T18:47:22Z

@adutra so on this note:

a. To bootstrap realms in this case, the only option right now is to use the admin tool.
b. With the admin tool, root credentials can be provided as command arguments

This was true before (insofar as you call the Dropwizard bootstrap command the "admin tool"), but you could also just call bootstrap with no args. Where did that behavior go? Is it gone? Should it be gone?

dimas-b · 2025-02-21T04:49:21Z

quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java

What is the reason for removing the option for the user to specify plain text credentials as command line args?

Refer to my comments and -1 on #633 for more context, but essentially I don't think this format captures the full range of ways you might wish to bootstrap. I also don't think we really want two syntaxes for bootstrapping.

With that being said, I'd like to understand why the pre-Quarkus bootstrap behavior is now gone, since being able to bootstrap a realm without specifying credentials is a core part of why this heavier syntax makes sense.

I believe the existing -c option is adequate for bootstrapping current Polaris Server and it is convenient for users.

I think we should keep it, but I'm open to altering its format as long as it is human-writable in shells without too much quoting.

dimas-b · 2025-02-21T04:53:34Z

...ris-core/src/main/java/org/apache/polaris/core/persistence/bootstrap/RootCredentialsSet.java

fromUrl() is already able to parse this kind of JSON... Why add a new parsing method? Why not reuse that code?

Agreed, but I wanted to showcase the behavior around principal that was discussed above; it's not clear to me how to add that logic into fromUrl. If there's an easy way, happy to just make that change now

How about extracting a private shared method from fromUrl that would take a JsonFactory and an InputStream?

While I agree that a shared method would be ideal, I don't see how we can keep this principal-check logic in fromJson but remove it from fromUrl while sharing a meaningful amount of code across the two methods. If it's okay with you though, I'm happy just to push this logic down into fromUrl and then share the code that way.

Otherwise, we can also remove this check altogether if you think it shouldn't be there, or we can keep the divergent parsing logic. I don't have a strong opinion here.

polaris-core/src/main/java/org/apache/polaris/core/entity/PolarisEntityConstants.java

adutra · 2025-02-25T13:40:41Z

bootstrap with no args. Where did that behavior go? Is it gone? Should it be gone?

I missed that, I confess. What did the bootstrap command without args use to do? Bootstrap the configured default realm with random credentials?

If so, we might have a small problem because the notion of "default realm" is tied to the actual implementation of RealmContextResolver. If someone is using a different resolver, there might not be any "default realm".

eric-maynard · 2025-02-25T20:49:24Z

I missed that, I confess. What did the bootstrap command without args use to do? Bootstrap the configured default realm with random credentials?

Yeah that is exactly right. I suppose if there is no default realm, failing to bootstrap with no realm might be correct.

The silver lining is, with this behavior gone this PR is less urgent. It is still possible to bootstrap a realm with random credentials though, and we should still make sure those credentials are either printed or recoverable.

eric-maynard · 2025-02-28T22:21:22Z

For now, let me refactor out the controversial syntax changes, so we can look at just the changes necessary to prevent bricking the metastore.

dimas-b

LGTM 👍

dimas-b · 2025-02-28T23:00:37Z

quarkus/admin/src/test/java/org/apache/polaris/admintool/BootstrapCommandTest.java

+      value = {"bootstrap", "-r", "realm1", "-c", "realm1,client1d,s3cr3t", "--print-credentials"})
+  public void testPrintCredentials(LaunchResult result) {
+    assertThat(result.getOutput()).contains("Bootstrap completed successfully.");
+    assertThat(result.getOutput()).contains("realm: realm1 root principal credentials: client1d:");


Why not assert that s3cr3t is printed?

It's actually not printed -- maybe it should be. Because of #801 the secret supplied becomes the secondary secret while the primary secret is printed.

I would rather we just solve #801 rather than add special logic around printing the secondary secret here

TIL. I think this PR is good to merge then, and we'll follow-up on #801

collado-mike

I love this. It's so simple


eric-maynard mentioned this pull request Nov 21, 2024

Support providing root client ID via env. variables when bootstrapping #422

Merged

dimas-b reviewed Nov 21, 2024

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

dimas-b reviewed Nov 21, 2024

...e/src/main/java/org/apache/polaris/core/persistence/LocalPolarisMetaStoreManagerFactory.java

eric-maynard requested a review from dimas-b November 25, 2024 19:31

eric-maynard marked this pull request as ready for review November 25, 2024 19:31

eric-maynard requested review from RussellSpitzer, adutra, ashvina, collado-mike, ebyhr, flyrain, jackye1995, jbonofre, snazy, takidau and vvcephei as code owners November 25, 2024 19:31

dimas-b reviewed Nov 25, 2024

polaris-core/src/main/java/org/apache/polaris/core/persistence/PrincipalSecretsGenerator.java

collado-mike reviewed Nov 26, 2024

dimas-b reviewed Nov 26, 2024

...ore/src/main/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGenerator.java

dimas-b reviewed Nov 26, 2024

...ore/src/main/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGenerator.java

dimas-b reviewed Nov 27, 2024

...is-core/src/test/java/org/apache/polaris/core/persistence/PrincipalSecretsGeneratorTest.java

dimas-b reviewed Nov 27, 2024

...src/test/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGeneratorTest.java

...src/test/java/org/apache/polaris/core/persistence/secrets/PrincipalSecretsGeneratorTest.java

dimas-b approved these changes Nov 27, 2024

flyrain reviewed Dec 6, 2024

...e/polaris/extension/persistence/impl/eclipselink/PolarisEclipseLinkMetaStoreSessionImpl.java

eric-maynard requested a review from flyrain December 6, 2024 21:33

eric-maynard force-pushed the no-brick-metastore branch from df1d642 to 6b2486e Compare January 23, 2025 18:55

eric-maynard requested a review from dennishuo as a code owner January 23, 2025 18:55

eric-maynard changed the title ~~Add a flag to control whether credentials are printed during bootstrapping~~ Prevent the bootstrap command from leaving root credentials unrecoverable Jan 23, 2025

eric-maynard requested a review from dimas-b January 23, 2025 20:37

dimas-b reviewed Jan 24, 2025

eric-maynard requested a review from MonkeyCanCode as a code owner January 27, 2025 17:58

eric-maynard commented Jan 27, 2025

snazy reviewed Jan 27, 2025

dimas-b reviewed Jan 27, 2025

polaris-core/src/main/java/org/apache/polaris/core/persistence/PolarisCredentialsBootstrap.java

dimas-b mentioned this pull request Feb 4, 2025

Ability to read root credentials from file when bootstrapping #940

Merged

sfc-gh-ygu reviewed Feb 17, 2025

quarkus/admin/src/main/java/org/apache/polaris/admintool/BootstrapCommand.java

dimas-b self-requested a review February 21, 2025 04:42

dimas-b reviewed Feb 21, 2025

eric-maynard added 2 commits February 28, 2025 14:24

rebase

af50be5

autolint

e9e5305

eric-maynard force-pushed the no-brick-metastore branch from 1ea2e90 to e9e5305 Compare February 28, 2025 22:27

dimas-b approved these changes Feb 28, 2025

github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Feb 28, 2025

collado-mike approved these changes Mar 3, 2025

eric-maynard merged commit 1a03108 into apache:main Mar 3, 2025
5 checks passed

github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Mar 3, 2025

gh-yzou pushed a commit to gh-yzou/polaris that referenced this pull request Mar 6, 2025

Prevent the bootstrap command from leaving root credentials unrecover…

53673f0

…able (apache#461) * rebase * autolint
