Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove secrets from config yaml and terraform configuration files #720

Merged
merged 12 commits into from
Jul 9, 2021

Conversation

leej3
Copy link
Contributor

@leej3 leej3 commented Jul 5, 2021

Resolves #13.

Proposal for managing secrets... this approach:

  • Makes use of the fact that terraform variables can be set by setting an environment variable with the same name but prefixed with TF_VAR_
  • Could be used for all secrets/tokens.
  • Makes it harder for secrets to leak as they are never stored as plain text.
  • Does not expose secrets during a render (I need to confirm this, I may have to define the variable as sensitive in variables.tf).
  • Can serve as an interim prior to more thorough managment of secrets using something like vault/keycloak.
  • In some cases will require the removal of some values from the schema. e.g. client_id and client_key for google oauth.

@leej3 leej3 requested review from aktech and costrouc July 5, 2021 17:19
@leej3 leej3 changed the title WIP: remove secrets from config yaml and terraform state files WIP: remove secrets from config yaml and terraform configuration files Jul 6, 2021
@leej3
Copy link
Contributor Author

leej3 commented Jul 6, 2021

Further thoughts on this with respect to breakage of schema compatibility. Currently the schema sets client_id/client_key for a number of different auth use-cases.

With the current approach we would have to remove these from the schema (breaking backwards compatibility) and in each case document the env variable that needs to be set in order to define the appropriate terraform variable.

We could alternatively use a convention whereby the current fields are filled with a placeholder value instead of the current key. For example:

security:
  authentication:
    type: custom
    authentication_class: "oauthenticator.google.GoogleOAuthenticator"
    config:
      oauth_callback_url: 'https://qhub.ouraes.com/hub/oauth_callback'
      client_id: QHUB_SECRET_google_oauth_client_id
      client_secret: QHUB_SECRET_google_oauth_client_secret
...

The above convention would trigger a search for the env variables google_oauth_client_id and google_oath_client_secret and would use them to set TF_VAR_google_oauth_client_id etc which then populates the corresponding terraform variables. This approach prevents schema breakage and could likely be backwards compatible with the current implementation of storing secrets (if we wish to continue to support that). It seems a hacky though...

For now I will continue to implement this with schema modification...

@leej3
Copy link
Contributor Author

leej3 commented Jul 7, 2021

So @costrouc pointed out that it is not just .tf files that the secrets are exposed in during the render. Sometimes it is put in other files that won't work as cleanly as we had envisaged. So for example, the auth client_id and client_key are leaked via the jupyterhub config:

data$ grep -r oops_secret .
./qhub-config.yaml:      client_id: oops_secret
./qhub-config.yaml:      client_secret: oops_secret
./infrastructure/jupyterhub.yaml:      c.GitHubOAuthenticator.client_id = "oops_secret"
./infrastructure/jupyterhub.yaml:      c.GitHubOAuthenticator.client_secret = "oops_secret"

Until we implement something more systematic for secrets management it seems like avoiding rendering these values is too challenging and if security is the imperative on a CI provider based deployment the best option is to simply not render.

@aktech
Copy link
Member

aktech commented Jul 7, 2021

So @costrouc pointed out that it is not just .tf files that the secrets are exposed in during the render. Sometimes it is put in other files that won't work as cleanly as we had envisaged. So for example, the auth client_id and client_key are leaked via the jupyterhub config:

That's true, that those SECRETS are engrained in the files. My thought was to have placeholders in those files and and replace them in terraform during deployment via terraform variables, in here:
https://github.com/Quansight/qhub/blob/08c94b91428f68d47270ba003a7461c45ef64f48/qhub/template/%7B%7B%20cookiecutter.repo_directory%20%7D%7D/infrastructure/modules/kubernetes/services/meta/qhub/main.tf#L12-L13

@leej3 leej3 force-pushed the manage_secrets_better branch from 6487857 to 56a11c4 Compare July 8, 2021 13:31
@leej3 leej3 changed the title WIP: remove secrets from config yaml and terraform configuration files remove secrets from config yaml and terraform configuration files Jul 8, 2021
@leej3
Copy link
Contributor Author

leej3 commented Jul 8, 2021

This implements both approaches based on the two use-cases:

  • The user wishes to remove secrets from the config yaml. This can be done for any field using the QHUB_SECRET_ pattern. These values will often continue to be rendered but if the user turns off rendering it goes a long way to protecting the values.
  • Due to the way the Prefect integration is implemented the required secret can be provided to terraform through the TF_VAR_prefect_token environment variable. This variable is now explicitly checked for. Overall this is a nice pattern and should be mimicked for future integrations.

@aktech's suggested is a nice addition to this PR.. .namely, we implement a extra bit of logic to replace the QHUB_SECRET placeholder after rendering (so we don't leak secrets there) but before the relevant file is provided to terraform. This may have to be implemented on a case by case basis because the relevant values might be in any file... or perhaps we could search every file for the placeholder patterns...

Copy link
Member

@aktech aktech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have mostly pointed out some nitpicks, overall looks good to me. I will take another look if you have time to address the comment here

I can then review get_secret_config_entries and it's helper functions more closely.

qhub/deploy.py Outdated Show resolved Hide resolved
qhub/deploy.py Outdated Show resolved Hide resolved
qhub/deploy.py Outdated Show resolved Hide resolved
qhub/render/__init__.py Outdated Show resolved Hide resolved
qhub/render/__init__.py Outdated Show resolved Hide resolved
tests/test_deploy.py Show resolved Hide resolved
@leej3
Copy link
Contributor Author

leej3 commented Jul 9, 2021

Sorry about the style issues. Fixed now. Tests for the get_secret_config_entries function are in tests/test_render.py or did you mean that I should add more?

@aktech
Copy link
Member

aktech commented Jul 9, 2021

Sorry about the style issues. Fixed now.

Awesome, thanks!

Tests for the get_secret_config_entries function are in tests/test_render.py or did you mean that I should add more?

Yep, I missed it apologies. Taking a look now.

@leej3 leej3 force-pushed the manage_secrets_better branch from 8e6ec1e to 9a56ad9 Compare July 9, 2021 13:24
@aktech
Copy link
Member

aktech commented Jul 9, 2021

This looks great now. One last thing, can you add some documentation, maybe here.

@aktech
Copy link
Member

aktech commented Jul 9, 2021

Looks great, thanks for working on it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a method to use secrets in QHub config
2 participants