Investigate whether using Helm directly (instead of using it via Terraform) would be beneficial #2441

Closed
3 tasks
marcelovilla opened this issue May 2, 2024 · 1 comment

Comments

@marcelovilla
Member

marcelovilla commented May 2, 2024

Context

Currently, Terraform is used not only to manage the infrastructure Nebari needs to run, but also to manage several services (e.g., argo, grafana, jupyterhub, and loki) via Helm. Other services are also managed with Terraform but without the Helm provider (e.g., conda-store and dask-gateway).

While there has been a lot of work put into this approach, I believe having a complex multi-stage Terraform configuration might have the following downsides:

  • Dependencies between stages require passing variables to each other, adding complex logic behind the deployment/destruction process.
  • Mixing infrastructure and service configuration in Terraform makes it hard to decouple the two, which hinders approaches where they should run separately (e.g., redeploying services in an existing cluster to speed up CI feedback).
  • Keeping service configuration within Terraform forces resource destruction that adds overhead and might not be necessary when tearing down a Nebari cluster. After all, the service configuration is no longer relevant once the actual infrastructure gets destroyed.
  • As we move forward and consider incorporating specific use-case extensions (or spins), the current Terraform configuration will only grow more complex.

Value and/or benefit

Using Helm directly to deploy and configure services, and having Terraform manage strictly the infrastructure for a Nebari cluster, can simplify our current deployment/destruction process. At this point this is just a hypothesis, but I believe it would allow for a better developer experience and a more sustainable code base in the long term.
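
To make the idea concrete, here is a minimal sketch of what a service deployment step decoupled from Terraform might look like. It is not Nebari's implementation: the `HelmRelease` class, the `helm_upgrade_command` and `deploy` helpers, and the chart and namespace names in the usage note are all hypothetical. The only real interface it relies on is the Helm CLI's `helm upgrade --install` command with its `--namespace`, `--create-namespace`, and `--values` flags.

```python
import subprocess
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class HelmRelease:
    """Hypothetical description of one service deployed with Helm."""

    name: str
    chart: str
    namespace: str
    values_file: Optional[str] = None


def helm_upgrade_command(release: HelmRelease) -> List[str]:
    """Build the `helm upgrade --install` argument list for one release."""
    cmd = [
        "helm", "upgrade", "--install",
        release.name, release.chart,
        "--namespace", release.namespace,
        "--create-namespace",
    ]
    if release.values_file:
        cmd += ["--values", release.values_file]
    return cmd


def deploy(releases: List[HelmRelease], dry_run: bool = True) -> List[List[str]]:
    """Return the commands for all releases; run them only when dry_run is False."""
    commands = [helm_upgrade_command(r) for r in releases]
    if not dry_run:
        for cmd in commands:
            # Assumes the cluster already exists (e.g., created by Terraform)
            # and that the current kubeconfig context points at it.
            subprocess.run(cmd, check=True)
    return commands
```

For example, `deploy([HelmRelease("jupyterhub", "jupyterhub/jupyterhub", "dev")])` returns the command list without touching a cluster, which would let CI redeploy services against an existing cluster without re-running the infrastructure stages.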

Anything else?

There are a couple of tasks that can help us get a more informed opinion about this approach:

  • Review which services rely on Helm
  • Create a POC migrating one of those services to use Helm directly
  • Discuss or outline what the approach for other services that don't currently use Helm might look like (if different from just keeping the current Terraform configuration).
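
As a possible starting point for the first task, the sketch below scans Terraform sources for `helm_release` resources, which is the resource type the Terraform Helm provider uses. The function names and the directory layout are assumptions for illustration, and the regular expression is a rough heuristic, not a full HCL parser.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches declarations such as: resource "helm_release" "grafana" {
HELM_RELEASE_RE = re.compile(r'resource\s+"helm_release"\s+"([^"]+)"')


def find_helm_releases(tf_sources: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each file path to the helm_release resource names declared in it."""
    found: Dict[str, List[str]] = {}
    for path, text in tf_sources.items():
        names = HELM_RELEASE_RE.findall(text)
        if names:
            found[path] = names
    return found


def load_tf_sources(root: str) -> Dict[str, str]:
    """Read every .tf file under root (hypothetical location of the stages)."""
    return {str(p): p.read_text() for p in Path(root).rglob("*.tf")}
```

Calling `find_helm_releases(load_tf_sources(path_to_stages))` on a checkout would give a first list of candidates for the POC. Services managed through other Terraform resources would not show up and would need a separate review.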
@dcmcand
Contributor

dcmcand commented Aug 29, 2024

Closed since we have implemented this in PR #2609

2 participants