docs: Draft docs for self-hosted (WIP) #30491
Every single time I want to mark as Draft ... I fat-finger it to PR. Drat.
1. Set up the Materialize operator Helm repository.

   a. <red>TBD whether this is needed</red>. Add Helm to install charts that are
   hosted in the Materialize operator Helm repository:
I wasn't sure if we'd be having people add materialize or we expect people to just have the repo so that they can just install.
right now the helm chart is just in the materialize repo... at some point soon we'll host it on a separate endpoint. for now I think we just tell people to download the materialize repo, check out a specific tag, and add the local path
This command removes all the Kubernetes components associated with the chart and
deletes the release.

## Deploying Materialize Environments
Have no idea what is meant by "deploy a Materialize environment"
good point -- we should scrub "environment" from the docs. that's what we use internally to describe an individual customer region within our cloud product, but is a pretty overloaded term
with the Helm chart / in public facing docs, I think we can just talk about "Deploying Materialize" and the "Materialize CR" (custom resource)
parameters:
  - parameter: clusterd.nodeSelector
    description: |
      <red>Replace with description content here.</red>
Will need devs to:
- Specify which of these parameters should be user-facing.
- For those that are user-facing, update the descriptions.
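As a rough illustration of what a user-facing override could look like for the `clusterd.nodeSelector` parameter quoted above, here is a minimal values-file sketch. The file name and the node label are hypothetical placeholders, and whether this parameter ends up user-facing is exactly the open question here.

```yaml
# values-override.yaml (hypothetical file name)
# Sketch only: the dotted parameter clusterd.nodeSelector expands to
# nested keys in a Helm values file.
clusterd:
  nodeSelector:
    # Hypothetical node label; replace with a label that exists on your nodes.
    example.com/node-pool: materialize
```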
feea0d6 to d82a1c0
title: "Materialize Kubernetes Operator"
description: ""

---
We could also make this an overview page.
And move the content to an Install on AWS page (and possibly an Install locally on kind -- since that's how I'm running through the steps anyhow).
But, put the PR up quickly, so that at least you can get an idea about the content.
You can configure the Materialize operator chart. For example:

- **RBAC**
Below where I have the ## Parameters section, I could section off the parameters ... so that we don't need to "For example" here.
That is, in the parameters section, I could create subsections
## Parameters
### Network policies
### Observability
### RBAC
@@ -0,0 +1,242 @@
---
title: "Materialize Kubernetes Operator"
@@ -0,0 +1,61 @@
---
title: "Materialize Operator Configuration"
@@ -0,0 +1,21 @@
---
title: "Troubleshooting"
This tutorial uses `kubectl`. To install, refer to the [`kubectl` documentation](https://kubernetes.io/docs/tasks/tools/).

### Kubernetes Storage Configuration
just noting that this is lacking context. we should have an explainer on why local storage is valuable (spilling to disk, operating on datasets larger than main memory, more graceful degradation rather than OOMing), and also note that this is optional (though highly recommended)
requestRollout: 22222222-2222-2222-2222-222222222222
forceRollout: 33333333-3333-3333-3333-333333333333
inPlaceRollout: false
backendSecretName: materialize-backend
this is how Materialize gets the connection string to talk to its metadata database (postgres or crdb)... which makes me realize we have no mention of that whole set up here :)
noting that we'll need sections about setting up blob storage + metadata database
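To make the relationship explicit, here is a minimal sketch of how the quoted fields could sit inside a Materialize custom resource, with `backendSecretName` pointing at the Secret that carries the metadata database connection string. The `apiVersion`, `kind`, and resource name are assumptions for illustration and are not confirmed in this thread; other fields the CR may require are omitted.

```yaml
# Sketch only. apiVersion, kind, and metadata.name are assumed for illustration.
apiVersion: materialize.cloud/v1alpha1
kind: Materialize
metadata:
  name: example-materialize          # hypothetical name
  namespace: materialize-environment
spec:
  requestRollout: 22222222-2222-2222-2222-222222222222
  forceRollout: 33333333-3333-3333-3333-333333333333
  inPlaceRollout: false
  # Must match metadata.name of the Secret that holds the backend connection string.
  backendSecretName: materialize-backend
```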
Thanks for getting this off the ground @kay-kim!
d82a1c0 to fa0f19e
kubectl apply -f misc/helm-charts/testing/minio.yaml
```

1. Install the following metrics service:
I realize that the testing/readme.md says we need this, but this errors for me on my mac.
metrics-server should be optional... it's needed for some system table metrics to work, but not blocking to testing materialize
service/mzfhj38ptdjs-console NodePort 10.96.97.5 <none> 9000:30847/TCP
```

1. Forward the Materialize console service to your local machine:
I had to forward on my mac.
We discussed this some more on the Cloud team, and likely will keep forwarding as the solution here, and generally let users decide what ingress strategy they want
fa0f19e to 10a65be
  name: materialize-backend
  namespace: materialize-environment
stringData:
  metadata_backend_url: "postgres://materialize_user:materialize_pass@postgres.materialize.svc.cluster.local:5432/materialize_db?sslmode=disable"
Wasn't sure if we think we'd have people use postgres as the metadata db (at least in the beginning).
If not ... then, we can make this more general.
(ditto for the blob storage)
👍 this seems reasonable. we could add a comment that these params match if using the sample yamls
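For reference, a minimal sketch of the complete Secret this snippet belongs to, assuming the sample Postgres deployment (user, password, host, and database name as quoted above). `apiVersion: v1` and `kind: Secret` are standard Kubernetes; any blob storage entry is left out because its key is not shown in this thread.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: materialize-backend
  namespace: materialize-environment
stringData:
  # Format: postgres://<user>:<password>@<host>:<port>/<database>?sslmode=<mode>
  # These values match the sample YAML files; change them for a real deployment.
  metadata_backend_url: "postgres://materialize_user:materialize_pass@postgres.materialize.svc.cluster.local:5432/materialize_db?sslmode=disable"
```

Note that the URL must be a single unbroken string with no embedded spaces.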
## Deploying Materialize

### Set up the metadata database
Let me know if you want me to stub this with our testing yaml file.
Will add generic content here tomorrow. But wanted to push up quickly the information architecture changes along with the self-hosted -> self-managed nomenclature changes.
### Set up blob storage
Let me know if you want me to stub this with our testing yaml file.
seems fine to point to our yaml files for setting up basic metadata db/blob storage
ohoh, now I understand this is on the generic page. hm... maybe we can add an explainer about what's needed for each of these (postgres or cockroach for metadata + s3-compatible blob storage)? and have the more specific install in kind, install on aws pages fill in with more detailed recommendations
apiVersion: v1
kind: Namespace
metadata:
  name: materialize-environment
I know we had the discussion about scrubbing "Environment" -- not sure if we actually want to do that with our namespace and such.
I vote we do... IMO it's worth removing the term entirely. From the discussion earlier this week, we can just reference "Materialize" or the Materialize custom resource when possible (and omit "environment"), and "Materialize instance" when we must refer to the resources associated with a specific Materialize CR.
Since we already have a materialize namespace for the operator, is there a preference for this?
---
title: "Appendix: Install locally on kind"
description: ""
---
kind: InitConfiguration
nodeRegistration:
  kubeletExtraArgs:
    node-labels: "materialize.cloud/disk=true,workload=materialize-instance"
these shouldn't be needed now
929ef2a to 1571ceb
Draft. Just placing here to facilitate some specific questions as I work on this.
Added patch for general information architecture changes + updated self-hosted -> self-managed:
Since not part of the left-hand nav: