add chart (#1)

Signed-off-by: eliyahunoach77 <root@linux-eliyahun.iguaz.io>
Co-authored-by: eliyahunoach77 <root@linux-eliyahun.iguaz.io>
mlrun · Aug 16, 2022 · 4de89b2
1 parent ce5fa9c
commit 4de89b2
Showing 82 changed files with 4,401 additions and 0 deletions.
diff --git a/README.md b/README.md
@@ -0,0 +1,9 @@
+# Helm Charts
+Helm Charts for V3IO Components
+
+# Usage
+
+```
+$ helm repo add mlrunce-stable https://mlrun.github.io/ce/helm-charts/stable
+
+```
diff --git a/charts/mlrun-ce/Chart.yaml b/charts/mlrun-ce/Chart.yaml
@@ -0,0 +1,12 @@
+apiVersion: v1
+version: 0.0.1
+name: mlrun-ce
+description: MLRun Open Source Stack
+home: https://iguazio.com
+icon: https://www.iguazio.com/wp-content/uploads/2019/10/Iguazio-Logo.png
+sources: []
+maintainers:
+  - name: Adam Melnick
+    email: adamm@iguazio.com
+  - name: Eliyahu Noach
+    email: eliyahun@iguazio.com
diff --git a/charts/mlrun-ce/README.md b/charts/mlrun-ce/README.md
@@ -0,0 +1,160 @@
+# MLRun CE: MLRun Open Source CE for MLOps
+
+This Helm chart bundles an open source software stack for advanced ML operations.
+
+## Chart Details
+
+The open source MLRun CE chart includes the following stack:
+
+* Nuclio - https://github.com/nuclio/nuclio
+* MLRun - https://github.com/mlrun/mlrun
+* Jupyter - https://github.com/jupyter/notebook (+MLRun integrated)
+* MPI Operator - https://github.com/kubeflow/mpi-operator
+* Minio - https://github.com/minio/minio/tree/master/helm/minio
+* Spark Operator - https://github.com/GoogleCloudPlatform/spark-on-k8s-operator
+* Pipelines - https://github.com/kubeflow/pipelines
+* Prometheus stack - https://github.com/prometheus-community/helm-charts
+
+## Prerequisites
+
+- Helm >=3.6 installed from [here](https://helm.sh/docs/intro/install/)
+
+- Preprovisioned Kubernetes StorageClass
+
+> If your Kubernetes distribution does not ship with a default StorageClass, you can use [local-path by Rancher](https://github.com/rancher/local-path-provisioner):
+> 1. Install it via [this link](https://github.com/rancher/local-path-provisioner#installation)  
+> 2. Set as default by executing `kubectl patch storageclass local-path -p '{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'`
+
+
+## Installing the Chart
+
+Create a namespace for the deployed components:
+```bash
+kubectl create namespace mlrun
+```
+
+Add the v3io-stable Helm chart repo:
+```bash
+helm repo add v3io-stable https://v3io.github.io/helm-charts/stable
+```
+
+To work with the open source MLRun stack, you must have an accessible Docker registry. The registry's URL and credentials
+are consumed by the applications via a pre-created secret.
+
+To create a secret with your docker-registry details:
+
+```bash
+kubectl --namespace mlrun create secret docker-registry registry-credentials \
+    --docker-username <registry-username> \
+    --docker-password <login-password> \
+    --docker-server <server URL, e.g. https://index.docker.io/v1/ > \
+    --docker-email <user-email>
+```
+
+To install the chart with the release name `my-mlrun`, use the following command.
+Note the reference to the pre-created `registry-credentials` secret in `global.registry.secretName`,
+and the `global.registry.url` value, which must be a registry URL that this secret can authenticate against:
+
+```bash
+helm --namespace mlrun \
+    install my-mlrun \
+    --wait \
+    --set global.registry.url=<registry URL e.g. index.docker.io/iguazio > \
+    --set global.registry.secretName=registry-credentials \
+    v3io-stable/mlrun-ce
+```
+
+## Installing MLRun CE on minikube
+
+The open source MLRun CE chart uses node ports for simplicity. If your Kubernetes cluster is running inside a VM,
+as is the case when using minikube, the Kubernetes services exposed over node ports are not available on
+your local interface, but on the virtual machine's interface instead.
+To accommodate this, set the `global.externalHostAddress` value on the chart. For example, if you're running
+MLRun CE inside a minikube cluster, add `--set global.externalHostAddress=$(minikube ip)` to the helm install command.
+
+## Advanced Chart Configuration
+
+Configurable values are documented in the chart's `values.yaml` and in the `values.yaml` of each subchart.
+Override them [using the standard Helm methods](https://helm.sh/docs/chart_template_guide/values_files/).
+
+To use the full version, pass `-f override-full.yaml` to the helm install command.
+
+
+### Usage
+
+Your applications are now available in your local browser:
+- jupyter-notebook - http://nodeipaddress:30040
+- nuclio - http://nodeipaddress:30050
+- mlrun UI - http://nodeipaddress:30060
+- mlrun API (external) - http://nodeipaddress:30070
+- minio API - http://nodeipaddress:30080
+- minio UI - http://nodeipaddress:30090
+- pipeline UI - http://nodeipaddress:30100
+- grafana UI - http://nodeipaddress:30110
+
+
+> **Note:**
+> The above links assume your Kubernetes cluster is exposed on localhost.
+> If that's not the case, the different components will be available on `externalHostAddress`
+
+### Start Working
+
+- Open Jupyter Notebook in the [**jupyter-notebook UI**](http://localhost:30040) and run the code in the
+[**examples/mlrun_basics.ipynb**](https://github.com/mlrun/mlrun/blob/master/examples/mlrun_basics.ipynb) notebook.
+
+> **Note:**
+> - You can change the ports by providing values to the helm install command.
+> - You can add and configure a k8s ingress-controller for better security and control over external access.
+
+
+## Uninstalling the Chart
+```bash
+helm --namespace mlrun uninstall my-mlrun
+```
+
+#### Note on terminating pods and hanging resources
+This chart creates several persistent volume claims (PVCs) and also provisions an NFS
+provisioning server, to give the user persistence (via PVC) out of the box.
+As a result, installing this chart creates PVs and PVCs.
+Upon uninstallation, any hanging or terminating pods keep holding their PVCs and PVs,
+which prevents their safe removal.
+Pods can get stuck in the terminating state in Kubernetes, so check for them after uninstalling,
+and don't forget to clean up the remaining PVCs and PVs.
+
+Handling pods stuck in the terminating state:
+```bash
+kubectl --namespace mlrun delete pod --force --grace-period=0 <pod-name>
+```
+
+Reclaim dangling persistency resources:
+
+| WARNING: This will result in data loss! |
+| --- |
+
+```bash
+# To list PVCs
+$ kubectl --namespace mlrun get pvc
+...
+
+# To remove a PVC
+$ kubectl --namespace mlrun delete pvc <pvc-name>
+...
+
+# To list PVs
+$ kubectl --namespace mlrun get pv
+...
+
+# To remove a PV
+$ kubectl --namespace mlrun delete pv <pv-name>
+
+# Remove hostpath(s) used for mlrun (and possibly nfs). These are created under /tmp by default and contain
+# your release name, e.g.:
+$ rm -rf my-mlrun-mlrun-ce-mlrun
+...
+```
+
+### Using Kubeflow Pipelines
+
+MLRun enables you to run your functions while saving outputs and artifacts in a way that is visible to Kubeflow Pipelines.
+If you wish to use this capability you will need to install Kubeflow on your cluster.
+Refer to the [**Kubeflow documentation**](https://www.kubeflow.org/docs/started/getting-started/) for more information.
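
The node-port layout listed in the README's Usage section can be sketched as a small lookup that builds each component's URL from `global.externalHostAddress`. This is only an illustration: the port numbers are copied from the README above, while the function name `service_urls` and the sample IP address are hypothetical and not part of the chart.

```python
# Node ports as listed in the README's Usage section of this chart.
NODE_PORTS = {
    "jupyter-notebook": 30040,
    "nuclio": 30050,
    "mlrun UI": 30060,
    "mlrun API (external)": 30070,
    "minio API": 30080,
    "minio UI": 30090,
    "pipeline UI": 30100,
    "grafana UI": 30110,
}


def service_urls(external_host_address: str = "localhost") -> dict[str, str]:
    """Return the browser URL of each component on the given host.

    `external_host_address` plays the role of `global.externalHostAddress`;
    this helper is hypothetical and only illustrates the URL scheme.
    """
    return {
        name: f"http://{external_host_address}:{port}"
        for name, port in NODE_PORTS.items()
    }


if __name__ == "__main__":
    # 192.168.49.2 is a made-up example of what `minikube ip` might print.
    for name, url in service_urls("192.168.49.2").items():
        print(f"{name}: {url}")
```

If the ports are overridden at install time, the `NODE_PORTS` table would need to be adjusted to match.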
diff --git a/charts/mlrun-ce/override-full.yaml b/charts/mlrun-ce/override-full.yaml
@@ -0,0 +1,154 @@
+global:
+
+  # External host/ip to reach the k8s node. This might take various values if k8s is run in a VM or a cloud env
+  externalHostAddress: localhost
+  registry:
+    url: mustprovide
+    secretName: secretNameofcontainerregistrymustprovide
+
+mlrun:
+  # set the type of filesystem to use: filesystem, s3
+  storage: filesystem
+  api:
+    fullnameOverride: mlrun-api
+    persistence:
+      enabled: true
+      annotations:
+        helm.sh/resource-policy: "keep"
+    extraEnv:
+      - name: MLRUN_SPARK_OPERATOR_VERSION
+        value: spark-3
+      - name: MLRUN_STORAGE__AUTO_MOUNT_TYPE
+        value: s3
+      - name: MLRUN_STORAGE__AUTO_MOUNT_PARAMS
+        value: "aws_access_key=minio,aws_secret_key=minio123,endpoint_url=http://minio.mlrun.svc.cluster.local:9000"
+      - name: MLRUN_HTTPDB__PROJECTS__FOLLOWERS
+        value: nuclio
+      - name: S3_ENDPOINT_URL
+        value: http://minio.mlrun.svc.cluster.local:9000
+      - name: AWS_SECRET_ACCESS_KEY
+        value: minio123
+      - name: AWS_ACCESS_KEY_ID
+        value: minio
+      - name: MLRUN_HTTPDB__REAL_PATH
+        value: s3://
+      - name: MLRUN_ARTIFACT_PATH
+        value: s3://mlrun/
+      - name: MLRUN_SPARK_APP_IMAGE
+        value: gcr.io/iguazio/spark-app
+      - name: MLRUN_SPARK_APP_IMAGE_TAG
+        value: v3.2.1-mlk
+      - name: MLRUN_KFP_URL
+        value: http://ml-pipeline.mlrun.svc.cluster.local:8888
+  db:
+    persistence:
+      enabled: true
+      annotations:
+        helm.sh/resource-policy: "keep"
+
+jupyterNotebook:
+  persistence:
+    enabled: true
+    annotations:
+      helm.sh/resource-policy: "keep"
+
+minio:
+  enabled: true
+  rootUser: minio
+  rootPassword: minio123
+  mode: distributed
+  replicas: 4
+  resources:
+    requests:
+      memory: 0.5Gi
+  persistence:
+    enabled: true
+    size: 1Gi
+
+spark-operator:
+  enabled: true
+  fullnameOverride: spark-operator
+  webhook:
+    enable: true
+
+pipelines:
+  enabled: true
+  name: pipelines
+  persistence:
+    enabled: true
+    existingClaim:
+    storageClass:
+    accessMode: "ReadWriteOnce"
+    size: "20Gi"
+    annotations:
+      helm.sh/resource-policy: "keep"
+  db:
+    username: root
+  minio:
+    enabled: true
+    accessKey: "minio"
+    secretKey: "minio123"
+    endpoint: "minio.mlrun.svc.cluster.local"
+    endpointPort: "9000"
+    bucket: "mlrun"
+  images:
+    argoexec:
+      repository: gcr.io/ml-pipeline/argoexec
+      tag: v3.3.8-license-compliance
+    workflowController:
+      repository: gcr.io/ml-pipeline/workflow-controller
+      tag: v3.3.8-license-compliance
+    apiServer:
+      repository: gcr.io/ml-pipeline/api-server
+      tag: 1.8.3
+    persistenceagent:
+      repository: gcr.io/ml-pipeline/persistenceagent
+      tag: 1.8.3
+    scheduledworkflow:
+      repository: gcr.io/ml-pipeline/scheduledworkflow
+      tag: 1.8.3
+    ui:
+      repository: gcr.io/ml-pipeline/frontend
+      tag: 1.8.3
+    viewerCrdController:
+      repository: gcr.io/ml-pipeline/viewer-crd-controller
+      tag: 1.8.3
+    visualizationServer:
+      repository: gcr.io/ml-pipeline/visualization-server
+      tag: 1.8.3
+    metadata:
+      container:
+        repository: gcr.io/tfx-oss-public/ml_metadata_store_server
+        tag: 1.5.0
+    metadataEnvoy:
+      repository: gcr.io/ml-pipeline/metadata-envoy
+      tag: 1.8.3
+    metadataWriter:
+      repository: gcr.io/ml-pipeline/metadata-writer
+      tag: 1.8.3
+    mysql:
+      repository: mysql
+      tag: 5.7-debian
+    cacheImage:
+      repository: gcr.io/google-containers/busybox
+      tag: latest
+
+kube-prometheus-stack:
+  fullnameOverride: monitoring
+  enabled: true
+  alertmanager:
+    enabled: false
+  grafana:
+    adminUser: "admin"
+    adminPassword: "admin-passw123"
+    fullnameOverride: grafana
+    enabled: true
+    service:
+      type: NodePort
+      nodePort: 30110
+  prometheus:
+    enabled: true
+  kube-state-metrics:
+    fullnameOverride: state-metrics
+  prometheus-node-exporter:
+    fullnameOverride: node-exporter
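
Several `extraEnv` entries in the override file above use names such as `MLRUN_HTTPDB__PROJECTS__FOLLOWERS`. The sketch below assumes that the `MLRUN_` prefix marks an MLRun setting and that a double underscore separates nesting levels; the commit itself does not state this, and the helper `nest_env` is hypothetical rather than MLRun's own parser. It only shows how such flat names could map onto a nested configuration.

```python
def nest_env(env: dict[str, str], prefix: str = "MLRUN_") -> dict:
    """Group prefixed environment variables into a nested dictionary.

    Assumption (not confirmed by this commit): "__" separates nesting levels.
    Variables without the prefix are ignored.
    """
    config: dict = {}
    for key, value in env.items():
        if not key.startswith(prefix):
            continue
        # Drop the prefix, lower-case the rest, and split into path segments.
        path = key[len(prefix):].lower().split("__")
        node = config
        for part in path[:-1]:
            node = node.setdefault(part, {})
        node[path[-1]] = value
    return config


if __name__ == "__main__":
    # A few entries copied from the extraEnv list in override-full.yaml.
    sample = {
        "MLRUN_STORAGE__AUTO_MOUNT_TYPE": "s3",
        "MLRUN_HTTPDB__PROJECTS__FOLLOWERS": "nuclio",
        "MLRUN_ARTIFACT_PATH": "s3://mlrun/",
        "S3_ENDPOINT_URL": "http://minio.mlrun.svc.cluster.local:9000",
    }
    print(nest_env(sample))
```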
diff --git a/charts/mlrun-ce/requirements.lock b/charts/mlrun-ce/requirements.lock
@@ -0,0 +1,21 @@
+dependencies:
+- name: nuclio
+  repository: https://nuclio.github.io/nuclio/charts
+  version: 0.14.0
+- name: mlrun
+  repository: https://v3io.github.io/helm-charts/stable
+  version: 0.9.1
+- name: mpi-operator
+  repository: https://v3io.github.io/helm-charts/stable
+  version: 0.6.0
+- name: minio
+  repository: https://charts.min.io/
+  version: 4.0.2
+- name: spark-operator
+  repository: https://googlecloudplatform.github.io/spark-on-k8s-operator
+  version: 1.1.25
+- name: kube-prometheus-stack
+  repository: https://prometheus-community.github.io/helm-charts
+  version: 39.6.0
+digest: sha256:1f19304db4f4a2e772fb7401e33ac98aea8f93a2b6c85d788a538af9706dda92
+generated: "2022-08-14T11:48:43.2664916+03:00"
diff --git a/charts/mlrun-ce/requirements.yaml b/charts/mlrun-ce/requirements.yaml
@@ -0,0 +1,23 @@
+dependencies:
+- name: nuclio
+  version: "0.14.0"
+  repository: "https://nuclio.github.io/nuclio/charts"
+- name: mlrun
+  version: "0.9.1"
+  repository: "https://v3io.github.io/helm-charts/stable"
+- name: mpi-operator
+  version: "0.6.0"
+  repository: "https://v3io.github.io/helm-charts/stable"
+- name: minio
+  repository: "https://charts.min.io/"
+  version: "4.0.2"
+  condition: minio.enabled
+- name: spark-operator
+  repository: "https://googlecloudplatform.github.io/spark-on-k8s-operator"
+  version: "1.1.25"
+  condition: spark-operator.enabled
+- name: kube-prometheus-stack
+  repository: "https://prometheus-community.github.io/helm-charts"
+  version: "39.6.0"
+  condition: kube-prometheus-stack.enabled
+
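
The versions pinned in `requirements.yaml` and `requirements.lock` above are expected to agree. A minimal sketch of that check follows, with the versions copied by hand from the two files into plain dictionaries (no YAML parsing); the function name `lock_mismatches` is hypothetical and not part of Helm.

```python
# Versions copied from requirements.yaml in this commit.
REQUIREMENTS = {
    "nuclio": "0.14.0",
    "mlrun": "0.9.1",
    "mpi-operator": "0.6.0",
    "minio": "4.0.2",
    "spark-operator": "1.1.25",
    "kube-prometheus-stack": "39.6.0",
}

# Versions copied from requirements.lock in this commit.
LOCK = {
    "nuclio": "0.14.0",
    "mlrun": "0.9.1",
    "mpi-operator": "0.6.0",
    "minio": "4.0.2",
    "spark-operator": "1.1.25",
    "kube-prometheus-stack": "39.6.0",
}


def lock_mismatches(requirements: dict[str, str], lock: dict[str, str]) -> list[tuple]:
    """Return (name, required, locked) for every dependency that differs.

    A dependency missing from either side is reported with None in its place.
    """
    names = sorted(set(requirements) | set(lock))
    return [
        (name, requirements.get(name), lock.get(name))
        for name in names
        if requirements.get(name) != lock.get(name)
    ]


if __name__ == "__main__":
    print(lock_mismatches(REQUIREMENTS, LOCK))
```

In practice the lock file is regenerated by Helm itself, so a check like this is mainly useful as a quick review aid.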