Contributor

@jiangzho jiangzho commented Aug 12, 2024

What changes were proposed in this pull request?

This PR proposes a Helm chart for the Spark Kubernetes Operator, which describes the deployment manifests for the operator.

Why are the changes needed?

A Helm chart helps users install, upgrade, and uninstall the Spark Operator in a Kubernetes cluster. It provides a cloud-native way of managing the operator in place, and it completes the end-to-end flow for the operator app.

Does this PR introduce any user-facing change?

No (not yet released)

How was this patch tested?

  • CI added for chart linting
  • Chart test for operator deployment RBAC validation
  • We are also working on SPARK-49214 to include the test automation in the workflow. Until that is resolved, this patch can be validated manually by running the following steps:

Start minikube

Start minikube and let it use the locally built image:

minikube start
eval $(minikube docker-env)

Build Spark Operator Locally

# Build a local container image that minikube (or a similar local cluster) can use.
docker build --build-arg APP_VERSION=0.1.0 -t spark-kubernetes-operator:0.1.0 -f build-tools/docker/Dockerfile  .

# Generate CRD yaml and make it available for chart deployment
./gradlew spark-operator-api:relocateGeneratedCRD     
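
The chart's crds directory is generated by the step above rather than checked in, so a quick guard before `helm install` can catch a skipped CRD step. The following is a minimal sketch and not part of this PR: the `has_crds` helper name is hypothetical, and the default path is assumed from the `relocateGeneratedCRD` target directory.

```shell
# Succeed only if the given directory holds at least one generated CRD file.
# Both .yaml and .yml are accepted, since the extension depends on the build revision.
has_crds() (
  dir=${1:-build-tools/helm/spark-kubernetes-operator/crds}
  for f in "$dir"/*.yaml "$dir"/*.yml; do
    # An unmatched glob stays literal and fails the -f test, so an empty
    # or missing directory falls through to the failure exit below.
    [ -f "$f" ] && exit 0
  done
  exit 1
)

if has_crds; then
  echo "CRDs present"
else
  echo "CRDs missing: run ./gradlew spark-operator-api:relocateGeneratedCRD first"
fi
```

Run from the repository root, this only inspects the file system; it does not contact the cluster.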

Install the Spark Operator

helm install spark-kubernetes-operator --create-namespace -f build-tools/helm/spark-kubernetes-operator/values.yaml build-tools/helm/spark-kubernetes-operator/ 

Verify the Installation

helm test spark-kubernetes-operator

Here is a sample SparkApplication YAML:

apiVersion: org.apache.spark/v1alpha1
kind: SparkApplication
metadata:
  name: spark-pi
spec:
  mainClass: "org.apache.spark.examples.SparkPi"
  jars: "local:///opt/spark/examples/jars/spark-examples_2.13-4.0.0-preview1.jar"
  sparkConf:
    spark.executor.instances: "5"
    spark.kubernetes.container.image: "spark:4.0.0-preview1"
    spark.kubernetes.authenticate.driver.serviceAccountName: "spark"
  applicationTolerations:
    resourceRetainPolicy: OnFailure
  runtimeVersions:
    scalaVersion: "2.13"
    sparkVersion: "4.0.0-preview1"

Create a file named spark-pi.yaml and run

kubectl create -f spark-pi.yaml

The operator should be able to run the given app from start to completion.

You can delete the job with the following command.

$ kubectl delete -f spark-pi.yaml
sparkapplication.org.apache.spark "spark-pi" deleted

Uninstallation

To remove the installed resources from your cluster, reset the environment to its defaults, and
shut down the cluster:

helm uninstall spark-kubernetes-operator
eval $(minikube docker-env --unset)
minikube stop

Was this patch authored or co-authored using generative AI tooling?

No

@@ -0,0 +1,25 @@
################################################################################
Member

Please remove this line; this repository doesn't use this style for the ASF license header.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
Member

ditto.

@@ -0,0 +1,148 @@
################################################################################
Member

ditto.

# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################
Member

ditto.

# https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
topologySpreadConstraints: [ ]
operatorContainer:
  jvmArgs: "-XX:+UseG1GC -Xms3G -Xmx3G -Dfile.encoding=UTF8"
Member

This conflicts with the container's memory setting of 2Gi: a 3G heap (-Xms3G -Xmx3G) does not fit in it.
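
To make the mismatch concrete, here is a rough sketch of the comparison. It is illustrative only: `jvm_bytes`, `k8s_bytes`, and `check_heap` are hypothetical helper names, only the common unit suffixes are handled, and the check reports a conflict whenever the maximum heap is at least as large as the container memory (the JVM also needs non-heap memory on top of the heap).

```shell
# Size in bytes of a JVM memory flag value such as "3G" (k/m/g are powers of 1024).
jvm_bytes() (
  n=${1%%[!0-9]*}
  case ${1#"$n"} in
    "")  echo "$n" ;;
    k|K) echo $((n * 1024)) ;;
    m|M) echo $((n * 1024 * 1024)) ;;
    g|G) echo $((n * 1024 * 1024 * 1024)) ;;
    *)   exit 1 ;;
  esac
)

# Size in bytes of a Kubernetes quantity such as "2Gi" (Ki/Mi/Gi binary, k/M/G decimal).
k8s_bytes() (
  n=${1%%[!0-9]*}
  case ${1#"$n"} in
    "") echo "$n" ;;
    Ki) echo $((n * 1024)) ;;
    Mi) echo $((n * 1024 * 1024)) ;;
    Gi) echo $((n * 1024 * 1024 * 1024)) ;;
    k)  echo $((n * 1000)) ;;
    M)  echo $((n * 1000 * 1000)) ;;
    G)  echo $((n * 1000 * 1000 * 1000)) ;;
    *)  exit 1 ;;
  esac
)

# Compare the last -Xmx value found in a jvmArgs string with a container memory quantity.
check_heap() (
  xmx=
  for arg in $1; do
    case $arg in -Xmx*) xmx=${arg#-Xmx} ;; esac
  done
  [ -n "$xmx" ] || { echo "no -Xmx set"; exit 0; }
  heap=$(jvm_bytes "$xmx") || exit 1
  mem=$(k8s_bytes "$2") || exit 1
  if [ "$heap" -ge "$mem" ]; then echo "conflict"; else echo "ok"; fi
)

check_heap "-XX:+UseG1GC -Xms3G -Xmx3G -Dfile.encoding=UTF8" "2Gi"   # prints "conflict"
check_heap "-Dfile.encoding=UTF8" "2Gi"                               # prints "no -Xmx set"
```

The second call corresponds to a jvmArgs value without explicit heap flags, where the JVM picks its own heap size.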

port: 19091
livenessProbe:
  periodSeconds: 10
  initialDelaySeconds: 30
Member

Is there a reason for the 30s value? Could you add a comment explaining why the default livenessProbe value is insufficient, or why it needs to be overridden?

  periodSeconds: 10
  initialDelaySeconds: 30
startupProbe:
  failureThreshold: 30
Member

In the same way, why is the default value, 3, insufficient here?
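
For context on what the override changes: a container gets roughly initialDelaySeconds + failureThreshold × periodSeconds to pass its startup probe before it is restarted. The sketch below is only arithmetic; `startup_budget` is a hypothetical helper, and the startupProbe's periodSeconds is assumed to be the Kubernetes default of 10 because the quoted snippet does not show it.

```shell
# Approximate upper bound, in seconds, on how long a container may take to pass
# its startup probe. Arguments: initialDelaySeconds failureThreshold periodSeconds
startup_budget() {
  echo $(( $1 + $2 * $3 ))
}

startup_budget 0 3 10     # Kubernetes defaults: prints 30
startup_budget 0 30 10    # failureThreshold: 30, default period assumed: prints 300
```

So the proposed value extends the startup window from about 30 seconds to about 5 minutes, which is the trade-off the question is asking the author to justify.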

.*gradle
.*json
.helmignore
.*yml
Member

What do you want to exclude here? This PR adds .yaml files rather than .yml files, doesn't it?

In addition, AFAIK, you have already added the license header. Does this mean the added header (of this file) is incompatible with the Apache RAT check?
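
The point about the extension can be checked with a small sketch. It assumes each exclude entry is treated as a regular expression matched against the whole file name, which is an assumption about how the list is read, not something confirmed in this thread; `is_excluded` is a hypothetical helper.

```shell
# Return success when a file name fully matches an exclude pattern, treating the
# pattern as an extended regular expression (assumed behavior, see the note above).
is_excluded() {
  printf '%s\n' "$2" | grep -Eqx -- "$1"
}

for name in values.yaml Chart.yaml build_and_test.yml; do
  if is_excluded '.*yml' "$name"; then
    echo "$name: excluded"
  else
    echo "$name: checked"
  fi
done
# prints:
#   values.yaml: checked
#   Chart.yaml: checked
#   build_and_test.yml: excluded
```

Under that assumption, the `.*yml` entry never applies to the chart's .yaml files, so their license headers are checked either way.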

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-48398] Add Helm chart for Operator Deployment [SPARK-48398] Add Helm Chart Aug 14, 2024
Member

@dongjoon-hyun dongjoon-hyun left a comment

I added a K8s integration test pipeline for you. Please revise this PR to run this Helm chart on Minikube. Never mind: let's merge with the manual test and do this separately.

@dongjoon-hyun
Member

I revised this PR, @jiangzho and @viirya.

  • Use Apache Spark 4.0.0-preview1 as the example.
  • Use helm install instead of helm upgrade to avoid the initial failure.
  • Add Spark Job deletion example, kubectl delete -f spark-pi.yaml
  • Add --strict on helm linting.
  • Rename CRDs during copy: rename '(.+).yml', '$1.yaml'.
  • Minor code clean up to focus on the main feature.
diff --git a/.github/.licenserc.yaml b/.github/.licenserc.yaml
index 6cafe63..5507e4d 100644
--- a/.github/.licenserc.yaml
+++ b/.github/.licenserc.yaml
@@ -19,6 +19,5 @@ header:
     - 'build/**'
     - '**/*.json'
     - '**/.helmignore'
-    - '**/*.yml'
 
   comment: on-failure
diff --git a/.github/workflows/build_and_test.yml b/.github/workflows/build_and_test.yml
index 2ca7bd2..fd99dfc 100644
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -46,7 +46,7 @@ jobs:
           ./gradlew build
       - name: Validate helm chart linting
         run: |
-          helm lint build-tools/helm/spark-kubernetes-operator
+          helm lint --strict build-tools/helm/spark-kubernetes-operator
   build-image:
     name: "Build Operator Image CI"
     runs-on: ubuntu-latest
diff --git a/.gitignore b/.gitignore
index d5afef7..b174a96 100644
--- a/.gitignore
+++ b/.gitignore
@@ -6,6 +6,7 @@
 .vscode
 /lib/
 target/
+build-tools/helm/spark-kubernetes-operator/crds
 
 # Gradle Files #
 ################
@@ -16,7 +17,3 @@ build
 dependencies.lock
 **/dependencies.lock
 gradle/wrapper/gradle-wrapper.jar
-
-# Generated Files #
-###################
-build-tools/helm/spark-kubernetes-operator/crds
diff --git a/build-tools/helm/spark-kubernetes-operator/.helmignore b/build-tools/helm/spark-kubernetes-operator/.helmignore
index 0e8a0eb..5920392 100644
--- a/build-tools/helm/spark-kubernetes-operator/.helmignore
+++ b/build-tools/helm/spark-kubernetes-operator/.helmignore
@@ -1,15 +1,4 @@
-# Patterns to ignore when building packages.
-# This supports shell glob matching, relative path matching, and
-# negation (prefixed with !). Only one pattern per line.
 .DS_Store
-# Common VCS dirs
-.git/
-.gitignore
-.bzr/
-.bzrignore
-.hg/
-.hgignore
-.svn/
 # Common backup files
 *.swp
 *.bak
diff --git a/build-tools/helm/spark-kubernetes-operator/Chart.yaml b/build-tools/helm/spark-kubernetes-operator/Chart.yaml
index 46e612a..deed565 100644
--- a/build-tools/helm/spark-kubernetes-operator/Chart.yaml
+++ b/build-tools/helm/spark-kubernetes-operator/Chart.yaml
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,9 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
-
----
 apiVersion: v2
 name: spark-kubernetes-operator
 description: A Helm chart for the Apache Spark Kubernetes Operator
diff --git a/build-tools/helm/spark-kubernetes-operator/conf/log4j2.properties b/build-tools/helm/spark-kubernetes-operator/conf/log4j2.properties
deleted file mode 100644
index 3a0256b..0000000
--- a/build-tools/helm/spark-kubernetes-operator/conf/log4j2.properties
+++ /dev/null
@@ -1,51 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-status=info
-strict=true
-dest=out
-name=PropertiesConfig
-property.filename=/opt/spark-operator/logs/spark-operator
-filter.threshold.type=ThresholdFilter
-filter.threshold.level=debug
-# console
-appender.console.type=Console
-appender.console.name=STDOUT
-appender.console.layout.type=PatternLayout
-appender.console.layout.pattern=%d %p %X %C{1.} [%t] %m%n
-appender.console.filter.threshold.type=ThresholdFilter
-appender.console.filter.threshold.level=info
-# rolling JSON
-appender.rolling.type=RollingFile
-appender.rolling.name=RollingFile
-appender.rolling.append=true
-appender.rolling.fileName=${filename}.log
-appender.rolling.filePattern=${filename}-%i.log.gz
-appender.rolling.layout.type=JsonTemplateLayout
-appender.rolling.layout.eventTemplateUri=classpath:EcsLayout.json
-appender.rolling.policies.type=Policies
-appender.rolling.policies.size.type=SizeBasedTriggeringPolicy
-appender.rolling.policies.size.size=100MB
-appender.rolling.strategy.type=DefaultRolloverStrategy
-appender.rolling.strategy.max=20
-appender.rolling.immediateFlush=true
-# chatty loggers
-rootLogger.level=all
-logger.netty.name=io.netty
-logger.netty.level=warn
-log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
-rootLogger.appenderRef.stdout.ref=STDOUT
-rootLogger.appenderRef.rolling.ref=RollingFile
diff --git a/build-tools/helm/spark-kubernetes-operator/conf/spark-operator.properties b/build-tools/helm/spark-kubernetes-operator/conf/spark-operator.properties
deleted file mode 100644
index e644970..0000000
--- a/build-tools/helm/spark-kubernetes-operator/conf/spark-operator.properties
+++ /dev/null
@@ -1,19 +0,0 @@
-#
-# Licensed to the Apache Software Foundation (ASF) under one or more
-# contributor license agreements.  See the NOTICE file distributed with
-# this work for additional information regarding copyright ownership.
-# The ASF licenses this file to You under the Apache License, Version 2.0
-# (the "License"); you may not use this file except in compliance with
-# the License.  You may obtain a copy of the License at
-#
-#     http://www.apache.org/licenses/LICENSE-2.0
-#
-# Unless required by applicable law or agreed to in writing, software
-# distributed under the License is distributed on an "AS IS" BASIS,
-# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-# See the License for the specific language governing permissions and
-# limitations under the License.
-#
-
-# Property Overrides. e.g.
-# spark.kubernetes.operator.reconciler.intervalSeconds=60
diff --git a/build-tools/helm/spark-kubernetes-operator/templates/_helpers.tpl b/build-tools/helm/spark-kubernetes-operator/templates/_helpers.tpl
index 526b778..e8ee901 100644
--- a/build-tools/helm/spark-kubernetes-operator/templates/_helpers.tpl
+++ b/build-tools/helm/spark-kubernetes-operator/templates/_helpers.tpl
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
 
 {{/*
 Expand the name of the chart.
diff --git a/build-tools/helm/spark-kubernetes-operator/templates/sparkapps-resource.yaml b/build-tools/helm/spark-kubernetes-operator/templates/app-rbac.yaml
similarity index 96%
rename from build-tools/helm/spark-kubernetes-operator/templates/sparkapps-resource.yaml
rename to build-tools/helm/spark-kubernetes-operator/templates/app-rbac.yaml
index 613e489..f76d56b 100644
--- a/build-tools/helm/spark-kubernetes-operator/templates/sparkapps-resource.yaml
+++ b/build-tools/helm/spark-kubernetes-operator/templates/app-rbac.yaml
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
 
 {{/*
 RBAC rules used to create the app (cluster)role based on the scope
diff --git a/build-tools/helm/spark-kubernetes-operator/templates/rbac.yaml b/build-tools/helm/spark-kubernetes-operator/templates/operator-rbac.yaml
similarity index 95%
rename from build-tools/helm/spark-kubernetes-operator/templates/rbac.yaml
rename to build-tools/helm/spark-kubernetes-operator/templates/operator-rbac.yaml
index e03809e..f8364e6 100644
--- a/build-tools/helm/spark-kubernetes-operator/templates/rbac.yaml
+++ b/build-tools/helm/spark-kubernetes-operator/templates/operator-rbac.yaml
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
 
 {{/*
 RBAC rules used to create the operator (cluster)role
@@ -61,7 +59,7 @@ Labels and annotations to be applied on rbacResources
 {{- end }}
 
 ---
-#Service account and rolebindings for operator
+# Service account and rolebindings for operator
 apiVersion: v1
 kind: ServiceAccount
 metadata:
diff --git a/build-tools/helm/spark-kubernetes-operator/templates/spark-operator.yaml b/build-tools/helm/spark-kubernetes-operator/templates/spark-operator.yaml
index 555b239..5edd228 100644
--- a/build-tools/helm/spark-kubernetes-operator/templates/spark-operator.yaml
+++ b/build-tools/helm/spark-kubernetes-operator/templates/spark-operator.yaml
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
 
 apiVersion: apps/v1
 kind: Deployment
@@ -25,7 +23,6 @@ metadata:
     {{- include "spark-operator.commonLabels" . | nindent 4 }}
 spec:
   replicas: {{ .Values.operatorDeployment.replicas }}
-  revisionHistoryLimit: 2
   strategy:
     {{- toYaml .Values.operatorDeployment.strategy | nindent 4 }}
   selector:
diff --git a/build-tools/helm/spark-kubernetes-operator/templates/tests/test-rbac.yaml b/build-tools/helm/spark-kubernetes-operator/templates/tests/test-rbac.yaml
index 03fc4b1..33d8fa3 100644
--- a/build-tools/helm/spark-kubernetes-operator/templates/tests/test-rbac.yaml
+++ b/build-tools/helm/spark-kubernetes-operator/templates/tests/test-rbac.yaml
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,8 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
-
 apiVersion: v1
 kind: Pod
 metadata:
diff --git a/build-tools/helm/spark-kubernetes-operator/values.yaml b/build-tools/helm/spark-kubernetes-operator/values.yaml
index 16ecc5e..87fda1e 100644
--- a/build-tools/helm/spark-kubernetes-operator/values.yaml
+++ b/build-tools/helm/spark-kubernetes-operator/values.yaml
@@ -1,4 +1,3 @@
-################################################################################
 # Licensed to the Apache Software Foundation (ASF) under one or more
 # contributor license agreements.  See the NOTICE file distributed with
 # this work for additional information regarding copyright ownership.
@@ -13,7 +12,6 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-################################################################################
 
 image:
   repository: spark-kubernetes-operator
@@ -43,7 +41,7 @@ operatorDeployment:
     # https://kubernetes.io/docs/concepts/scheduling-eviction/topology-spread-constraints/
     topologySpreadConstraints: [ ]
     operatorContainer:
-      jvmArgs: "-XX:+UseG1GC -Xms3G -Xmx3G -Dfile.encoding=UTF8"
+      jvmArgs: "-Dfile.encoding=UTF8"
       env:
       envFrom:
       volumeMounts: { }
@@ -76,15 +74,7 @@ operatorDeployment:
         seccompProfile:
           type: RuntimeDefault
     additionalContainers: { }
-    # additionalContainers:
-    #  - name: ""
-    #    image: ""
     volumes: { }
-    # volumes:
-    #   - name: spark-artifacts
-    #     hostPath:
-    #       path: /tmp/spark/artifacts
-    #       type: DirectoryOrCreate
     securityContext: { }
     dnsPolicy:
     dnsConfig:
@@ -191,5 +181,5 @@ operatorConfiguration:
     annotations:
       "helm.sh/resource-policy": keep
     data:
-    # Spark Operator Config Runtime Properties Overrides. e.g.
+      # Spark Operator Config Runtime Properties Overrides. e.g.
       spark.kubernetes.operator.reconciler.intervalSeconds: 60
diff --git a/dev/.rat-excludes b/dev/.rat-excludes
index a556cb7..4234634 100644
--- a/dev/.rat-excludes
+++ b/dev/.rat-excludes
@@ -16,4 +16,3 @@ build
 .*gradle
 .*json
 .helmignore
-.*yml
diff --git a/spark-operator-api/build.gradle b/spark-operator-api/build.gradle
index 13b80e4..8b1dfc3 100644
--- a/spark-operator-api/build.gradle
+++ b/spark-operator-api/build.gradle
@@ -38,4 +38,5 @@ tasks.register('relocateGeneratedCRD', Copy) {
   dependsOn finalizeGeneratedCRD
   from "build/classes/java/main/META-INF/fabric8/sparkapplications.org.apache.spark-v1.yml"
   into "../build-tools/helm/spark-kubernetes-operator/crds"
+  rename '(.+).yml', '$1.yaml'
 }
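
The rename rule at the end of the diff can be tried outside Gradle with a small sketch. This is an approximation, not the build's behavior verbatim: `to_yaml_name` is a hypothetical helper, and unlike the Gradle pattern (where the unescaped `.` matches any character) it escapes the dot and anchors the match to the end of the name.

```shell
# Rewrite a trailing ".yml" extension to ".yaml"; names that do not end in ".yml"
# are printed unchanged.
to_yaml_name() {
  printf '%s\n' "$1" | sed -E 's/^(.+)\.yml$/\1.yaml/'
}

to_yaml_name "sparkapplications.org.apache.spark-v1.yml"
# prints: sparkapplications.org.apache.spark-v1.yaml
```

With the generated CRD copied under a .yaml name, the chart's files use one extension consistently, which matches the removal of the `.yml` exclude entries in the same revision.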

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Merged to main.
