@sarutak
Member

@sarutak sarutak commented Dec 6, 2021

What changes were proposed in this pull request?

This PR changes dev-run-integration-tests.sh so that it can take a custom Dockerfile, as #34790 did for the Maven build.
With this change, the script accepts a --docker-file option, which takes the path to a custom Dockerfile.

$ ./dev/dev-run-integration-tests.sh --docker-file /path/to/dockerfile
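
For readers unfamiliar with this style of option handling, here is a minimal sketch of how a flag such as --docker-file might be consumed. It is modeled on the shape of the `set -x` trace quoted later in this thread (a `(( $# ))` loop, a `case $1 in`, and two `shift`s) and is not the script's actual code; the `parse_args` wrapper is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical, trimmed-down option parser; only --docker-file is handled here.
parse_args() {
  DOCKER_FILE=""
  while (( "$#" )); do
    case "$1" in
      --docker-file)
        DOCKER_FILE="$2"   # the value follows the flag
        shift              # drop the flag; the trailing shift drops the value
        ;;
      *)
        echo "Unexpected argument: $1" >&2
        return 1
        ;;
    esac
    shift
  done
}

parse_args --docker-file /path/to/dockerfile
echo "DOCKER_FILE=$DOCKER_FILE"
```

At this point the value is stored exactly as the user typed it, so a relative path still has to be resolved before it is handed to Maven.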

Why are the changes needed?

As of #34790, we can specify a custom Dockerfile via the spark.kubernetes.test.dockerFile property when we run the K8s integration tests using Maven.
We can also run the integration tests via dev-run-integration-tests.sh, but that script has no way to specify a custom Dockerfile.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Confirmed that the K8s integration tests run with the following command using Dockerfile.java17.

cd resource-managers/kubernetes/integration-tests
./dev/dev-run-integration-tests.sh --docker-file ../docker/src/main/dockerfiles/spark/Dockerfile.java17

<td><code>spark-r</code></td>
</tr>
<tr>
<td><code>spark.kubernetes.test.dockerFile</code></td>
Member Author

This is the part I forgot to add in #34790.

)

$TEST_ROOT_DIR/build/mvn install -f $TEST_ROOT_DIR/pom.xml -pl resource-managers/kubernetes/integration-tests $BUILD_DEPENDENCIES_MVN_FLAG -Pscala-$SCALA_VERSION -P$HADOOP_PROFILE -Pkubernetes -Pkubernetes-integration-tests ${properties[@]}
(cd $TEST_ROOT_DIR; ./build/mvn install -pl resource-managers/kubernetes/integration-tests $BUILD_DEPENDENCIES_MVN_FLAG -Pscala-$SCALA_VERSION -P$HADOOP_PROFILE -Pkubernetes -Pkubernetes-integration-tests ${properties[@]})
Member Author

I noticed that none of the dev-run-integration-tests.sh examples in integration-tests/README.md work, because scala-style-config.xml is not found in integration-tests.
This change resolves the issue.
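
The new line wraps the Maven call in `(cd $TEST_ROOT_DIR; ...)`, so the command runs from the repository root without moving the caller. A minimal sketch of that subshell behavior, using a temporary directory as a stand-in for the repository root (the variable names are illustrative only):

```shell
#!/usr/bin/env bash
start_dir=$(pwd)
work_dir=$(mktemp -d)   # stand-in for $TEST_ROOT_DIR

# The parentheses start a subshell, so the cd is local to it.
(cd "$work_dir"; echo "inside subshell: $(pwd)")

# Back in the parent shell, the working directory is unchanged.
after_dir=$(pwd)
echo "after subshell:  $after_dir"

rmdir "$work_dir"
```

Because only the subshell changes directory, anything the script does after the Maven call still sees the original working directory.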

Member

Does this work? I'm testing your PR and hit the following.

$ cd resource-managers/kubernetes/integration-tests
$ ./dev/dev-run-integration-tests.sh --docker-file ../docker/src/main/dockerfiles/spark/Dockerfile.java17
++ git rev-parse --show-toplevel
+ TEST_ROOT_DIR=/Users/dongjoon/APACHE/spark-merge
+ DEPLOY_MODE=minikube
+ IMAGE_REPO=docker.io/kubespark
+ SPARK_TGZ=N/A
+ IMAGE_TAG=N/A
+ JAVA_IMAGE_TAG=
+ BASE_IMAGE_NAME=
+ JVM_IMAGE_NAME=
+ PYTHON_IMAGE_NAME=
+ R_IMAGE_NAME=
+ DOCKER_FILE=
+ SPARK_MASTER=
+ NAMESPACE=
+ SERVICE_ACCOUNT=
+ CONTEXT=
+ INCLUDE_TAGS=k8s
+ EXCLUDE_TAGS=
+ JAVA_VERSION=8
+ BUILD_DEPENDENCIES_MVN_FLAG=-am
+ HADOOP_PROFILE=hadoop-3.2
+ MVN=/Users/dongjoon/APACHE/spark-merge/build/mvn
++ /Users/dongjoon/APACHE/spark-merge/build/mvn help:evaluate -Dexpression=scala.binary.version
++ grep -v INFO
++ grep -v WARNING
++ tail -n 1
+ SCALA_VERSION=2.12
+ export SCALA_VERSION
+ echo 2.12
2.12
+ ((  2  ))
+ case $1 in
+ DOCKER_FILE=../docker/src/main/dockerfiles/spark/Dockerfile.java17
+ shift
+ shift
+ ((  0  ))
+ properties=(-Djava.version=$JAVA_VERSION -Dspark.kubernetes.test.sparkTgz=$SPARK_TGZ -Dspark.kubernetes.test.imageTag=$IMAGE_TAG -Dspark.kubernetes.test.imageRepo=$IMAGE_REPO -Dspark.kubernetes.test.deployMode=$DEPLOY_MODE -Dtest.include.tags=$INCLUDE_TAGS)
+ '[' -n '' ']'
+ '[' -n ../docker/src/main/dockerfiles/spark/Dockerfile.java17 ']'
+ properties=(${properties[@]} -Dspark.kubernetes.test.dockerFile=$(realpath $DOCKER_FILE))
++ realpath ../docker/src/main/dockerfiles/spark/Dockerfile.java17
./dev/dev-run-integration-tests.sh: line 153: realpath: command not found

Member Author

@sarutak sarutak Dec 6, 2021

Hmm, realpath doesn't seem to be a standard command. I tested on Pop!_OS 20.04 and CentOS 8.
I'll look for another way.

Member Author

Python's os.path.realpath should work on both Linux and macOS.

Member

@dongjoon-hyun dongjoon-hyun Dec 6, 2021

No~ It's a Spark bash function in util.sh.

Member

Oh..

Member Author

@sarutak sarutak Dec 6, 2021

Ah, Linux has a realpath command (coreutils), but we have our own realpath.
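
The confusion here comes from two things sharing one name. In bash, a shell function takes precedence over an external command of the same name, so whether `realpath` works can depend on whether the function has been defined in the current shell. A small sketch of how to check which one is in effect; the function body below is a hypothetical stand-in and not the implementation from util.sh:

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for illustration; NOT the function from Spark's util.sh.
realpath() {
  printf 'shell function called with %s\n' "$1"
}

kind=$(type -t realpath)      # reports how bash will interpret the name
output=$(realpath some/relative/path)
echo "realpath is a: $kind"
echo "$output"
```

On a machine without the coreutils command, the same call fails with "command not found" unless the function is defined first, which matches the trace above.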

@SparkQA

SparkQA commented Dec 6, 2021

Test build #145947 has finished for PR 34818 at commit 8072b13.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50421/

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50421/

properties=(
${properties[@]}
-Dspark.kubernetes.test.dockerFile=$(
python3 -c "import os.path; print(os.path.realpath(\"$DOCKER_FILE\"))") )
Member

@dongjoon-hyun dongjoon-hyun Dec 6, 2021

@sarutak We should use realpath consistently in our Spark bash code.

It seems that the root cause was that

  1. You resolved a relative path via realpath in one directory,
  2. and then we moved to another directory.
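
The general hazard behind this point, a relative path meaning different things before and after a `cd`, can be sketched as follows. Everything here is an assumption made for illustration: `to_abs` is a hypothetical helper (not Spark's realpath from util.sh), and the directory layout is a throwaway stand-in created with `mktemp`.

```shell
#!/usr/bin/env bash
# to_abs is a hypothetical helper for this sketch, not Spark's util.sh realpath.
to_abs() {
  local dir
  dir=$(cd "$(dirname "$1")" && pwd -P) || return 1
  echo "$dir/$(basename "$1")"
}

start_dir=$(pwd)
root=$(mktemp -d)                      # stand-in for a repository checkout
mkdir -p "$root/tests" "$root/docker"
touch "$root/docker/Dockerfile.example"

cd "$root/tests"                       # where the user typed the relative path
relative=../docker/Dockerfile.example
resolved=$(to_abs "$relative")         # resolve before changing directory

cd "$root"                             # the script later moves elsewhere
if [ -e "$relative" ]; then relative_ok=yes; else relative_ok=no; fi
if [ -e "$resolved" ]; then resolved_ok=yes; else resolved_ok=no; fi
echo "relative path still valid: $relative_ok"
echo "resolved path still valid: $resolved_ok"

cd "$start_dir"
rm -r "$root"
```

Resolving the user-supplied path once, in the directory where it was typed, keeps it valid regardless of where the script runs Maven from.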

Member Author

I didn't notice the realpath in util.sh, so I used the realpath command that is installed on Linux.
OK, I'll use our own realpath here.

@SparkQA

SparkQA commented Dec 6, 2021

Test build #145958 has finished for PR 34818 at commit 26eca41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50432/

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50432/

@SparkQA

SparkQA commented Dec 6, 2021

Test build #145963 has finished for PR 34818 at commit 167b681.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50439/

@SparkQA

SparkQA commented Dec 6, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50439/

@dongjoon-hyun dongjoon-hyun self-assigned this Dec 7, 2021
@SparkQA

SparkQA commented Dec 8, 2021

Test build #146007 has finished for PR 34818 at commit b599191.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class PandasSQLStringFormatter(string.Formatter):
  • class SQLStringFormatter(string.Formatter):
  • case class ConvertTimezone(

@SparkQA

SparkQA commented Dec 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50483/

@SparkQA

SparkQA commented Dec 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50483/

Member

@dongjoon-hyun dongjoon-hyun left a comment

+1, LGTM. Thank you, @sarutak. It works as described.

@sarutak
Member Author

sarutak commented Dec 8, 2021

@dongjoon-hyun Thank you for your advice and review!

@dongjoon-hyun dongjoon-hyun removed their assignment Apr 14, 2024