
[SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.9.2 #28601

Closed · wants to merge 3 commits

Conversation

@dongjoon-hyun (Member) commented May 21, 2020

What changes were proposed in this pull request?

This PR aims to upgrade the `kubernetes-client` library to bring in the JDK8-related fixes. Please note that JDK11 works fine without any problem.

Why are the changes needed?

OkHttp "wrongly" detects the Platform as Jdk9Platform on JDK 8u251.

Although there are workarounds (`export HTTP2_DISABLE=true`, or downgrading the JDK or K8s), we had better avoid this problematic situation.
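The environment-variable workaround mentioned above can be sketched as follows. `HTTP2_DISABLE` is the switch the fabric8 kubernetes-client checks to force HTTP/1.1; the commented `spark-submit` line is a cluster-specific placeholder, not a command from this PR.

```shell
# Workaround sketch: disable HTTP/2 negotiation in the fabric8
# kubernetes-client before launching a Spark-on-K8s job, so OkHttp
# stays on HTTP/1.1 even when it misdetects the platform on JDK 8u251+.
export HTTP2_DISABLE=true
echo "HTTP2_DISABLE=${HTTP2_DISABLE}"
# spark-submit --master k8s://https://<api-server>:<port> ...   # cluster-specific
```

This only masks the misdetection; the library upgrade in this PR removes the need for it.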

Does this PR introduce any user-facing change?

No. This will recover the failures on JDK 8u252.

How was this patch tested?

v1.17.6 result (on Minikube)

KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 8 minutes, 27 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.10.0 [SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.10.1 May 21, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.10.1 [SPARK-31786][K8S][BUILD] Upgrade kubernetes-client to 4.9.2 May 21, 2020

@dongjoon-hyun (Member, Author)

Since there is a revert on the master branch (db5e5fc), I'll rebase once more to make sure.


@dongjoon-hyun (Member, Author)

Retest this please.

@dongjoon-hyun (Member, Author)

The one failure in the last K8s IT is a known flaky one.

KubernetesSuite:
- Run SparkPi with no resources *** FAILED ***
  The code passed to eventually never returned normally. Attempted 130 times over 2.0017359222166666 minutes. Last failure message: false was not true. (KubernetesSuite.scala:370)
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
- Run SparkR on simple dataframe.R example
Run completed in 14 minutes, 43 seconds.
Total number of tests run: 20
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 1, canceled 0, ignored 0, pending 0
*** 1 TEST FAILED ***


@SparkQA commented May 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27597/

@dongjoon-hyun dongjoon-hyun marked this pull request as ready for review May 22, 2020 01:49

@SparkQA commented May 22, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27597/

@dongjoon-hyun (Member, Author)

cc @dbtsai and @holdenk

@SparkQA commented May 22, 2020

Test build #122952 has finished for PR 28601 at commit 4327940.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member, Author) commented May 22, 2020

WorkerDecommissionSuite seems to be a flaky test and is unrelated to this PR.

org.apache.spark.scheduler.WorkerDecommissionSuite.verify a task with all workers decommissioned succeeds

@SparkQA commented May 22, 2020

Test build #122956 has finished for PR 28601 at commit 4327940.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maver1ck (Contributor)

@dongjoon-hyun
Any reason for not using kubernetes-client 4.10.1?

@ScrapCodes (Member)

@dongjoon-hyun, I have the same question as above: why not upgrade to 4.10.1?

Looks good otherwise.

@dongjoon-hyun (Member, Author) commented May 22, 2020

Thank you for review, @maver1ck and @ScrapCodes .
In this PR, I tried 4.10.1 and 4.10.0 first. Please see the first commit (760030b) and the second commit (cdfc345). However, the dependency change was too huge. And, kubernetes-model was split into many jars in 4.10.x. Given the risk and complexity of 4.10.x, 4.9.2 is more safer option which having a minimal change for this issue. Since we need backport this PR to 3.0.0 RC3 and 2.4.6 RC4, minimizing risk is important. We can revisit 4.10.x later for the other issues.

@holdenk (Contributor) commented May 22, 2020

Jenkins retest this please.
LGTM pending Jenkins.

@dongjoon-hyun (Member, Author)

Retest this please.

@SparkQA commented May 22, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27660/

@SparkQA commented May 22, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/27660/

@SparkQA commented May 22, 2020

Test build #123017 has finished for PR 28601 at commit 4327940.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@viirya (Member) left a comment

Looks ok, and 4.9.2 seems less risky.

@srowen (Member) commented May 23, 2020

Yeah, a minor update for now, especially if it is being backported, seems prudent.

@dongjoon-hyun (Member, Author) commented May 23, 2020

Thank you all. Merged to master.

@dongjoon-hyun (Member, Author)

Due to conflicts in the dependency file, I need to make a separate backporting PR.

BTW, branch-3.0 is currently broken by #28604, so the backporting PR will fail due to the above failure.

dongjoon-hyun added a commit that referenced this pull request May 24, 2020
### What changes were proposed in this pull request?

This PR aims to upgrade `kubernetes-client` library to bring the JDK8 related fixes. Please note that JDK11 works fine without any problem.
- https://github.com/fabric8io/kubernetes-client/releases/tag/v4.9.2
  - JDK8 always uses http/1.1 protocol (Prevent OkHttp from wrongly enabling http/2)

### Why are the changes needed?

OkHttp "wrongly" detects the Platform as Jdk9Platform on JDK 8u251.
- fabric8io/kubernetes-client#2212
- https://stackoverflow.com/questions/61565751/why-am-i-not-able-to-run-sparkpi-example-on-a-kubernetes-k8s-cluster

Although there is a workaround `export HTTP2_DISABLE=true` and `Downgrade JDK or K8s`, we had better avoid this problematic situation.

### Does this PR introduce _any_ user-facing change?

No. This will recover the failures on JDK 8u252.

### How was this patch tested?

- [x] Pass the Jenkins UT (#28601 (comment))
- [x] Pass the Jenkins K8S IT with the K8s 1.13 (#28601 (comment))
- [x] Manual testing with K8s 1.17.3. (Below)

**v1.17.6 result (on Minikube)**
```
KubernetesSuite:
- Run SparkPi with no resources
- Run SparkPi with a very long application name.
- Use SparkLauncher.NO_RESOURCE
- Run SparkPi with a master URL without a scheme.
- Run SparkPi with an argument.
- Run SparkPi with custom labels, annotations, and environment variables.
- All pods have the same service account by default
- Run extraJVMOptions check on driver
- Run SparkRemoteFileTest using a remote data file
- Run SparkPi with env and mount secrets.
- Run PySpark on simple pi.py example
- Run PySpark with Python2 to test a pyfiles example
- Run PySpark with Python3 to test a pyfiles example
- Run PySpark with memory customization
- Run in client mode.
- Start pod creation from template
- PVs with local storage
- Launcher client dependencies
- Test basic decommissioning
Run completed in 8 minutes, 27 seconds.
Total number of tests run: 19
Suites: completed 2, aborted 0
Tests: succeeded 19, failed 0, canceled 0, ignored 0, pending 0
All tests passed.
```

Closes #28601 from dongjoon-hyun/SPARK-K8S-CLIENT.

Authored-by: Dongjoon Hyun <dongjoon@apache.org>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 64ffc66)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
@dongjoon-hyun (Member, Author)

This is tested with the latest JDK8 and K8s 1.17.x manually and backported to branch-3.0.

@dongjoon-hyun dongjoon-hyun deleted the SPARK-K8S-CLIENT branch May 24, 2020 01:08
akirillov pushed a commit to d2iq-archive/spark that referenced this pull request Aug 19, 2020 (same commit message as above).

laflechejonathan pushed a commit to laflechejonathan/spark that referenced this pull request Sep 25, 2020 (same commit message as above).