Upgrade components of play ground to recent versions #7295
| openjdk-8-jdk-headless \ | ||
| openjdk-17-jdk-headless && \ | ||
| rm -rf /var/lib/apt/lists/* && \ | ||
| update-java-alternatives --set $(update-java-alternatives --list | grep java-1.8.0-openjdk | awk '{print $NF}') || \ |
use `||` to ignore the exit code returned by the `update-java-alternatives` command, as JDK 8 lacks some commands provided by modern JDKs
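A minimal sketch of the `||` pattern described in this comment. The fallback after `||` is cut off in the diff, so `|| true` is an assumption here, and `flaky_step` is a hypothetical stand-in for a command that does its work but still exits non-zero:

```shell
#!/bin/sh
# Hypothetical stand-in for a command that succeeds in practice
# but still reports a non-zero exit status.
flaky_step() {
    echo "alternatives updated"
    return 1
}

# Without a fallback, the && chain stops at the first non-zero status.
chain_without_fallback() {
    flaky_step >/dev/null && echo "next step ran"
}

# With `|| true` (assumed fallback), the status is discarded and the chain continues.
chain_with_fallback() {
    { flaky_step >/dev/null || true; } && echo "next step ran"
}

without_out=$(chain_without_fallback)
without_status=$?
with_out=$(chain_with_fallback)
with_status=$?

echo "without fallback: status=$without_status output='$without_out'"
echo "with fallback:    status=$with_status output='$with_out'"
```

Note that `&&` and `||` have equal precedence in the shell and associate left to right, so an ungrouped `a && b || true` also masks a failure of `a`. The braces in the sketch limit the fallback to the single command.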
| SPARK_BINARY_VERSION=3.4 | ||
| SPARK_VERSION=3.5.7 | ||
| SPARK_BINARY_VERSION=3.5 | ||
| SPARK_HADOOP_VERSION=3.3.4 |
Spark 4 uses Hadoop client 3.4, which switches to AWS SDK 2.x and requires more work, so let's keep using Spark 3.5 for now. This also matches the current state of the Kyuubi project, where the default Spark version is 3.5
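Since `SPARK_VERSION` and `SPARK_BINARY_VERSION` are bumped together, a small sanity check can catch a mismatch between them. This is only a sketch using the two values from the diff; `derived_binary_version` and `check_result` are hypothetical names, and it assumes the binary version is always the major.minor prefix of the full version:

```shell
#!/bin/sh
# Values copied from the diff in this PR.
SPARK_VERSION=3.5.7
SPARK_BINARY_VERSION=3.5

# Strip the shortest trailing ".<component>" to get the major.minor prefix.
derived_binary_version=${SPARK_VERSION%.*}

if [ "$derived_binary_version" = "$SPARK_BINARY_VERSION" ]; then
    check_result="consistent"
else
    check_result="mismatch"
fi

echo "SPARK_BINARY_VERSION check: $check_result"
```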
docker/playground/README.md
Outdated
| `docker exec -it kyuubi /opt/kyuubi/bin/beeline -u 'jdbc:hive2://0.0.0.0:10009/tpcds/tiny'`; | ||
| ``` | ||
| docker exec -it kyuubi /opt/kyuubi/bin/kyuubi-beeline -u 'jdbc:hive2://0.0.0.0:10009/tpcds/tiny' |
we recommend using `kyuubi-beeline` instead of `beeline`, to distinguish it from Hive/Spark's beeline
| ln -s /opt/hadoop-${HADOOP_VERSION} ${HADOOP_HOME} && \ | ||
| rm ${HADOOP_TAR_NAME}.tar.gz && \ | ||
| HADOOP_CLOUD_STORAGE_JAR_NAME=hadoop-cloud-storage && \ | ||
| wget -q ${MAVEN_MIRROR}/org/apache/hadoop/${HADOOP_CLOUD_STORAGE_JAR_NAME}/${HADOOP_VERSION}/${HADOOP_CLOUD_STORAGE_JAR_NAME}-${HADOOP_VERSION}.jar -P ${HADOOP_HOME}/share/hadoop/hdfs/lib && \ |
hadoop-cloud-storage is an artifact used only for assembling dependencies; it contains no classes
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## master #7295 +/- ##
======================================
Coverage 0.00% 0.00%
======================================
Files 698 698
Lines 43636 43636
Branches 5893 5893
======================================
Misses 43636 43636
☔ View full report in Codecov by Sentry.
| # limitations under the License. | ||
| FROM eclipse-temurin:8-focal | ||
| FROM ubuntu:focal |
Ubuntu 20.04 (Focal Fossa) has already reached the end of standard support.
Let's upgrade the OS version in this PR or in a separate PR.
yes, we should move forward.
One additional consideration: we'd better align it with the Hadoop dev container, otherwise there might be issues when using the Hadoop native libs, especially when users play with security configs. For example, Ubuntu Focal is the latest version that provides OpenSSL 1.x, and the Hadoop native libs shipped in the official release are compiled against Ubuntu Focal with OpenSSL 1.x, so a runtime linkage error will be thrown if we try to enable Kerberos on Ubuntu Jammy or Noble.
But I think there are no issues in SIMPLE mode.
Not related to this PR, but another issue involving Hadoop and Ubuntu: the APT repo's jsvc is too old to support modern JDKs. As Hadoop trunk is moving to JDK 17+, this could be another annoyance for users who run kerberized Hadoop with JDK 17+ on Ubuntu. Maybe we should contact the Debian or Ubuntu Java team to upgrade it ...
APT repo's jsvc is too old
Found a very old issue: https://bugs.launchpad.net/ubuntu/+source/commons-daemon/+bug/1788154
FYI: Filed https://issues.apache.org/jira/browse/HADOOP-19774 to use Ubuntu 24.04 in Hadoop
Thanks, merged to master. As for the Ubuntu upgrade, I think Hadoop is preparing to switch to Ubuntu Noble in the upcoming 3.5.0, so let's wait a little longer.
Why are the changes needed?
Upgrade components of the playground to recent versions. In addition, Kyuubi and Spark switch to JDK 17, while other components such as Hadoop and Hive remain on JDK 8.
How was this patch tested?
Tested locally by building the images and running the demo. The updated Docker images will be available on DockerHub soon, so reviewers can test them too.
Was this patch authored or co-authored using generative AI tooling?
No.