[MINOR] example: Add example docker compose Uniffle/Spark cluster #1532
Conversation
This reverts commit 10b0607.
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@              Coverage Diff              @@
##              master    #1532      +/-   ##
=============================================
- Coverage      54.20%   54.19%    -0.01%
- Complexity      2808     2835       +27
=============================================
  Files            430      436        +6
  Lines          24410    24524      +114
  Branches        2082     2075        -7
=============================================
+ Hits           13231    13292       +61
- Misses         10349    10404       +55
+ Partials         830      828        -2

☔ View full report in Codecov by Sentry.
zuston
left a comment
Thanks @EnricoMi for proposing this; a docker-compose deployment is easier than K8s.
But could we extend the scope from CI testing only to providing a one-stop Uniffle cluster on a single machine? That means users could run external Spark apps (maybe on YARN) against this Uniffle cluster. Some steps may need to be implemented:
- open the network of the dockerized Uniffle cluster to external clients
- add an example case describing this
        shell: bash
      - name: Prepare example Spark app
        run: |
          cat << EOL > example.scala
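A self-contained step of this shape might look like the following sketch; the example.scala contents here are illustrative, not the actual code from the PR:

```shell
# Write a tiny Spark app to example.scala via a heredoc.
# Quoting the delimiter ('EOL') keeps the shell from expanding
# $-variables inside the Scala code.
cat << 'EOL' > example.scala
val data = spark.range(0, 1000, 1, 10)
println(data.selectExpr("sum(id)").first.getLong(0))
EOL

# The file can then be fed into spark-shell, e.g.:
#   /opt/spark/bin/spark-shell < example.scala
```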
This code looks messy here; how about putting it directly into the Spark cluster Dockerfile?
This way the CI is self-contained: it does not depend on a file somewhere outside .github, and you can directly see and edit the code that it executes. This code differs from what is given in the README.md because it is a bit more complex.
deploy/docker-compose/README.md
Outdated
✔ Network rss_default  Removed  0.4s
```

docker exec -it rss-spark-master-1 /opt/spark/bin/spark-shell \
Did you forget to add a new title? This paragraph looks strange.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
b9204d5 to ef7a916
0b71763 to d63699d
1c32be8 to 7147a06
a20d26a to fd35ec2
e54eea6 to 1bf9aa0
CI is blocking @EnricoMi
f2b9075 to f0a6539
@zuston CI fixed, not pretty but stable.
.github/workflows/docker.yml
Outdated
        run: |
          docker compose -f deploy/docker/docker-compose.yml up --detach --scale shuffle-server=4 --scale spark-worker=5

          # for some reason, the uniffle containers terminate after first creation
Do you know why this happens?
I have found the issue. The start.sh script gave the process 10 seconds to start. Starting 5 containers at once on the hardware spec used by GHA turns out to take more than those 10 seconds; in fact it takes over 20 seconds, whereas on my local machine it takes less than 6.
I have reworked the start.sh script and the way the CI waits for the containers to come up.
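The reworked waiting logic can be sketched as a poll-with-timeout loop instead of a fixed sleep; the function name and timeout values here are illustrative, not the actual start.sh code:

```shell
# Poll a readiness condition until it succeeds or a timeout expires,
# instead of sleeping a fixed number of seconds and hoping it was enough.
wait_for() {
  local timeout="$1"; shift
  local deadline=$(( $(date +%s) + timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      echo "timed out waiting for: $*" >&2
      return 1
    fi
    sleep 1
  done
}

# Example: wait up to 60 seconds for a pid file to appear.
# wait_for 60 test -f /tmp/shuffle-server.pid
```

A loop like this succeeds as soon as the condition holds, so fast machines do not pay the full timeout and slow CI runners are not cut off prematurely.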
cd07b78 to 911610a
911610a to 573d101
zuston
left a comment
LGTM. Thanks @EnricoMi
@zuston Rust CI is so flaky. Could you fix it?
Yes, I will.
What changes were proposed in this pull request?
Adds code to spin up an example Uniffle/Spark docker cluster using docker compose. This is used by the CI to test the example cluster setup.
Why are the changes needed?
This setup has a smaller footprint than the existing Kubernetes example in
deploy/kubernetes/integration-test/e2e/README.md, which is not trivial to set up. The new example only requires Docker to be installed and can be spun up with a single docker compose command. The Uniffle and Spark cluster can be used to interactively test Spark with Uniffle.
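Assuming the compose file path used in this PR's CI workflow, spinning the cluster up and down might look like the following sketch (the scale counts are illustrative):

```shell
# Start the example cluster in the background; requires Docker with
# the compose plugin. Scale counts are illustrative.
docker compose -f deploy/docker/docker-compose.yml up --detach \
  --scale shuffle-server=4 --scale spark-worker=5

# Tear the cluster down again, removing its containers and network.
docker compose -f deploy/docker/docker-compose.yml down
```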
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Manually and CI tested.