-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-35501][SQL][TESTS] Add a feature for removing pulled container image for docker integration tests #32652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #138872 has finished for PR 32652 at commit
|
pom.xml
Outdated
| <spark.test.home>${session.executionRootDirectory}</spark.test.home> | ||
| <spark.test.webdriver.chrome.driver></spark.test.webdriver.chrome.driver> | ||
| <spark.test.docker.keepContainer>false</spark.test.docker.keepContainer> | ||
| <spark.test.docker.removePulledImage>false</spark.test.docker.removePulledImage> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should maybe just enable it by default. but I will leave it to you @sarutak and @dongjoon-hyun
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disabled it by default just for behavior compatibility (pulled images are not removed before this change).
But it may be no problem even if we enable it.
|
Thank you for pinging me, @sarutak . |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #138951 has finished for PR 32652 at commit
|
|
Merged to master. |
| val keepContainer = | ||
| sys.props.getOrElse("spark.test.docker.keepContainer", "false").toBoolean | ||
| val removePulledImage = | ||
| sys.props.getOrElse("spark.test.docker.removePulledImage", "true").toBoolean |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this causes image downloading every time, this is a regression for the non-GA users, isn't it?
I'd keep the default as false and enable this true only at GitHub Action environment if that is trying to resolve GA issue.
I investigated the reason and I found it's short of the storage capacity of the host on GA.
WDYT, @sarutak and @HyukjinKwon ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid behavior change, it would be better to keep it false but keeping container images in the local repository is side effect and done implicitly before this change. So I thought it is better do keep the images only if users can explicitly set the property to false.
Any thoughts? @dongjoon-hyun @HyukjinKwon
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. If you think so, never mind. Since it's already merged, let's leave this AS-IS, @sarutak .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am okay either way. Please feel free to change if you guys have a preference 👍
| <spark.test.home>${session.executionRootDirectory}</spark.test.home> | ||
| <spark.test.webdriver.chrome.driver></spark.test.webdriver.chrome.driver> | ||
| <spark.test.docker.keepContainer>false</spark.test.docker.keepContainer> | ||
| <spark.test.docker.removePulledImage>true</spark.test.docker.removePulledImage> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto.
What changes were proposed in this pull request?
This PR adds a feature for removing pulled container image after every docker integration test finish.
This feature is enabled by the new propoerty
spark.tes.docker.removePulledImage.Why are the changes needed?
For idempotent.
I'm trying to add docker integration tests to GA in SPARK-35483 (#32631) but I noticed that
jdbc.OracleIntegrationSuiteconsistently fails(https://github.com/sarutak/spark/runs/2646707235?check_suite_focus=true).I investigated the reason and I found it's short of the storage capacity of the host on GA.
With this feature, pulled container image is removed and keep the capacity for
jdbc.OracleIntegrationSuitein GA.Does this PR introduce any user-facing change?
No.
How was this patch tested?
I confirmed the following things.
spark.test.container.removePulledImageistrue.spark.test.container.removePulledImageistrue.spark.test.container.removePulledImageistrue.