DockerJDBCIntegrationSuite.scala
@@ -99,6 +99,8 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
   val connectionTimeout = timeout(5.minutes)
   val keepContainer =
     sys.props.getOrElse("spark.test.docker.keepContainer", "false").toBoolean
+  val removePulledImage =
+    sys.props.getOrElse("spark.test.docker.removePulledImage", "true").toBoolean
@dongjoon-hyun (Member) commented on May 29, 2021:

If this causes the image to be downloaded on every run, isn't that a regression for non-GA users?
I'd keep the default as false and enable it (true) only in the GitHub Actions environment, if the goal is to resolve the GA issue.

I investigated the cause and found that the GA host is running short of storage capacity.

WDYT, @sarutak and @HyukjinKwon?

Member Author (@sarutak) commented:

To avoid a behavior change, it would be better to keep the default false. But keeping container images in the local repository was an implicit side effect before this change, so I thought it better to keep the images only when users explicitly set the property to false.
Any thoughts? @dongjoon-hyun @HyukjinKwon

Member commented:

Got it. If that's your thinking, never mind. Since it's already merged, let's leave this as-is, @sarutak.

Member commented:

I'm okay either way. Please feel free to change it if you have a preference 👍
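
For readers skimming the thread, here is a minimal, self-contained sketch of the decision being discussed (an illustration only, not the suite code; the property name and the `removePulledImage && pulled` condition come from the diff above):

    object RemovePulledImageSketch {
      // true by default; users opt out with -Dspark.test.docker.removePulledImage=false
      val removePulledImage: Boolean =
        sys.props.getOrElse("spark.test.docker.removePulledImage", "true").toBoolean

      // `pulled` is set only when the image had to be pulled during this run,
      // so an image that was already in the local repository is never removed.
      def shouldRemoveImage(pulled: Boolean): Boolean = removePulledImage && pulled
    }

With the default of true, a developer who does not already have the image will pull it and remove it again on every run (the regression concern above), while an image that was already present locally is left untouched either way.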


   private var docker: DockerClient = _
   // Configure networking (necessary for boot2docker / Docker Machine)
@@ -109,6 +111,7 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
     port
   }
   private var containerId: String = _
+  private var pulled: Boolean = false
   protected var jdbcUrl: String = _

   override def beforeAll(): Unit = {
@@ -130,6 +133,7 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
         case e: ImageNotFoundException =>
           log.warn(s"Docker image ${db.imageName} not found; pulling image from registry")
           docker.pull(db.imageName)
+          pulled = true
       }
       val hostConfigBuilder = HostConfig.builder()
         .privileged(db.privileged)
@@ -215,6 +219,9 @@ abstract class DockerJDBCIntegrationSuite extends SharedSparkSession with Eventu
           }
       } finally {
         docker.removeContainer(containerId)
+        if (removePulledImage && pulled) {
+          docker.removeImage(db.imageName)
+        }
       }
     }
   }
pom.xml (2 additions, 0 deletions)
@@ -264,6 +264,7 @@
     <spark.test.home>${session.executionRootDirectory}</spark.test.home>
     <spark.test.webdriver.chrome.driver></spark.test.webdriver.chrome.driver>
     <spark.test.docker.keepContainer>false</spark.test.docker.keepContainer>
+    <spark.test.docker.removePulledImage>true</spark.test.docker.removePulledImage>
Member commented:

ditto.

     <CodeCacheSize>1g</CodeCacheSize>
     <!-- Needed for consistent times -->
@@ -2728,6 +2729,7 @@
             <spark.unsafe.exceptionOnMemoryLeak>true</spark.unsafe.exceptionOnMemoryLeak>
             <spark.test.webdriver.chrome.driver>${spark.test.webdriver.chrome.driver}</spark.test.webdriver.chrome.driver>
             <spark.test.docker.keepContainer>${spark.test.docker.keepContainer}</spark.test.docker.keepContainer>
+            <spark.test.docker.removePulledImage>${spark.test.docker.removePulledImage}</spark.test.docker.removePulledImage>
             <!-- Needed by sql/hive tests. -->
             <test.src.tables>__not_used__</test.src.tables>
           </systemProperties>