HADOOP-17195. ABFS Store thread pool for stream IO. #2294

steveloughran · 2020-09-09T16:19:24Z

This is the successor to #2179

The ABFS store creates a single thread pool, configurable with either a fixed size or a multiple of the core count.
Each output stream is given its own semaphored pool, which limits the access that stream has to the shared pool.
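
The shape of that design can be sketched with plain `java.util.concurrent` classes. This is a minimal illustration, not the code in the patch: the class name `BoundedStreamExecutor` and its methods are hypothetical, and the real patch may wrap the pool differently.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

/**
 * Hypothetical per-stream view of a store-wide thread pool.
 * Each stream owns one instance; all instances share the same pool.
 */
class BoundedStreamExecutor {
  private final ExecutorService sharedPool;
  private final Semaphore permits;

  BoundedStreamExecutor(ExecutorService sharedPool, int maxActiveTasks) {
    this.sharedPool = sharedPool;
    // Fair semaphore: blocked submitters are served in arrival order.
    this.permits = new Semaphore(maxActiveTasks, true);
  }

  /** Blocks the caller until this stream holds a permit, then submits to the shared pool. */
  <T> Future<T> submit(Callable<T> task) throws InterruptedException {
    permits.acquire();
    try {
      return sharedPool.submit(() -> {
        try {
          return task.call();
        } finally {
          // Release whether the task succeeded or failed.
          permits.release();
        }
      });
    } catch (RuntimeException e) {
      // The pool rejected the task, so it will never run and release the permit itself.
      permits.release();
      throw e;
    }
  }

  /** Permits currently free for this stream. */
  int availablePermits() {
    return permits.availablePermits();
  }
}
```

A stream created with `maxActiveTasks = 2` can have at most two queued-or-running uploads in the shared pool at a time; a third `submit` call blocks the writer until one finishes. Note that this bounds tasks, not memory: each pending task still needs its own buffer.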

To actually defend against OOMs, the per-stream queue length is what needs to be managed. Looking at the patch, it still has the problem of #2179: you need one buffer per pending upload in the pools.

Ultimately the S3A connector fixed this by going to disk buffering by default. A more performant design might be a blocking byte buffer factory which limits the number of buffers the streams can request, putting an upper bound on the amount of memory which a single ABFS store instance can demand.
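
A blocking buffer factory of that kind might look roughly like the sketch below. Everything here is an assumption for illustration: the class `BlockingBufferFactory`, its method names, and the choice of heap buffers are not taken from the patch or from any existing Hadoop class.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical store-wide buffer factory: at most maxBuffers buffers
 * of bufferSize bytes are ever handed out at the same time.
 */
class BlockingBufferFactory {
  private final int maxBuffers;
  private final int bufferSize;
  private final Semaphore permits;
  private final ConcurrentLinkedQueue<ByteBuffer> free = new ConcurrentLinkedQueue<>();

  BlockingBufferFactory(int maxBuffers, int bufferSize) {
    this.maxBuffers = maxBuffers;
    this.bufferSize = bufferSize;
    this.permits = new Semaphore(maxBuffers, true);
  }

  /** Blocks until a buffer is available. */
  ByteBuffer acquire() throws InterruptedException {
    permits.acquire();
    return takeOrAllocate();
  }

  /** Waits up to the timeout; returns null if no buffer became available. */
  ByteBuffer tryAcquire(long timeout, TimeUnit unit) throws InterruptedException {
    return permits.tryAcquire(timeout, unit) ? takeOrAllocate() : null;
  }

  /** Returns a buffer obtained from this factory; callers must not reuse it afterwards. */
  void release(ByteBuffer buffer) {
    free.offer(buffer);
    permits.release();
  }

  /** Upper bound on the memory this factory can have handed out. */
  long maxBytes() {
    return (long) maxBuffers * bufferSize;
  }

  // Reuse a returned buffer if one exists, otherwise allocate lazily.
  // The caller already holds a permit, so the total never exceeds maxBuffers.
  private ByteBuffer takeOrAllocate() {
    ByteBuffer buffer = free.poll();
    if (buffer == null) {
      buffer = ByteBuffer.allocate(bufferSize);
    }
    buffer.clear();
    return buffer;
  }
}
```

With one factory per store instance, a stream that wants to start a new block calls `acquire()` and stalls when the store is at its limit, so memory use is capped at `maxBytes()` regardless of how many streams or pending uploads exist. The trade-off is that writers block instead of spilling to disk.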

Change-Id: I6915539cfafe7164c404dfc153653710280d9bf6

hadoop-yetus · 2020-09-09T18:41:58Z

🎊 +1 overall

Vote	Subsystem	Runtime	Comment
+0 🆗	reexec	1m 39s	Docker mode activated.
		_ Prechecks _
+1 💚	dupname	0m 0s	No case conflicting files found.
+1 💚	@author	0m 0s	The patch does not contain any @author tags.
+1 💚	test4tests	0m 0s	The patch appears to include 1 new or modified test files.
		_ trunk Compile Tests _
+1 💚	mvninstall	36m 34s	trunk passed
+1 💚	compile	0m 49s	trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚	compile	0m 33s	trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚	checkstyle	0m 26s	trunk passed
+1 💚	mvnsite	0m 40s	trunk passed
+1 💚	shadedclient	19m 51s	branch has no errors when building and testing our client artifacts.
+1 💚	javadoc	0m 32s	trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚	javadoc	0m 27s	trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+0 🆗	spotbugs	1m 13s	Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚	findbugs	1m 9s	trunk passed
		_ Patch Compile Tests _
+1 💚	mvninstall	0m 36s	the patch passed
+1 💚	compile	0m 37s	the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚	javac	0m 37s	the patch passed
+1 💚	compile	0m 31s	the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚	javac	0m 31s	the patch passed
-0 ⚠️	checkstyle	0m 19s	hadoop-tools/hadoop-azure: The patch generated 7 new + 2 unchanged - 0 fixed = 9 total (was 2)
+1 💚	mvnsite	0m 34s	the patch passed
+1 💚	whitespace	0m 0s	The patch has no whitespace issues.
+1 💚	shadedclient	18m 28s	patch has no errors when building and testing our client artifacts.
+1 💚	javadoc	0m 27s	the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1
+1 💚	javadoc	0m 24s	the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
+1 💚	findbugs	1m 15s	the patch passed
		_ Other Tests _
+1 💚	unit	1m 38s	hadoop-azure in the patch passed.
+1 💚	asflicense	0m 34s	The patch does not generate ASF License warnings.
		90m 15s

Subsystem	Report/Notes
Docker	ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2294/1/artifact/out/Dockerfile
GITHUB PR	#2294
Optional Tests	dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname	Linux 70cc2f756b0c 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/bin/hadoop.sh
git revision	trunk / `e5fe326`
Default Java	Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
Multi-JDK versions	/usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01
checkstyle	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2294/1/artifact/out/diff-checkstyle-hadoop-tools_hadoop-azure.txt
Test Results	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2294/1/testReport/
Max. process+thread count	309 (vs. ulimit of 5500)
modules	C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure
Console output	https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2294/1/console
versions	git=2.17.1 maven=3.6.0 findbugs=4.0.6
Powered by	Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

steveloughran · 2020-09-10T11:21:53Z

Looking at this a bit more:

It's the use of buffers which causes the OOM, not the thread pooling, so neither this nor its predecessor patch will directly fix that.
We need to support a bytebuffer pool with a maximum capacity and/or disk buffering.

steveloughran · 2020-09-16T13:21:20Z

Closing this, but leaving it up as the PoC to say "we should have a shared thread pool for lower startup costs"; it would be a switch to buffering on disk which will be the way to guarantee an end to OOM problems.

I am happy for the S3A blocks class to be moved to hadoop-common to address this.

steveloughran · 2021-02-22T14:47:15Z

Maybe I was being pessimistic there. If the number of active writes a single stream can have is throttled, the number of open blocks a single stream can have allocated is also limited. But the ability to buffer on disk is the way to robustly avoid scale issues with many active threads.

HADOOP-17195. ABFS Store thread pool for stream IO.

64a18b0

Change-Id: I6915539cfafe7164c404dfc153653710280d9bf6

steveloughran marked this pull request as draft September 10, 2020 11:19

steveloughran added the fs/azure (changes related to azure; submitter must declare test endpoint) and work in progress (PRs still Work in Progress; reviews not expected but still welcome) labels Sep 10, 2020

steveloughran closed this Sep 16, 2020

steveloughran deleted the abfs/HADOOP-17195-threadpool branch October 15, 2021 19:40
