HBASE-28453 FixedIntervalRateLimiter support for a shorter refill interval #5773

rmdmattingly · 2024-03-22T13:45:20Z

See https://issues.apache.org/jira/browse/HBASE-28453

The AverageIntervalRateLimiter produces tiny wait intervals, which can result in a DDoS at worst, or poor UX at best. See below where we implemented a 10k request/second/machine quota at 10:43 and saw requests fall into immediate retry loops and inundate the cluster:

The FixedIntervalRateLimiter produces large wait intervals, which make it difficult to fully utilize a quota. See below where we were stuck consistently utilizing only ~20% of our quota. After restarting the cluster to use a FixedIntervalRateLimiter with a 100ms refill interval, we were able to achieve much better utilization:

As suggested above, this PR introduces support for a refill interval that is <= the TimeUnit of a FixedIntervalRateLimiter. This means that you can define a quota in a straightforward way, like 100MB/sec, while also specifying, for example, that you're willing to refill it every 100ms. In that case, retries for small/normal requests will often wait ~100ms.
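To make the effect of a shorter refill interval concrete, here is a minimal, self-contained sketch. It is not the code in this PR or HBase's actual FixedIntervalRateLimiter: the class `RefillIntervalSketch`, its methods, and the proportional refill rule (each interval returns `limit * refillInterval / timeUnit`, capped at `limit`) are assumptions made for illustration.

```java
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical illustration only; names and refill rule are assumptions,
 * not HBase's implementation. Callers pass the clock in explicitly so the
 * behavior is deterministic. Assumes each request amount is <= limit.
 */
final class RefillIntervalSketch {
  private final long limit;            // quota per time unit, e.g. 1000 requests/sec
  private final long refillIntervalMs; // how often part of the quota is returned
  private final long perInterval;      // share of the quota returned per refill
  private long available;              // quota currently available
  private long nextRefillMs;           // timestamp of the next refill

  RefillIntervalSketch(long limit, TimeUnit unit, long refillIntervalMs, long nowMs) {
    long timeUnitMs = unit.toMillis(1);
    if (refillIntervalMs <= 0 || refillIntervalMs > timeUnitMs) {
      throw new IllegalArgumentException("refill interval must be in (0, time unit]");
    }
    this.limit = limit;
    this.refillIntervalMs = refillIntervalMs;
    this.perInterval = Math.max(1, limit * refillIntervalMs / timeUnitMs);
    this.available = limit;
    this.nextRefillMs = nowMs + refillIntervalMs;
  }

  /** Credits every refill interval that has fully elapsed, capped at the limit. */
  private void refill(long nowMs) {
    if (nowMs < nextRefillMs) {
      return;
    }
    long intervals = 1 + (nowMs - nextRefillMs) / refillIntervalMs;
    available = Math.min(limit, available + intervals * perInterval);
    nextRefillMs += intervals * refillIntervalMs;
  }

  /** Consumes the amount if enough quota is available; otherwise consumes nothing. */
  boolean tryConsume(long amount, long nowMs) {
    refill(nowMs);
    if (amount > available) {
      return false;
    }
    available -= amount;
    return true;
  }

  /** Milliseconds a caller should wait before the amount can be consumed. */
  long waitIntervalMs(long amount, long nowMs) {
    refill(nowMs);
    if (amount <= available) {
      return 0;
    }
    long deficit = amount - available;
    long intervalsNeeded = (deficit + perInterval - 1) / perInterval; // ceiling division
    return (nextRefillMs - nowMs) + (intervalsNeeded - 1) * refillIntervalMs;
  }

  public static void main(String[] args) {
    // Same quota (1000/sec), two refill intervals: 100 ms versus the full second.
    RefillIntervalSketch fast = new RefillIntervalSketch(1000, TimeUnit.SECONDS, 100, 0);
    RefillIntervalSketch slow = new RefillIntervalSketch(1000, TimeUnit.SECONDS, 1000, 0);
    fast.tryConsume(1000, 0); // exhaust both quotas at t=0
    slow.tryConsume(1000, 0);
    System.out.println("100ms refill, wait for 50 units: " + fast.waitIntervalMs(50, 0) + " ms");
    System.out.println("1000ms refill, wait for 50 units: " + slow.waitIntervalMs(50, 0) + " ms");
  }
}
```

Under these assumptions, a small request that arrives just after the quota is exhausted waits 100 ms with the shorter refill interval and a full 1000 ms when the refill interval equals the TimeUnit, which is the gap in wait intervals this PR targets.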

To use this feature, set `hbase.quota.rate.limiter.refill.interval.ms` to your desired refill interval and restart your RegionServers. By default the refill interval equals the TimeUnit, so this is a no-op without explicit configuration.
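As a sketch of what that configuration might look like, assuming the property is set in each RegionServer's `hbase-site.xml` like other server-side settings, and using 100 ms purely as an example value:

```xml
<!-- Example value only: refill the quota every 100 ms instead of once per TimeUnit. -->
<property>
  <name>hbase.quota.rate.limiter.refill.interval.ms</name>
  <value>100</value>
</property>
```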

Here's an initial look at how a 100ms refill interval changed our wait interval percentiles in our QA environment:

@hgromer @eab148 @bozzkar @bbeaudreault

Apache-HBase · 2024-03-22T14:06:52Z

💔 -1 overall

Vote	Subsystem	Runtime	Comment
+0 🆗	reexec	0m 43s	Docker mode activated.
		_ Prechecks _
+1 💚	dupname	0m 0s	No case conflicting files found.
+1 💚	hbaseanti	0m 0s	Patch does not have any anti-patterns.
+1 💚	@author	0m 0s	The patch does not contain any @author tags.
		_ master Compile Tests _
+1 💚	mvninstall	2m 59s	master passed
+1 💚	compile	2m 31s	master passed
+1 💚	checkstyle	0m 35s	master passed
+1 💚	spotless	0m 44s	branch has no errors when running spotless:check.
+1 💚	spotbugs	1m 32s	master passed
		_ Patch Compile Tests _
-1 ❌	mvninstall	1m 40s	root in the patch failed.
-1 ❌	compile	0m 20s	hbase-server in the patch failed.
-0 ⚠️	javac	0m 20s	hbase-server in the patch failed.
-0 ⚠️	checkstyle	0m 42s	hbase-server: The patch generated 2 new + 1 unchanged - 0 fixed = 3 total (was 1)
+1 💚	whitespace	0m 0s	The patch has no whitespace issues.
-1 ❌	hadoopcheck	2m 4s	The patch causes 10 errors with Hadoop v3.3.6.
+1 💚	spotless	0m 53s	patch has no errors when running spotless:check.
-1 ❌	spotbugs	0m 16s	hbase-server in the patch failed.
		_ Other Tests _
+1 💚	asflicense	0m 10s	The patch does not generate ASF License warnings.
		16m 18s

Subsystem	Report/Notes
Docker	ClientAPI=1.45 ServerAPI=1.45 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR	#5773
Optional Tests	dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile
uname	Linux dc39d5700e19 5.4.0-169-generic #187-Ubuntu SMP Thu Nov 23 14:52:28 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool	maven
Personality	dev-support/hbase-personality.sh
git revision	master / `ade6ab2`
Default Java	Eclipse Adoptium-11.0.17+8
mvninstall	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/patch-mvninstall-root.txt
compile	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/patch-compile-hbase-server.txt
javac	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/patch-compile-hbase-server.txt
checkstyle	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt
hadoopcheck	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/patch-javac-3.3.6.txt
spotbugs	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/artifact/yetus-general-check/output/patch-spotbugs-hbase-server.txt
Max. process+thread count	80 (vs. ulimit of 30000)
modules	C: hbase-server U: hbase-server
Console output	https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5773/1/console
versions	git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by	Apache Yetus 0.12.0 https://yetus.apache.org