HBASE-28672 Ensure large batches are not indefinitely blocked by quotas #6003
Conversation
I like this improvement and the behavior you demonstrate in the tests. It seems like a fair kind of latching mechanism for letting requests trickle through.
Previously, this limiting system acted as a gatekeeper by placing an effective upper bound on a user. This patch removes that upper bound. Should we consider also explicitly introducing a hard cap/upper bound?
That's a good question. Despite the existing behavior, I think the intention was always for this to be more of a soft limit — the TimedQuota proto actually calls the field
That said, intention is one thing and reality is another. I could also see people missing the hard limits if that's how they're using quotas. I'd prefer to make the addition of a hard limit a separate effort, though, because my guess is that it will be quite a bit of work relative to its value. For example, coming up with good enough estimations of IO-heavy request load to earnestly guarantee any sort of "hard limit" is going to be tricky, if not impossible.
Ideally, a multi request would execute whatever actions fit within the available quota. This maintains the hard limit while also giving the user a better chance of eventually finishing their request across multiple RPC calls, similar to the handling of MultiActionResultTooLarge.
I agree with you, and considered the MARTL approach internally but didn't do a good job of documenting this publicly. My pushback is not that we shouldn't add support for some sort of chunking. It's definitely righteous work to also think about improving the server's respect of the limit via request chunking, but I think that's sort of its own task and, given the difference in changeset required, I don't think a lack of MARTL-style exceptions should block our fixing of this bug.
Sounds good. I agree we shouldn't let this create an impossible-to-succeed request. If we wanted to do that, we should throw a DoNotRetryException instead of a throttling exception.
// the next request should be rejected
assertThrows(RpcThrottlingException.class, () -> quota.checkBatchQuota(0, 1));

Thread.sleep(1000);
I think it's preferable not to sleep. Instead usually you can influence the timestamp by adding 1s to it or whatever.
The reason not to sleep is that you're more subject to random Jenkins slowdowns or other problems. We do it in a lot of places of course, but iirc the quota stuff already can do some timestamp manipulation?
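The timestamp-manipulation idea above is what HBase tests typically do via EnvironmentEdgeManager.injectEdge with a ManualEnvironmentEdge. As a self-contained illustration (all class and method names below are hypothetical stand-ins, not HBase's actual API), injecting a manual clock lets the test advance time deterministically instead of sleeping:

```java
// Hypothetical sketch: a manual clock replaces Thread.sleep in quota-style tests.
// Time only moves when the test says so, making the test immune to CI slowdowns.
public class ManualClockSketch {
  /** Minimal stand-in for a ManualEnvironmentEdge-like clock. */
  static final class ManualClock {
    private long now = 0L;
    long currentTime() { return now; }
    void advance(long millis) { now += millis; }
  }

  /** Toy limiter: one refill of 'limit' units per 1000 ms window. */
  static final class WindowLimiter {
    private final ManualClock clock;
    private final long limit;
    private long available;
    private long windowStart;

    WindowLimiter(ManualClock clock, long limit) {
      this.clock = clock;
      this.limit = limit;
      this.available = limit;
      this.windowStart = clock.currentTime();
    }

    boolean tryConsume(long amount) {
      if (clock.currentTime() - windowStart >= 1000) { // window elapsed: refill
        available = limit;
        windowStart = clock.currentTime();
      }
      if (amount > available) {
        return false;
      }
      available -= amount;
      return true;
    }
  }

  public static void main(String[] args) {
    ManualClock clock = new ManualClock();
    WindowLimiter limiter = new WindowLimiter(clock, 10);
    if (!limiter.tryConsume(10)) throw new AssertionError("first batch should fit");
    if (limiter.tryConsume(1)) throw new AssertionError("quota should be exhausted");
    clock.advance(1000); // instead of Thread.sleep(1000)
    if (!limiter.tryConsume(1)) throw new AssertionError("should refill after advancing clock");
    System.out.println("ok");
  }
}
```

The test asserts exactly the same refill behavior, but a slow build machine can no longer turn a 1-second window into a flaky failure.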
Would we have the same problem with write size?
Oh yeah, good point. I bet we would. I can look into a fix there too.
Actually, there's no realistic problem with write size. At least, no more than read size. Both make very naive lowball estimates based on the number of reads/writes: hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/DefaultOperationQuota.java, lines 191 to 203 at 1ffa127.
It's theoretically possible for the estimate to exceed the max IO allowed, though, so maybe it's worth putting limits in place for both the read and write size estimates here? What do you think?
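The fix being discussed, clamping a naive size estimate to the limiter's maximum throughput so a request can never be blocked forever, can be sketched roughly as below. The names are illustrative stand-ins, not the actual DefaultOperationQuota API:

```java
// Hypothetical sketch of clamping per-request size estimates to the limiter's
// max throughput. Without the clamp, an estimate larger than the quota's
// maximum can never be granted, so the request would retry until timeout.
public class EstimateClampSketch {
  /** Naive lowball estimate in the style described above: op count * average op size. */
  static long naiveEstimate(int numOps, long avgOpSize) {
    return (long) numOps * avgOpSize;
  }

  /** Clamped estimate: never ask the limiter for more than it could ever grant. */
  static long clampedEstimate(int numOps, long avgOpSize, long maxThroughput) {
    return Math.min(naiveEstimate(numOps, avgOpSize), maxThroughput);
  }

  public static void main(String[] args) {
    long maxReadBytesPerInterval = 1024;
    // A 10,000-op batch naively estimates far beyond the quota...
    long naive = naiveEstimate(10_000, 100);
    // ...so without clamping it could never be granted; with clamping it can
    // proceed, and the limiter's balance going negative handles the backoff.
    long clamped = clampedEstimate(10_000, 100, maxReadBytesPerInterval);
    if (naive <= maxReadBytesPerInterval) throw new AssertionError("batch should exceed quota");
    if (clamped != maxReadBytesPerInterval) throw new AssertionError("estimate should be clamped");
    System.out.println("ok");
  }
}
```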
I pushed an addition which ensures we don't block requests indefinitely for read, write, or request size estimates.
🎊 +1 overall
This message was automatically generated.
…as (apache#6003) Co-authored-by: Ray Mattingly <rmattingly@hubspot.com> Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org> Signed-off-by: Nick Dimiduk <ndimiduk@apache.org>
https://issues.apache.org/jira/browse/HBASE-28672
At my day job we are trying to implement default quotas for a variety of access patterns. We began by introducing a default read IO limit per-user, per-machine — this has been very successful in reducing hotspots, even on clusters with thousands of distinct users.
While implementing a default writes/second throttle, I realized that doing so would put us in a precarious situation where large enough batches may never succeed. If your batch size is greater than your TimeLimiter's max throughput, then you will always fail in the quota estimation stage; this sticks the client in a backoff-and-retry loop until it times out. Meanwhile, IO estimates are deliberately more optimistic, which can let large requests do targeted oversubscription of an IO quota.
This is okay because the Limiter's availability will go negative and force a longer backoff on subsequent requests. I believe this is preferable UX compared to a doomed throttling loop.
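The oversubscription-then-backoff behavior described above can be sketched as follows. This is a toy model of the assumed semantics, not HBase's actual limiter implementation: a large request is allowed to drive the limiter's availability negative, and subsequent requests stay throttled until refills bring it back above zero.

```java
// Hypothetical sketch: an oversubscribing request pushes the limiter's balance
// negative, which forces a proportionally longer backoff on later requests
// instead of dooming the large request to fail its estimate forever.
public class OversubscribeSketch {
  static final class Limiter {
    private final long refillPerInterval;
    private long available;

    Limiter(long refillPerInterval) {
      this.refillPerInterval = refillPerInterval;
      this.available = refillPerInterval;
    }

    /** A grant may exceed availability; the balance simply goes negative. */
    void grab(long amount) { available -= amount; }

    boolean canExecute(long amount) { return available >= amount; }

    /** One refill interval elapses; balance recovers but never exceeds the cap. */
    void refillOneInterval() {
      available = Math.min(refillPerInterval, available + refillPerInterval);
    }
  }

  public static void main(String[] args) {
    Limiter limiter = new Limiter(100);
    limiter.grab(250); // big request oversubscribes: balance is now -150
    if (limiter.canExecute(1)) throw new AssertionError("should be throttled at -150");
    limiter.refillOneInterval(); // balance -50: still throttled
    if (limiter.canExecute(1)) throw new AssertionError("should still be throttled at -50");
    limiter.refillOneInterval(); // balance 50: backoff is over
    if (!limiter.canExecute(1)) throw new AssertionError("should recover after two refills");
    System.out.println("ok");
  }
}
```

In this toy model the 250-unit request "costs" the user two extra refill intervals of throttling, which is the longer backoff the description refers to.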
In this PR I introduce batch estimation that takes the limiter's max throughput into account, and I've added unit tests to validate the intended behavior.
@ndimiduk @hgromer @bozzkar