HBASE-24396 : RetryCounter#sleepUntilNextRetry and ThrottledInputStre… #1765

Closed
wants to merge 1 commit into from

Conversation

virajjasani
Contributor

…am#throttle should use uninterrupted sleep

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 30s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 22s Maven dependency ordering for branch
+1 💚 mvninstall 3m 45s master passed
+1 💚 checkstyle 1m 48s master passed
+1 💚 spotbugs 2m 40s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 13s Maven dependency ordering for patch
+1 💚 mvninstall 3m 17s the patch passed
-0 ⚠️ checkstyle 0m 24s hbase-common: The patch generated 1 new + 5 unchanged - 0 fixed = 6 total (was 5)
-0 ⚠️ checkstyle 1m 5s hbase-server: The patch generated 1 new + 75 unchanged - 0 fixed = 76 total (was 75)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 11m 4s Patch does not cause any errors with Hadoop 3.1.2 3.2.1.
+1 💚 spotbugs 3m 9s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
37m 26s
Subsystem Report/Notes
Docker Client=19.03.9 Server=19.03.9 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1765
Optional Tests dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle
uname Linux e79296a098fb 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / a9fefd7
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-common.txt
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-server.txt
Max. process+thread count 94 (vs. ulimit of 12500)
modules C: hbase-common hbase-server hbase-it U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 29s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 22s Maven dependency ordering for branch
+1 💚 mvninstall 3m 39s master passed
+1 💚 compile 1m 44s master passed
+1 💚 shadedjars 5m 31s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 13s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 17s Maven dependency ordering for patch
+1 💚 mvninstall 3m 24s the patch passed
+1 💚 compile 1m 44s the patch passed
+1 💚 javac 1m 44s the patch passed
+1 💚 shadedjars 5m 39s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 12s the patch passed
_ Other Tests _
+1 💚 unit 1m 20s hbase-common in the patch passed.
+1 💚 unit 132m 19s hbase-server in the patch passed.
+1 💚 unit 1m 11s hbase-it in the patch passed.
162m 41s
Subsystem Report/Notes
Docker Client=19.03.9 Server=19.03.9 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #1765
Optional Tests javac javadoc unit shadedjars compile
uname Linux fbc869a72dd2 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / a9fefd7
Default Java 1.8.0_232
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/testReport/
Max. process+thread count 5250 (vs. ulimit of 12500)
modules C: hbase-common hbase-server hbase-it U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 6s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 19s Maven dependency ordering for branch
+1 💚 mvninstall 4m 40s master passed
+1 💚 compile 2m 3s master passed
+1 💚 shadedjars 6m 25s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 18s hbase-common in master failed.
-0 ⚠️ javadoc 0m 40s hbase-server in master failed.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 13s Maven dependency ordering for patch
+1 💚 mvninstall 4m 31s the patch passed
+1 💚 compile 2m 3s the patch passed
+1 💚 javac 2m 3s the patch passed
+1 💚 shadedjars 6m 24s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 17s hbase-common in the patch failed.
-0 ⚠️ javadoc 0m 39s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 1m 56s hbase-common in the patch passed.
-1 ❌ unit 200m 37s hbase-server in the patch failed.
+1 💚 unit 1m 15s hbase-it in the patch passed.
235m 54s
Subsystem Report/Notes
Docker Client=19.03.9 Server=19.03.9 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #1765
Optional Tests javac javadoc unit shadedjars compile
uname Linux 7b1e653a7dd5 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / a9fefd7
Default Java 2020-01-14
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/testReport/
Max. process+thread count 2557 (vs. ulimit of 12500)
modules C: hbase-common hbase-server hbase-it U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1765/1/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

} catch (InterruptedException e) {
throw new InterruptedIOException("Thread aborted");
}
Uninterruptibles.sleepUninterruptibly(sleepTime, TimeUnit.MILLISECONDS);
Contributor

Do we need this Jira? With the current code, if this thread is interrupted during the throttle sleep, it comes out of the sleep by throwing an IOException. A call to sleepUninterruptibly, however, makes sure the thread stays asleep for the full duration. The interrupt might be a genuine request to stop the running thread. With this change, the thread will keep executing even after it has been interrupted!

Contributor Author

Same here, throttle()'s purpose should ideally be to throttle without any interruption.
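
To make the disagreement in this thread concrete, here is a minimal sketch of the two sleep behaviors being compared. It does not use HBase or Guava: `uninterruptibleSleep` is a hand-written, hypothetical stand-in that follows the usual "keep sleeping, restore the interrupt flag afterwards" pattern, and the class and method names are assumptions made for illustration only.

```java
import java.util.concurrent.TimeUnit;

final class SleepSemantics {

  /** Interruptible sleep: returns false as soon as the thread is interrupted. */
  static boolean interruptibleSleep(long millis) {
    try {
      Thread.sleep(millis);
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the flag visible to callers
      return false;
    }
  }

  /**
   * Hypothetical stand-in for an uninterruptible sleep: an interrupt is
   * remembered, the remaining time is still slept, and the flag is
   * restored only once the full duration has elapsed.
   */
  static void uninterruptibleSleep(long millis) {
    boolean interrupted = false;
    try {
      long remainingNanos = TimeUnit.MILLISECONDS.toNanos(millis);
      long end = System.nanoTime() + remainingNanos;
      while (true) {
        try {
          TimeUnit.NANOSECONDS.sleep(remainingNanos); // no-op if <= 0
          return;
        } catch (InterruptedException e) {
          interrupted = true;
          remainingNanos = end - System.nanoTime();
        }
      }
    } finally {
      if (interrupted) {
        Thread.currentThread().interrupt();
      }
    }
  }

  /** Runs the sleeper on a worker thread, interrupts it after ~50 ms, returns elapsed millis. */
  static long elapsedWhenInterrupted(Runnable sleeper) {
    long start = System.nanoTime();
    Thread worker = new Thread(sleeper);
    worker.start();
    try {
      Thread.sleep(50);
      worker.interrupt();
      worker.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException("demo thread was interrupted", e);
    }
    return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
  }

  public static void main(String[] args) {
    long fast = elapsedWhenInterrupted(() -> interruptibleSleep(1000));
    long slow = elapsedWhenInterrupted(() -> uninterruptibleSleep(1000));
    System.out.println("interruptible sleep ended after ~" + fast + " ms");
    System.out.println("uninterruptible sleep ended after ~" + slow + " ms");
  }
}
```

In this sketch the interrupted worker returns almost immediately in the first case and only after the full second in the second case, which is the behavior change the reviewer is objecting to.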

} catch (InterruptedException e) {
throw new RuntimeException(e);
}
retryCounter.sleepUntilNextRetry();
Contributor

Here also we are changing the behavior. Previously this threw an RTE when interrupted, but that is now changed. The main thing is that, even if interrupted, the RetryCounter will make sure the thread sleeps for the specified time (which might not really be wanted sometimes).

Contributor Author

I believe retryCounter.sleepUntilNextRetry() should sleep uninterruptibly, because RetryCounter is mainly used for retries with sleeps and retries with different backoff policies. In such scenarios RetryCounter, being a library, should ideally not throw InterruptedException even if the sleep is interrupted, because clients are retrying in order to achieve certain tasks.

@virajjasani
Contributor Author

virajjasani commented May 23, 2020

Both RetryCounter and ThrottledInputStream are IA.Private and are used by clients to retry/throttle with sleep. IMHO, the sleep used internally by these libraries should be uninterruptible, and clients should not have to worry about handling InterruptedException, because clients need smooth retries with backoff.

@@ -130,11 +130,7 @@ void triggerFlushInPrimaryRegion(final HRegion region) throws IOException {
ServerRegionReplicaUtil.getRegionInfoForDefaultReplica(region.getRegionInfo())
.getRegionNameAsString(),
region.getRegionInfo().getRegionNameAsString(), counter.getAttemptTimes(), e);
try {
Contributor

I believe you did not get what I was saying.
This is the triggerFlushInPrimaryRegion() method, and you can see that the call happens inside a while loop. That while loop is meant to terminate once this RS is set to be stopped/aborted. When that happens, there might be many non-daemon threads running within the server. Our logic in different places will interrupt these threads, so a thread that is sleeping or waiting gets an InterruptedException and does NOT continue in the running/waiting state. The logic should check the server status and come out of loops etc.
But your change makes it so that even if the main thread interrupts this thread, it continues to sleep for the specified time. That is totally against our intent.
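
The loop shape described above can be sketched roughly as follows. This is not the HBase code: `InterruptAwareRetry`, `Attempt`, and the `stopped` flag are hypothetical names standing in for the flush attempt and the server stopped/aborted check, and the sleep is a plain `Thread.sleep` rather than RetryCounter.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class InterruptAwareRetry {

  /** Hypothetical unit of work; returns true when it succeeded. */
  interface Attempt {
    boolean run();
  }

  /**
   * Retries until the attempt succeeds, the stop flag is set, the attempt
   * budget is used up, or the thread is interrupted while sleeping.
   * Returns the number of attempts that were made.
   */
  static int retryUntilStopped(Attempt attempt, AtomicBoolean stopped,
      int maxAttempts, long sleepMillis) {
    int attempts = 0;
    while (!stopped.get() && attempts < maxAttempts) {
      attempts++;
      if (attempt.run()) {
        break;
      }
      try {
        Thread.sleep(sleepMillis); // interruptible on purpose
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the signal
        break; // leave the loop instead of sleeping on
      }
    }
    return attempts;
  }

  /** Starts a worker that always fails, interrupts it, and reports its attempt count. */
  static int attemptsBeforeInterrupt() {
    AtomicBoolean stopped = new AtomicBoolean(false);
    AtomicInteger result = new AtomicInteger(-1);
    Thread worker = new Thread(
        () -> result.set(retryUntilStopped(() -> false, stopped, 100, 10_000)));
    worker.start();
    try {
      Thread.sleep(100);
      worker.interrupt();
      worker.join();
    } catch (InterruptedException e) {
      throw new IllegalStateException("demo thread was interrupted", e);
    }
    return result.get();
  }

  public static void main(String[] args) {
    System.out.println("attempts before exit: " + attemptsBeforeInterrupt());
  }
}
```

With an interruptible sleep the worker leaves the loop shortly after the interrupt; with an uninterruptible sleep it would stay parked for the full backoff before it could even re-check the stop flag.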

Contributor Author

@virajjasani virajjasani May 24, 2020

OK, got it. Yes, this example makes sense. Are you saying that, similar to this one, all the other places also need to handle interruptions, and that there is no need for uninterrupted sleep during client-side retries?

Contributor

I did not check all places. Most of the places where RetryCounter is used are in tests, as far as I can see. But we should not change RetryCounter itself; tomorrow some other code path might use it too. IMO we can just keep the code as is and let the caller handle the InterruptedException the way it wants. We cannot generalize it. So just close this Jira, Viraj.
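
As a rough illustration of "let the caller decide", the sketch below keeps the retry sleep interruptible and shows two callers reacting differently. `sleepBeforeRetry` is a hypothetical stand-in, not HBase's RetryCounter, and both caller methods are assumptions; the first mirrors the `InterruptedIOException` handling visible in the diff context above.

```java
import java.io.IOException;
import java.io.InterruptedIOException;

final class CallerHandlesInterrupt {

  /** Hypothetical retry helper that leaves interruption to its callers. */
  static void sleepBeforeRetry(long millis) throws InterruptedException {
    Thread.sleep(millis);
  }

  /** An I/O-style caller surfaces the interrupt as an InterruptedIOException. */
  static void ioCaller(long millis) throws IOException {
    try {
      sleepBeforeRetry(millis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // keep the flag for code further up
      InterruptedIOException wrapped =
          new InterruptedIOException("Interrupted while waiting to retry");
      wrapped.initCause(e);
      throw wrapped;
    }
  }

  /** A background-loop caller returns false so its loop can exit quietly. */
  static boolean loopCaller(long millis) {
    try {
      sleepBeforeRetry(millis);
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    Thread.currentThread().interrupt(); // simulate a pending stop request
    System.out.println("loop should continue: " + loopCaller(1000));
  }
}
```

Neither policy could be chosen by the caller if the helper swallowed the interrupt and slept on, which is the point made in the comment above.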

Contributor Author

OK, I just saw that many of these places execute asynchronously in threads separate from the main thread, and if this is how interruptions were planned to be handled, that's fine. No need to make the change.

@virajjasani virajjasani deleted the HBASE-24396-master branch May 24, 2020 07:47