HBASE-23213 : Reopen regions with very high Store Ref Counts(backport… #761

Merged: 1 commit into apache:branch-1, Oct 29, 2019

virajjasani
Contributor

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
💙 reexec 1m 19s Docker mode activated.
_ Prechecks _
💚 dupname 0m 0s No case conflicting files found.
💙 prototool 0m 0s prototool was not available.
💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
💚 @author 0m 0s The patch does not contain any @author tags.
💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
💙 mvndep 1m 23s Maven dependency ordering for branch
💚 mvninstall 7m 29s branch-1 passed
💚 compile 1m 39s branch-1 passed with JDK v1.8.0_232
💚 compile 1m 46s branch-1 passed with JDK v1.7.0_242
💚 checkstyle 10m 59s branch-1 passed
💙 refguide 3m 24s branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
💚 shadedjars 2m 50s branch has no errors when building our shaded downstream artifacts.
💚 javadoc 3m 45s branch-1 passed with JDK v1.8.0_232
💚 javadoc 5m 48s branch-1 passed with JDK v1.7.0_242
💙 spotbugs 2m 28s Used deprecated FindBugs config; considering switching to SpotBugs.
💚 findbugs 18m 2s branch-1 passed
_ Patch Compile Tests _
💙 mvndep 0m 17s Maven dependency ordering for patch
💚 mvninstall 1m 58s the patch passed
💚 compile 1m 42s the patch passed with JDK v1.8.0_232
💚 cc 1m 42s the patch passed
💚 javac 1m 42s the patch passed
💚 compile 1m 46s the patch passed with JDK v1.7.0_242
💚 cc 1m 46s the patch passed
💚 javac 1m 46s the patch passed
💚 checkstyle 10m 54s the patch passed
💚 whitespace 0m 0s The patch has no whitespace issues.
💔 xml 0m 0s The patch has 1 ill-formed XML file(s).
💙 refguide 3m 0s patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
💚 shadedjars 2m 44s patch has no errors when building our shaded downstream artifacts.
💚 hadoopcheck 4m 59s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
💚 hbaseprotoc 4m 24s the patch passed
💚 javadoc 3m 52s the patch passed with JDK v1.8.0_232
💚 javadoc 5m 55s the patch passed with JDK v1.7.0_242
💔 findbugs 2m 55s hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
💔 findbugs 10m 29s root generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
_ Other Tests _
💔 unit 166m 43s root in the patch failed.
💚 asflicense 2m 50s The patch does not generate ASF License warnings.
295m 4s
Reason Tests
XML Parsing Error(s):
hbase-common/src/main/resources/hbase-default.xml
FindBugs module:hbase-server
org.apache.hadoop.hbase.master.RegionsRecoveryChore.chore() makes inefficient use of keySet iterator instead of entrySet iterator At RegionsRecoveryChore.java:keySet iterator instead of entrySet iterator At RegionsRecoveryChore.java:[line 104]
FindBugs module:root
org.apache.hadoop.hbase.master.RegionsRecoveryChore.chore() makes inefficient use of keySet iterator instead of entrySet iterator At RegionsRecoveryChore.java:keySet iterator instead of entrySet iterator At RegionsRecoveryChore.java:[line 104]
Failed junit tests hadoop.hbase.client.TestAdmin1
hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters
Subsystem Report/Notes
Docker Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/Dockerfile
GITHUB PR #761
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml cc hbaseprotoc prototool
uname Linux 8a4938041612 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-761/out/precommit/personality/provided.sh
git revision branch-1 / db2ce23
Default Java 1.7.0_242
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:1.8.0_232 /usr/lib/jvm/zulu-7-amd64:1.7.0_242
refguide https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/branch-site/book.html
xml https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/xml.txt
refguide https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/patch-site/book.html
findbugs https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/new-findbugs-hbase-server.html
findbugs https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/new-findbugs-root.html
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/artifact/out/patch-unit-root.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/testReport/
Max. process+thread count 4455 (vs. ulimit of 10000)
modules C: hbase-protocol hbase-common hbase-client hbase-hadoop-compat hbase-hadoop2-compat hbase-server . U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/1/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.11.0 https://yetus.apache.org

This message was automatically generated.
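
The FindBugs warning in the report above (keySet iterator instead of entrySet iterator, RegionsRecoveryChore.java line 104) flags a loop that walks a map's keys and then looks each value up again. The patch's code is not shown in this thread, so the following is only a minimal Java sketch of the general pattern and its usual fix; the class name, method names, and sample data are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the code from RegionsRecoveryChore.
class EntrySetIterationSketch {

  // Pattern FindBugs flags: iterate keySet(), then call get(key) per key,
  // which costs one extra map lookup for every entry.
  static List<String> overThresholdKeySet(Map<String, Integer> refCounts, int threshold) {
    List<String> result = new ArrayList<String>();
    for (String region : refCounts.keySet()) {
      int count = refCounts.get(region);
      if (count > threshold) {
        result.add(region);
      }
    }
    return result;
  }

  // Equivalent loop over entrySet(): key and value come from the same entry.
  static List<String> overThresholdEntrySet(Map<String, Integer> refCounts, int threshold) {
    List<String> result = new ArrayList<String>();
    for (Map.Entry<String, Integer> entry : refCounts.entrySet()) {
      if (entry.getValue() > threshold) {
        result.add(entry.getKey());
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Made-up region names and reference counts.
    Map<String, Integer> refCounts = new LinkedHashMap<String, Integer>();
    refCounts.put("regionA", 3);
    refCounts.put("regionB", 400);
    refCounts.put("regionC", 12);
    System.out.println(overThresholdKeySet(refCounts, 100));
    System.out.println(overThresholdEntrySet(refCounts, 100));
  }
}
```

Both methods select the same entries; the entrySet() version simply avoids the second lookup per key.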

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
💙 reexec 1m 24s Docker mode activated.
_ Prechecks _
💚 dupname 0m 1s No case conflicting files found.
💙 prototool 0m 0s prototool was not available.
💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
💚 @author 0m 0s The patch does not contain any @author tags.
💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
💙 mvndep 1m 21s Maven dependency ordering for branch
💚 mvninstall 7m 39s branch-1 passed
💚 compile 1m 46s branch-1 passed with JDK v1.8.0_232
💚 compile 1m 54s branch-1 passed with JDK v1.7.0_242
💚 checkstyle 11m 40s branch-1 passed
💙 refguide 3m 37s branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
💚 shadedjars 2m 54s branch has no errors when building our shaded downstream artifacts.
💚 javadoc 3m 56s branch-1 passed with JDK v1.8.0_232
💚 javadoc 6m 13s branch-1 passed with JDK v1.7.0_242
💙 spotbugs 2m 40s Used deprecated FindBugs config; considering switching to SpotBugs.
💚 findbugs 19m 24s branch-1 passed
_ Patch Compile Tests _
💙 mvndep 0m 17s Maven dependency ordering for patch
💚 mvninstall 2m 5s the patch passed
💚 compile 1m 46s the patch passed with JDK v1.8.0_232
💚 cc 1m 46s the patch passed
💚 javac 1m 46s the patch passed
💚 compile 1m 50s the patch passed with JDK v1.7.0_242
💚 cc 1m 50s the patch passed
💚 javac 1m 50s the patch passed
💚 checkstyle 12m 36s the patch passed
💚 whitespace 0m 0s The patch has no whitespace issues.
💔 xml 0m 0s The patch has 1 ill-formed XML file(s).
💙 refguide 3m 53s patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
💚 shadedjars 3m 21s patch has no errors when building our shaded downstream artifacts.
💚 hadoopcheck 5m 57s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
💚 hbaseprotoc 4m 58s the patch passed
💚 javadoc 4m 14s the patch passed with JDK v1.8.0_232
💚 javadoc 6m 54s the patch passed with JDK v1.7.0_242
💚 findbugs 23m 11s the patch passed
_ Other Tests _
💔 unit 224m 34s root in the patch failed.
💚 asflicense 3m 0s The patch does not generate ASF License warnings.
366m 55s
Reason Tests
XML Parsing Error(s):
hbase-common/src/main/resources/hbase-default.xml
Failed junit tests hadoop.hbase.master.normalizer.TestSimpleRegionNormalizerOnCluster
hadoop.hbase.client.TestAdmin1
hadoop.hbase.client.TestReplicaWithCluster
hadoop.hbase.client.replication.TestReplicationAdminWithTwoDifferentZKClusters
hadoop.hbase.master.TestMasterBalanceThrottling
Subsystem Report/Notes
Docker Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/Dockerfile
GITHUB PR #761
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml cc hbaseprotoc prototool
uname Linux ab9410c5fcf9 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-761/out/precommit/personality/provided.sh
git revision branch-1 / db2ce23
Default Java 1.7.0_242
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:1.8.0_232 /usr/lib/jvm/zulu-7-amd64:1.7.0_242
refguide https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/branch-site/book.html
xml https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/xml.txt
refguide https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/patch-site/book.html
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/artifact/out/patch-unit-root.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/testReport/
Max. process+thread count 4388 (vs. ulimit of 10000)
modules C: hbase-protocol hbase-common hbase-client hbase-hadoop-compat hbase-hadoop2-compat hbase-server . U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/2/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.11.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
💙 reexec 2m 15s Docker mode activated.
_ Prechecks _
💚 dupname 0m 0s No case conflicting files found.
💙 prototool 0m 1s prototool was not available.
💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
💚 @author 0m 0s The patch does not contain any @author tags.
💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ branch-1 Compile Tests _
💙 mvndep 1m 24s Maven dependency ordering for branch
💚 mvninstall 7m 56s branch-1 passed
💚 compile 2m 6s branch-1 passed with JDK v1.8.0_232
💚 compile 2m 22s branch-1 passed with JDK v1.7.0_242
💚 checkstyle 12m 28s branch-1 passed
💙 refguide 3m 29s branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
💚 shadedjars 3m 9s branch has no errors when building our shaded downstream artifacts.
💚 javadoc 3m 58s branch-1 passed with JDK v1.8.0_232
💚 javadoc 6m 12s branch-1 passed with JDK v1.7.0_242
💙 spotbugs 2m 41s Used deprecated FindBugs config; considering switching to SpotBugs.
💚 findbugs 19m 22s branch-1 passed
_ Patch Compile Tests _
💙 mvndep 0m 16s Maven dependency ordering for patch
💚 mvninstall 2m 7s the patch passed
💚 compile 1m 44s the patch passed with JDK v1.8.0_232
💚 cc 1m 44s the patch passed
💚 javac 1m 44s the patch passed
💚 compile 1m 52s the patch passed with JDK v1.7.0_242
💚 cc 1m 52s the patch passed
💚 javac 1m 52s the patch passed
💚 checkstyle 11m 13s the patch passed
💚 whitespace 0m 0s The patch has no whitespace issues.
💔 xml 0m 1s The patch has 1 ill-formed XML file(s).
💙 refguide 3m 2s patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect.
💚 shadedjars 2m 58s patch has no errors when building our shaded downstream artifacts.
💚 hadoopcheck 5m 20s Patch does not cause any errors with Hadoop 2.8.5 2.9.2.
💚 hbaseprotoc 4m 27s the patch passed
💚 javadoc 3m 51s the patch passed with JDK v1.8.0_232
💚 javadoc 6m 3s the patch passed with JDK v1.7.0_242
💚 findbugs 20m 38s the patch passed
_ Other Tests _
💔 unit 175m 20s root in the patch failed.
💚 asflicense 2m 49s The patch does not generate ASF License warnings.
312m 20s
Reason Tests
XML Parsing Error(s):
hbase-common/src/main/resources/hbase-default.xml
Failed junit tests hadoop.hbase.client.TestAdmin1
Subsystem Report/Notes
Docker Client=19.03.4 Server=19.03.4 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/artifact/out/Dockerfile
GITHUB PR #761
Optional Tests dupname asflicense javac javadoc unit spotbugs findbugs shadedjars hadoopcheck hbaseanti checkstyle compile refguide xml cc hbaseprotoc prototool
uname Linux 73bf6d6cd2e1 4.15.0-65-generic #74-Ubuntu SMP Tue Sep 17 17:06:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality /home/jenkins/jenkins-slave/workspace/HBase-PreCommit-GitHub-PR_PR-761/out/precommit/personality/provided.sh
git revision branch-1 / db2ce23
Default Java 1.7.0_242
Multi-JDK versions /usr/lib/jvm/zulu-8-amd64:1.8.0_232 /usr/lib/jvm/zulu-7-amd64:1.7.0_242
refguide https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/artifact/out/branch-site/book.html
xml https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/artifact/out/xml.txt
refguide https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/artifact/out/patch-site/book.html
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/artifact/out/patch-unit-root.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/testReport/
Max. process+thread count 4423 (vs. ulimit of 10000)
modules C: hbase-protocol hbase-common hbase-client hbase-hadoop-compat hbase-hadoop2-compat hbase-server . U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-761/3/console
versions git=1.9.1 maven=3.0.5 findbugs=3.0.1
Powered by Apache Yetus 0.11.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani
Contributor Author

TestAdmin1 passes locally:

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hbase.client.TestAdmin1
[INFO] Tests run: 29, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 152.323 s - in org.apache.hadoop.hbase.client.TestAdmin1
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 29, Failures: 0, Errors: 0, Skipped: 0
[INFO] 

@virajjasani
Contributor Author

This is the branch-1 backport PR.
Please review, @anoopsjohn @apurtell.

/**
* @return map of the names of region servers on the live list with associated ServerLoad
*/
public Map<ServerName, ServerLoad> getLiveServersLoad() {
Contributor

An additive change to a Public interface is fine. We are also already changing RegionLoad.

* The max number of references active on single store file among all store files
* that belong to given region
*/
optional int32 max_store_file_ref_count = 22 [default = 0];
Contributor

Checked the branch-2 and master proto definitions; this is consistent with them. Good.

@apurtell
Contributor

+1 from me, merging soon. Shout if you want to stop it.

@xcangCRM
Contributor

xcangCRM commented Oct 28, 2019

I am not very comfortable with this test failing here:
org.apache.hadoop.hbase.client.TestAdmin1.testMergeRegions
Error Details
expected:<1> but was:<2>
Stack Trace
java.lang.AssertionError: expected:<1> but was:<2>
at org.apache.hadoop.hbase.client.TestAdmin1.testMergeRegions(TestAdmin1.java:1489)
Standard Output
Standard Error

This test is not part of our branch-1 flaky tests list: https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html#job_2

A value of 2 here means the region merge failed.

@apurtell
Contributor

OK @xcangCRM, holding off.

@apurtell
Contributor

So the question is whether this test was already unstable before this change or the failure is caused by this change. @virajjasani, if you don't have time to run TestAdmin1 at the head of branch-1 in a loop 20 times or so to check whether all runs are green before your change, I'll pick it up tomorrow.

@virajjasani
Contributor Author

virajjasani commented Oct 29, 2019

@apurtell @xcangCRM
I just tried a quick test.

With and without this patch applied, the results are the same:

  1. ran TestAdmin1.testMergeRegions 5 times in a loop: successful
  2. ran TestAdmin1.testMergeRegions 10 times in a loop: successful
  3. ran TestAdmin1.testMergeRegions 25 times in a loop: fails

I will try more runs by EOD and check whether any discrepancy shows up between branch-1 and this patch.
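
One way to read mixed results like these: if a test fails intermittently at some fixed rate regardless of the patch, short loops will usually be green and longer loops will sometimes fail. The following is a minimal Java sketch of that arithmetic, assuming independent runs and a purely hypothetical 3% per-run failure rate; neither assumption is measured anywhere in this thread.

```java
// Hypothetical illustration of loop results for an intermittently failing test.
class FlakyLoopSketch {

  // Probability that every one of `runs` independent runs passes when each
  // run fails with probability `perRunFailureRate`.
  static double allGreenProbability(double perRunFailureRate, int runs) {
    return Math.pow(1.0 - perRunFailureRate, runs);
  }

  public static void main(String[] args) {
    double hypotheticalFailureRate = 0.03; // assumed value, not a measurement
    int[] loopSizes = {5, 10, 25, 50};     // loop sizes mentioned in this thread
    for (int runs : loopSizes) {
      System.out.printf("%d runs: P(all green) = %.2f%n",
          runs, allGreenProbability(hypotheticalFailureRate, runs));
    }
  }
}
```

Under that assumption, a failing 25-run loop and a clean longer loop can both occur by chance, so comparing loops with and without the patch says more than any single run.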

@virajjasani
Contributor Author

OK, I just ran TestAdmin1.testMergeRegions 50 times in a loop with this patch, and it completed without any failure.

@apurtell
Contributor

I'll do the same. If we can't correlate the failure, I will proceed.

@xcangCRM
Contributor

Sounds good to me! Thank you, @virajjasani @apurtell.

@apurtell
Contributor

apurtell commented Oct 29, 2019

On my new dev laptop, testing TestAdmin1 is problematic:

[INFO] Running org.apache.hadoop.hbase.client.TestAdmin1
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 4.365 s <<< FAILURE! - in org.apache.hadoop.hbase.client.TestAdmin1
[ERROR] org.apache.hadoop.hbase.client.TestAdmin1  Time elapsed: 4.362 s  <<< ERROR!
java.io.IOException: Shutting down
at org.apache.hadoop.hbase.client.TestAdmin1.setUpBeforeClass(TestAdmin1.java:101)
Caused by: java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMasterCan't assign requested address
at org.apache.hadoop.hbase.client.TestAdmin1.setUpBeforeClass(TestAdmin1.java:101)
Caused by: java.io.IOException: Problem binding to /10.29.170.32:0 : Can't assign requested address. To switch ports use the 'hbase.master.port' configuration property.
at org.apache.hadoop.hbase.client.TestAdmin1.setUpBeforeClass(TestAdmin1.java:101)
Caused by: java.net.BindException: Can't assign requested address
at org.apache.hadoop.hbase.client.TestAdmin1.setUpBeforeClass(TestAdmin1.java:101)

This happens before applying the patch here. I'm going to bisect to find where this was introduced.

Edit: This looks like an old problem with JVMs doing weird things with binding to localhost when on VPN on macOS. The Azul JVM may be at issue. I will test on a Linux host to work around this.

@apurtell
Contributor

25 iterations with this patch look good. @xcangCRM, if you continue to see instability with TestAdmin1, let's open a JIRA, and please post logs from failed runs there.

@apurtell apurtell merged commit 5e414f2 into apache:branch-1 Oct 29, 2019