-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-23213 : Reopen regions with very high Store Ref Counts(backport… #761
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
e52c85e
to
88232a9
Compare
💔 -1 overall
This message was automatically generated. |
TestAdmin1 is passing locally
|
This is branch-1 backport PR. |
/** | ||
* @return map of the names of region servers on the live list with associated ServerLoad | ||
*/ | ||
public Map<ServerName, ServerLoad> getLiveServersLoad() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additive change to Public interface is fine. We are already also changing RegionLoad.
* The max number of references active on single store file among all store files | ||
* that belong to given region | ||
*/ | ||
optional int32 max_store_file_ref_count = 22 [default = 0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked branch-2 and master proto def, this is consistent with them, good
+1 from me, merging soon. Shout if you want to stop it. |
I am not very comfortable having this test failing here: This test is not part of our branch-1 flaky tests list:https://builds.apache.org/view/H-L/view/HBase/job/HBase-Find-Flaky-Tests/job/branch-1/lastSuccessfulBuild/artifact/dashboard.html#job_2 And having value 2 here meaning the region merge failed. |
Ok @xcangCRM , holding off |
So the question is this test unstable before this change or caused by this change. @virajjasani if you don't have time to run TestAdmin1 at head of branch-1 in a loop 20 times or so to check that all runs are green before your change (or not), I'll pick it up tomorrow. |
@apurtell @xcangCRM With and without this patch applied, the results remain same:
Will try more runs by EOD and check if any discrepancy is found between branch-1 and this patch. |
ok so now I just ran TestAdmin1.testMergeRegions 50 times in loop with this patch and it went smooth without any failure. |
I'll do the same. If we can't correlate the failure, I will proceed. |
Sounds good to me! thank you. @virajjasani @apurtell |
On my new dev laptop testing TestAdmin1 is problematic:
This happens before applying the patch here. I'm going to bisect to find where this was introduced. Edit: This looks like an old problem with JVMs doing weird things with binding to localhost when on VPN on MacOS. The Azul JVM may be at issue. I will test on a linux host to work around this. |
25 iterations with this patch look good. @xcangCRM if you continue to see instability with TestAdmin1 let's open a JIRA, and please post logs from failed runs there. |
… HBASE-22460)