storage: add mirror health checking support #800
@sctb512, a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/27397
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
d5a87cd to d1b21ab
@sctb512, the code has been updated, so a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/27453
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
d1b21ab to 8593a35
@sctb512, the code has been updated, so a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/29877
Related to #821 as well.
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
b7cedcd to bed0097
@sctb512, the code has been updated, so a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/29880
e6141e5 to f06200b
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
We also need this feature in stable/v2.1 :-)
storage/src/backend/connection.rs (Outdated)
    // TODO: check mirrors' health
    // Check mirrors' health
    for mirror in connection.mirrors.iter() {
        let conn = connection.clone();
It would be better to extract this for loop into a new function, since the function `new` is already too long.
OK, I added two new functions: `start_proxy_health_thread` and `start_mirror_health_thread`.
Currently, a mirror is marked unavailable once its failure count reaches failure_limit. This commit adds mirror health checking, which recovers unavailable mirror servers. failure_limit is the number of failures at which a mirror is marked unavailable, health_check_interval is the interval at which an unavailable mirror is checked for recovery, and ping_url is the endpoint used to check the mirror server's health.
Signed-off-by: Bin Tang <tangbin.bin@bytedance.com>
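The behavior this commit message describes could be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the `MirrorState` struct, its methods, the signature given here to `start_mirror_health_thread`, and the choice of seconds for `health_check_interval` are all assumptions. The ping is passed in as a closure so the sketch does not depend on any HTTP client.

```rust
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical per-mirror state. Field names follow the commit message;
/// the struct itself is an assumption, not the actual nydus type.
pub struct MirrorState {
    pub ping_url: String,
    pub failure_limit: u32,
    /// Assumed to be in seconds.
    pub health_check_interval: u64,
    failed_times: AtomicU32,
    available: AtomicBool,
}

impl MirrorState {
    pub fn new(ping_url: &str, failure_limit: u32, health_check_interval: u64) -> Self {
        MirrorState {
            ping_url: ping_url.to_string(),
            failure_limit,
            health_check_interval,
            failed_times: AtomicU32::new(0),
            available: AtomicBool::new(true),
        }
    }

    pub fn is_available(&self) -> bool {
        self.available.load(Ordering::Relaxed)
    }

    /// Record one failed request; mark the mirror unavailable once the
    /// failure count reaches `failure_limit`.
    pub fn record_failure(&self) {
        let failures = self
            .failed_times
            .fetch_add(1, Ordering::Relaxed)
            .saturating_add(1);
        if failures >= self.failure_limit {
            self.available.store(false, Ordering::Relaxed);
        }
    }

    /// One health-check round: if the mirror is unavailable and the ping
    /// succeeds, reset the failure count and mark it available again.
    /// Returns the availability after the check.
    pub fn check_once<F: Fn(&str) -> bool>(&self, ping: F) -> bool {
        if !self.is_available() && ping(self.ping_url.as_str()) {
            self.failed_times.store(0, Ordering::Relaxed);
            self.available.store(true, Ordering::Relaxed);
        }
        self.is_available()
    }
}

/// Spawn a background thread that re-checks the mirror every
/// `health_check_interval` seconds (hypothetical signature).
pub fn start_mirror_health_thread<F>(mirror: Arc<MirrorState>, ping: F) -> thread::JoinHandle<()>
where
    F: Fn(&str) -> bool + Send + 'static,
{
    thread::spawn(move || loop {
        mirror.check_once(&ping);
        thread::sleep(Duration::from_secs(mirror.health_check_interval));
    })
}
```

In this sketch the request path calls `record_failure` on each failed mirror request, while the background thread is the only place an unavailable mirror can become available again.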
Forwarding 401 responses to P2P/Dragonfly affects performance. When any mirror has auth_through set to false, we refresh the token regularly to avoid forwarding 401 responses to the mirror.
Signed-off-by: Bin Tang <tangbin.bin@bytedance.com>
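The token-refresh idea in this commit message might look something like the sketch below. Everything here is hypothetical: `MirrorConfig`, `needs_token_refresh`, `TokenCache`, and the refresh interval are illustrative names and parameters, and the token fetch is a caller-supplied closure rather than a real registry request.

```rust
use std::time::{Duration, Instant};

/// Hypothetical mirror configuration; only `auth_through` comes from the
/// commit message.
pub struct MirrorConfig {
    pub host: String,
    pub auth_through: bool,
}

/// Periodic refresh is only needed when at least one mirror does not pass
/// authentication through.
pub fn needs_token_refresh(mirrors: &[MirrorConfig]) -> bool {
    mirrors.iter().any(|m| !m.auth_through)
}

/// Hypothetical cache that re-fetches the token once it is older than
/// `refresh_interval`.
pub struct TokenCache {
    token: Option<String>,
    fetched_at: Option<Instant>,
    refresh_interval: Duration,
}

impl TokenCache {
    pub fn new(refresh_interval: Duration) -> Self {
        TokenCache {
            token: None,
            fetched_at: None,
            refresh_interval,
        }
    }

    pub fn token(&self) -> Option<&str> {
        self.token.as_deref()
    }

    /// Fetch a new token if none is cached or the cached one is due for
    /// refresh. Returns true if the token was replaced. A failed fetch
    /// (None) keeps the old token.
    pub fn refresh_if_due<F>(&mut self, now: Instant, fetch: F) -> bool
    where
        F: FnOnce() -> Option<String>,
    {
        let due = match self.fetched_at {
            None => true,
            Some(t) => now.duration_since(t) >= self.refresh_interval,
        };
        if !due {
            return false;
        }
        match fetch() {
            Some(new_token) => {
                self.token = Some(new_token);
                self.fetched_at = Some(now);
                true
            }
            None => false,
        }
    }
}
```

Refreshing ahead of expiry means requests sent through a mirror already carry a valid token, so the mirror should rarely see a 401 in the first place.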
d97294d to 9f5d136
@sctb512, the code has been updated, so a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/30804
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
Signed-off-by: Bin Tang <tangbin.bin@bytedance.com>
@sctb512, the code has been updated, so a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/30812
20532a6 to 3181bd6
@sctb512, the code has been updated, so a new test job has been submitted. Please wait patiently. Test job URL: https://tone.openanolis.cn/ws/nrh4nnio/test_result/30813
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
@sctb512, the CI test is complete; please check the result: Congratulations, your test job passed!
Currently, a mirror is marked unavailable once its failure count reaches `failure_limit`. This PR adds mirror health checking, which recovers unavailable mirror servers. `failure_limit` is the number of failures at which a mirror is marked unavailable, `health_check_interval` is the interval at which an unavailable mirror is checked for recovery, and `ping_url` is the endpoint used to check the mirror server's health.
This PR addresses #765.
Signed-off-by: Bin Tang <tangbin.bin@bytedance.com>