fix: make pika compatible with redis-sentinel #2854
1. make pika support redis sentinel 2. support client kill type pubsub/normal 3. ensure the fd is removed from epoll if the server wants to close the fd
fix exit process: 1. ensure the network thread (Dispatcher) can be stopped in time 2. ensure all queued async WriteDB tasks are done before exit
Walkthrough
The recent updates enhance the Pika server's functionality by introducing new methods for managing client connections and improving thread operations. Key changes include expanded command handling for diverse client termination types, refined connection management with new signaling methods, and enhanced resource cleanup processes. These improvements aim to create a more robust and efficient operational environment, facilitating smoother execution and management of server commands.
Sequence Diagram(s)
sequenceDiagram
participant ClientCmd
participant Server
participant PikaDispatchThread
participant PikaPubSubThread
ClientCmd->>Server: Issue "kill" command
Server->>PikaDispatchThread: Process ClientKillAll()
PikaDispatchThread->>PikaPubSubThread: NotifyCloseAllConns()
PikaDispatchThread-->>Server: Confirmation of closure
Note over ClientCmd, Server: Client connections are now managed and terminated gracefully.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (11)
- include/pika_admin.h (2 hunks)
- include/pika_dispatch_thread.h (1 hunks)
- include/pika_repl_bgworker.h (1 hunks)
- include/pika_server.h (1 hunks)
- src/net/include/net_pubsub.h (1 hunks)
- src/net/src/net_pubsub.cc (3 hunks)
- src/net/src/worker_thread.cc (2 hunks)
- src/pika_admin.cc (5 hunks)
- src/pika_dispatch_thread.cc (1 hunks)
- src/pika_repl_client.cc (1 hunks)
- src/pika_server.cc (2 hunks)
Additional comments not posted (23)
include/pika_repl_bgworker.h (1)
27-31: New method `TaskQueueSize` looks good!
The method provides a useful utility for retrieving the total size of the task queue by leveraging existing functionality. This addition enhances the class's interface without introducing any breaking changes.
include/pika_dispatch_thread.h (1)
17-17: New method `StopThread` is a valuable addition!
The addition of the `StopThread` method enhances the class's thread management capabilities, providing improved control over the thread lifecycle.
src/pika_dispatch_thread.cc (1)
62-64: Implementation of `StopThread` is well done!
The method correctly invokes `StopThread` on the `thread_rep_` member, enhancing thread management. This aligns with the goal of improving resource handling and operational control.
src/net/include/net_pubsub.h (3)
93-93: Atomic boolean `close_all_conn_sig_` is well-suited for signaling.
The use of an atomic boolean for signaling connection closures is appropriate for thread-safe operations. Ensure it is consistently checked and updated across the codebase.
85-85: Ensure proper handling of connection closures in `CloseAllConns`.
The `CloseAllConns` method should ensure all connections are safely closed. Verify that it integrates well with the rest of the connection management logic.
Verification successful
`CloseAllConns` method handles connection closures properly.
The implementation of `CloseAllConns` in `src/net/src/net_pubsub.cc` ensures that all connections are safely closed with appropriate synchronization and resource cleanup. This method integrates well with the connection management logic.
- `src/net/src/net_pubsub.cc`: Implementation of `CloseAllConns` with synchronization and cleanup logic.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implementation of `CloseAllConns` to ensure all connections are safely closed.
# Test: Search for the method implementation. Expect: Proper connection closure logic.
ast-grep --lang cpp --pattern $'void CloseAllConns() { $$$ }'
Length of output: 64
Script:
#!/bin/bash
# Search for the definition and usage of `CloseAllConns` in the codebase.
rg 'CloseAllConns' -A 10
Length of output: 3111
80-80: Ensure thread-safety when notifying to close all connections.
The method `NotifyToCloseAllConns` likely sets `close_all_conn_sig_` to signal closure. Ensure this method is used in a thread-safe manner, given the concurrent nature of the application.
Verification successful
Thread-safety of `NotifyToCloseAllConns` is ensured.
The `NotifyToCloseAllConns` method uses an atomic operation to set `close_all_conn_sig_` to `true`, ensuring thread-safety. No additional synchronization is needed in its current usage.
- `src/net/src/net_pubsub.cc`: Implementation of `NotifyToCloseAllConns` uses `close_all_conn_sig_.store(true)`.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify usage of `NotifyToCloseAllConns` to ensure thread-safety.
# Test: Search for the method usage. Expect: Proper synchronization mechanisms in place.
rg --type cpp $'NotifyToCloseAllConns'
Length of output: 303
Script:
#!/bin/bash
# Description: Retrieve the implementation of `NotifyToCloseAllConns` to check for thread-safety mechanisms.
ast-grep --lang cpp --pattern 'void PubSubThread::NotifyToCloseAllConns() { $$$ }'
# Description: Retrieve the context around the usage of `NotifyToCloseAllConns` in `pika_server.cc`.
rg --type cpp -A 5 -B 5 $'NotifyToCloseAllConns' src/pika_server.cc
Length of output: 496
Script:
#!/bin/bash
# Description: Search for the implementation of `NotifyToCloseAllConns` in `net_pubsub.cc`.
rg --type cpp -A 10 'void PubSubThread::NotifyToCloseAllConns()' src/net/src/net_pubsub.cc
Length of output: 195
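The signaling pattern discussed in the comments above (another thread sets an atomic flag, and the pub/sub thread notices it in its own loop and closes its connections) can be sketched roughly as follows. This is a minimal illustration and not Pika's actual code: the class `PubSubLoopSketch` and the methods `RunOnce`, `AddConn` and `ConnCount` are hypothetical names, and only `close_all_conn_sig_`, `NotifyToCloseAllConns` and `CloseAllConns` are taken from the review text.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in for a pub/sub thread; not Pika's real PubSubThread.
class PubSubLoopSketch {
 public:
  void AddConn(int fd, const std::string& peer) {
    assert(fd >= 0);
    std::lock_guard<std::mutex> guard(mu_);
    conns_[fd] = peer;
  }

  // May be called from any thread: it only flips the flag and never
  // touches the connection table directly.
  void NotifyToCloseAllConns() { close_all_conn_sig_.store(true); }

  // One iteration of the event loop, meant to run on the owning thread.
  // Returns the number of connections closed in this iteration.
  std::size_t RunOnce() {
    std::size_t closed = 0;
    // exchange() reads and clears the flag in one atomic step, so a request
    // that arrives during cleanup is kept for the next iteration.
    if (close_all_conn_sig_.exchange(false)) {
      closed = CloseAllConns();
    }
    // ... normal event handling would follow here ...
    return closed;
  }

  std::size_t ConnCount() {
    std::lock_guard<std::mutex> guard(mu_);
    return conns_.size();
  }

 private:
  std::size_t CloseAllConns() {
    std::lock_guard<std::mutex> guard(mu_);
    std::size_t n = conns_.size();
    conns_.clear();  // a real server would also close() each fd here
    return n;
  }

  std::mutex mu_;
  std::map<int, std::string> conns_;
  std::atomic<bool> close_all_conn_sig_{false};
};
```

Keeping the actual teardown on the owning thread means the caller of `NotifyToCloseAllConns` never has to take the connection lock; the cost is that closure only happens on the next loop iteration.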
src/net/src/worker_thread.cc (2)
275-277: Ensure proper resource cleanup for closed connections.
The addition of `net_multiplexer_->NetDelEvent(conn->fd(), 0);` ensures that network events are deregistered for closed connections. This is crucial for preventing resource leaks.
281-283: Ensure timeout handling correctly deregisters events.
The changes ensure that connections timing out have their events deregistered and resources cleaned up. Verify that this logic is correctly integrated with the rest of the timeout handling.
Verification successful
Timeout handling correctly deregisters events.
The `NetDelEvent` function is consistently used in the timeout handling logic to deregister events, confirming its correct integration with the rest of the timeout handling process.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that timeout handling correctly deregisters events and cleans up resources.
# Test: Search for timeout handling logic. Expect: Consistent use of `NetDelEvent`.
rg --type cpp $'keepalive_timeout_'
Length of output: 1702
Script:
#!/bin/bash
# Search for instances of NetDelEvent to verify its integration with timeout logic.
rg --type cpp 'NetDelEvent'
Length of output: 2160
Script:
#!/bin/bash
# Search for context around NetDelEvent calls in worker_thread.cc to verify association with timeout handling.
rg --type cpp --context 5 'NetDelEvent' src/net/src/worker_thread.cc
Length of output: 1597
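To illustrate the "deregister, then close" order that the comments above describe, here is a small Linux-only sketch written directly against the epoll system calls. It assumes the multiplexer is epoll-backed; `DelEventAndClose` and `AddReadEvent` are hypothetical helper names and are not Pika's `NetDelEvent`. Explicit removal matters because closing a descriptor only drops it from an epoll interest list once every duplicate of the underlying open file description has been closed.

```cpp
#include <sys/epoll.h>
#include <unistd.h>
#include <cassert>

// Hypothetical helper: register fd for read events on the epoll instance.
bool AddReadEvent(int epfd, int fd) {
  struct epoll_event ev {};
  ev.events = EPOLLIN;
  ev.data.fd = fd;
  return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0;
}

// Hypothetical helper: deregister fd from the epoll instance first,
// then close it. Returns true only if both steps succeeded.
bool DelEventAndClose(int epfd, int fd) {
  assert(epfd >= 0 && fd >= 0);
  // The event argument is ignored for EPOLL_CTL_DEL and may be null
  // on Linux 2.6.9 and later.
  bool removed = (epoll_ctl(epfd, EPOLL_CTL_DEL, fd, nullptr) == 0);
  bool closed = (close(fd) == 0);
  return removed && closed;
}
```

Doing the removal before `close` also avoids a window in which the fd number is reused by a new connection while a stale registration still refers to it.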
src/pika_repl_client.cc (1)
80-95: Ensure all asynchronous tasks are completed before stopping.
The loop in the `Stop` method ensures that all write tasks are completed before the method exits. This change is crucial for preventing data loss and ensuring data integrity.
include/pika_server.h (2)
226-226: Verify the implementation of `ClientKillAllNormal`.
Ensure that the `ClientKillAllNormal` method is correctly implemented and integrated within the codebase.
Verification successful
`ClientKillAllNormal` is implemented and used correctly.
The method `ClientKillAllNormal` is implemented in `src/pika_server.cc` and is used in `src/pika_admin.cc`. It calls `ClientKillAll` on `pika_dispatch_thread_`, which suggests it performs the intended function of terminating all normal client connections.
- Implementation: `src/pika_server.cc`
- Usage: `src/pika_admin.cc`
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implementation of `ClientKillAllNormal`.
# Test: Search for the method definition. Expect: Function implementation details.
ast-grep --lang cpp --pattern $'void PikaServer::ClientKillAllNormal() { $$$ }'
Length of output: 80
Script:
#!/bin/bash
# Description: Search for references to `ClientKillAllNormal` in the codebase.
# Test: Search for any references to `ClientKillAllNormal` to verify its usage or implementation.
rg 'ClientKillAllNormal' -A 5
Length of output: 888
225-225: Verify the implementation of `ClientKillPubSub`.
Ensure that the `ClientKillPubSub` method is correctly implemented and integrated within the codebase.
Verification successful
The `ClientKillPubSub` method is correctly implemented and integrated.
- The method is defined in `src/pika_server.cc` and calls `NotifyToCloseAllConns` on `pika_pubsub_thread_`.
- It is used in `src/pika_admin.cc`, indicating its role in handling administrative commands.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implementation of `ClientKillPubSub`.
# Test: Search for the method definition. Expect: Function implementation details.
ast-grep --lang cpp --pattern $'void PikaServer::ClientKillPubSub() { $$$ }'
Length of output: 77
Script:
#!/bin/bash
# Description: Search for any occurrences of `ClientKillPubSub` in the codebase.
# Search for the method declaration, definition, and usage.
rg 'ClientKillPubSub' -A 5
Length of output: 853
src/net/src/net_pubsub.cc (3)
162-174: Well-implemented `CloseAllConns` method.
The method effectively clears connection-related data and closes all connections, ensuring thread safety with appropriate locking mechanisms.
608-610: Efficient signaling in `NotifyToCloseAllConns`.
The method efficiently uses an atomic boolean to signal the closure of all connections.
431-435: Improved connection management in `ThreadMain`.
The addition of a signal check for `close_all_conn_sig_` enhances the method's responsiveness to shutdown requests, improving connection management.
include/pika_admin.h (1)
239-242: Verify usage of new kill type members in `ClientCmd`.
Ensure that `KILLTYPE_NORMAL`, `KILLTYPE_PUBSUB`, and `kill_type_` are correctly utilized in the implementation of the `ClientCmd` class.
Verification successful
New kill type members are correctly utilized in `ClientCmd`.
The `KILLTYPE_NORMAL`, `KILLTYPE_PUBSUB`, and `kill_type_` are properly integrated into the `ClientCmd` class. They are defined, assigned, and used in conditional logic as expected.
- Usage in `src/pika_admin.cc`: `strcasecmp` checks against `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB`. `kill_type_` is assigned from `argv_[3]` and compared with the constants.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the usage of new kill type members in `ClientCmd`.
# Test: Search for the usage of `kill_type_`, `KILLTYPE_NORMAL`, and `KILLTYPE_PUBSUB`. Expect: Proper integration in the class methods.
rg --type cpp 'kill_type_|KILLTYPE_NORMAL|KILLTYPE_PUBSUB'
Length of output: 733
src/pika_server.cc (4)
111-113: Orderly shutdown improvement.
The addition of `pika_dispatch_thread_->StopThread()` in the destructor ensures that the dispatch thread is stopped before the worker thread, promoting orderly shutdown.
858-861: Enhanced cleanup process in `ClientKillAll`.
The addition of `pika_pubsub_thread_->NotifyToCloseAllConns()` ensures that all publish/subscribe connections are closed when all clients are killed, enhancing the cleanup process.
863-865: New method `ClientKillPubSub` improves clarity.
The introduction of `ClientKillPubSub` provides a clear separation of concerns by focusing solely on closing publish/subscribe connections, enhancing code clarity and maintainability.
867-869: New method `ClientKillAllNormal` enhances clarity.
The introduction of `ClientKillAllNormal` retains the previous functionality of `ClientKillAll` but is explicitly named to indicate its focus on normal client termination, enhancing code clarity.
src/pika_admin.cc (4)
687-693: Enhancement in `ClientCmd::DoInitial` for type-based client kill.
The addition of handling for the `type` argument in the "kill" command enhances the flexibility of client management. Ensure that the `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB` constants are correctly defined and used throughout the codebase.
808-809: Addition of `ClientCmd` constants for client types.
The constants `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB` provide clear identifiers for client types in the "kill" command. Ensure these constants are consistently used throughout the codebase.
Verification successful
Verified usage of `ClientCmd` constants.
The constants `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB` are consistently used in the codebase, specifically in logical conditions to differentiate client types for the "kill" command.
- Found in `src/pika_admin.cc` and `include/pika_admin.h`.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the usage of `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB` constants.
# Test: Search for the usage of these constants in the codebase.
rg --type cpp 'KILLTYPE_NORMAL'
rg --type cpp 'KILLTYPE_PUBSUB'
Length of output: 782
743-752: Verify server methods for type-based client kill.
The logic for handling `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB` in the "kill" command is well-structured. Ensure that the `g_pika_server->ClientKillAllNormal()` and `g_pika_server->ClientKillPubSub()` methods are correctly implemented and tested.
Verification successful
Server methods for client kill types are implemented.
The methods `ClientKillAllNormal` and `ClientKillPubSub` are implemented in `src/pika_server.cc` and declared in `include/pika_server.h`. They are correctly invoked in `src/pika_admin.cc`, confirming the logic for handling different client kill types in the "kill" command. The review comment is verified.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implementation of client kill methods for different types.
# Test: Search for the implementation of `ClientKillAllNormal` and `ClientKillPubSub`.
ast-grep --lang cpp --pattern 'void PikaServer::ClientKillAllNormal() { $$$ }'
ast-grep --lang cpp --pattern 'void PikaServer::ClientKillPubSub() { $$$ }'
Length of output: 158
Script:
#!/bin/bash
# Description: Search for the method names `ClientKillAllNormal` and `ClientKillPubSub` in the codebase using ripgrep.
# Test: Search for `ClientKillAllNormal` method.
rg 'ClientKillAllNormal'
# Test: Search for `ClientKillPubSub` method.
rg 'ClientKillPubSub'
Length of output: 385
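The type-based dispatch reviewed above (case-insensitive comparison of the fourth argument against `KILLTYPE_NORMAL` and `KILLTYPE_PUBSUB`) might look roughly like the sketch below. It is an illustration under assumptions, not the PR's code: the function `ParseClientKill`, the enum `KillTarget`, and the string values given to the two constants are hypothetical, and the three-argument form is simply passed through to whatever handling already exists.

```cpp
#include <strings.h>  // strcasecmp (POSIX)
#include <cassert>
#include <string>
#include <vector>

// Assumed values; the review only names the constants, not their contents.
const std::string KILLTYPE_NORMAL = "normal";
const std::string KILLTYPE_PUBSUB = "pubsub";

// Hypothetical result type for this sketch.
enum class KillTarget { kNormal, kPubSub, kOther, kInvalid };

// Classifies a tokenized "client kill ..." command.
KillTarget ParseClientKill(const std::vector<std::string>& argv) {
  if (argv.size() < 3 || strcasecmp(argv[0].c_str(), "client") != 0 ||
      strcasecmp(argv[1].c_str(), "kill") != 0) {
    return KillTarget::kInvalid;
  }
  if (argv.size() == 3) {
    // Pre-existing forms of the command; left to the existing code path.
    return KillTarget::kOther;
  }
  if (argv.size() == 4 && strcasecmp(argv[2].c_str(), "type") == 0) {
    const std::string& kill_type = argv[3];
    if (strcasecmp(kill_type.c_str(), KILLTYPE_NORMAL.c_str()) == 0) {
      return KillTarget::kNormal;  // would map to ClientKillAllNormal()
    }
    if (strcasecmp(kill_type.c_str(), KILLTYPE_PUBSUB.c_str()) == 0) {
      return KillTarget::kPubSub;  // would map to ClientKillPubSub()
    }
  }
  // Unknown type or wrong arity: reject during parsing so that a queued
  // transaction fails for a clear reason instead of silently.
  return KillTarget::kInvalid;
}
```

Rejecting unknown types at parse time matters in this PR's context because a parse failure for a queued command is what made the sentinel transaction abort in the first place.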
Line range hint 1222-1247: Improved tracking of replication offsets in `InfoCmd::InfoReplication`.
The introduction of `total_file_num` and `total_offset` variables, along with the adjustments to `slave_repl_offset`, enhance the accuracy of replication tracking. Ensure that the bounds handling for `slave_repl_offset` is correctly implemented and tested.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- include/pika_server.h (1 hunks)
- src/pika_server.cc (2 hunks)
Files skipped from review as they are similar to previous changes (2)
- include/pika_server.h
- src/pika_server.cc
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- .github/workflows/pika.yml (4 hunks)
Files skipped from review due to trivial changes (1)
- .github/workflows/pika.yml
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- .github/workflows/pika.yml (5 hunks)
Files skipped from review due to trivial changes (1)
- .github/workflows/pika.yml
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- .github/workflows/pika.yml (6 hunks)
- tests/integration/rsync_dynamic_reconfig.go (2 hunks)
Files skipped from review as they are similar to previous changes (1)
- .github/workflows/pika.yml
Additional comments not posted (3)
tests/integration/rsync_dynamic_reconfig.go (3)
33-34: Verify the impact of reduced key and value sizes.
Reducing the key and value sizes from 1024 to 16 will speed up test execution but may not adequately simulate real-world scenarios with larger data. Ensure that the test objectives are still met with these smaller sizes.
Verification successful
Consider the impact of reduced key and value sizes on test objectives.
The reduction in key and value sizes from 1024 to 16 in the `RefillMaster` function is likely intended to improve test execution speed. However, it is important to verify that the test objectives remain aligned with these changes. Consider whether the smaller sizes still adequately simulate the intended scenarios and whether additional tests might be needed to cover cases with larger data sizes.
- Ensure that the test objectives are still met with reduced sizes.
- Consider adding tests for scenarios with larger data sizes if needed.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the impact of reduced key and value sizes in RefillMaster.
# Test: Check for any test cases or documentation that mention expected key/value sizes.
rg --type go -A 5 'RefillMaster'
Length of output: 1687
38-39: Consider the implications of reduced data volume.
The reduction in `dataVolumeMB` from 64 to 1 significantly decreases the volume of data processed during tests. This change might affect the ability to assess performance under load conditions. Ensure that the reduced data volume aligns with the test's goals.
Verification successful
Consider the test's objectives with reduced data volume.
The reduction in `dataVolumeMB` from 64 to 1 affects only the specific test in `tests/integration/rsync_dynamic_reconfig.go`. If the test's goal is to evaluate performance under load, this change might not align with that objective. Ensure the test's purpose is clear and adjust `dataVolumeMB` accordingly if load testing is desired.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the implications of reduced data volume in RefillMaster.
# Test: Search for references to dataVolumeMB to understand its role in tests.
rg --type go -A 5 'dataVolumeMB'
Length of output: 1168
Script:
#!/bin/bash
# Search for calls to the RefillMaster function to understand its usage.
rg --type go 'RefillMaster('
Length of output: 107
Script:
#!/bin/bash
# Correct the regex pattern and search for calls to the RefillMaster function.
rg --type go 'RefillMaster\('
Length of output: 234
139-139: Verify the test case effectiveness with reduced data volume.
The test case now uses a significantly smaller data volume. Ensure that the test still effectively evaluates the system's behavior under the intended conditions.
Verification successful
Test case with reduced data volume is appropriate.
The use of a smaller data volume in the `RefillMaster` function is intentional to prevent long refill times, as indicated by the comment in the code. This ensures the test remains efficient without compromising its effectiveness. No further test cases or documentation were found specifying different data volumes.
- `tests/integration/rsync_dynamic_reconfig.go`: Comment suggests avoiding large data volumes to keep refill times short.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify the effectiveness of the test case with reduced data volume.
# Test: Check if there are other test cases or documentation that specify the intended behavior for different data volumes.
rg --type go -A 5 'RefillMaster(MASTERADDR, 1, ctx)'
Length of output: 54
Script:
#!/bin/bash
# Broaden the search to find all instances of `RefillMaster` usage in the codebase.
rg --type go 'RefillMaster(' -A 5
Length of output: 112
Script:
#!/bin/bash
# Correct the regex pattern to search for all instances of `RefillMaster` usage in the codebase.
rg --type go 'RefillMaster\(' -A 5
Length of output: 1177
2 revised network kill logic
2089ccc to 3a62d50
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- .github/workflows/pika.yml (3 hunks)
Files skipped from review as they are similar to previous changes (1)
- .github/workflows/pika.yml
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/pika_admin.cc (5 hunks)
Additional comments not posted (3)
src/pika_admin.cc (3)
687-693: Ensure correct handling of client kill types.
The `DoInitial` method now supports a new argument for killing clients by type. Ensure that the input validation and error messages are clear and consistent with other command patterns.
Line range hint 1222-1240: Review `slave_repl_offset` calculation logic.
The calculation of `slave_repl_offset` now includes additional logic. Ensure that this calculation is accurate and does not exceed 32-bit integer limits if that is a requirement.
743-752: Verify client kill logic for different types.
The `Do` method now handles killing clients by type using `ClientKillAllNormal` and `ClientKillPubSub`. Ensure these methods are correctly implemented and do not inadvertently affect the current connection.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- src/pika_server.cc (9 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/pika_server.cc
2 rename some functions and variables
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- src/net/include/net_pubsub.h (1 hunks)
- src/net/src/net_pubsub.cc (3 hunks)
- src/pika_server.cc (9 hunks)
Files skipped from review as they are similar to previous changes (2)
- src/net/include/net_pubsub.h
- src/net/src/net_pubsub.cc
Additional comments not posted (3)
src/pika_server.cc (3)
859-862: LGTM: Comprehensive client termination.
The addition of `NotifyCloseAllConns()` ensures that all pub/sub connections are closed, aligning with the goal of comprehensive client termination.
864-865: LGTM: Focused pub/sub connection termination.
The `ClientKillPubSub` function provides a targeted approach for closing pub/sub connections, improving code clarity.
867-869: LGTM: Explicit naming for normal client termination.
The `ClientKillAllNormal` function is clearly named to indicate its purpose, enhancing code readability.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- src/pika_admin.cc (5 hunks)
- src/pika_repl_client.cc (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- src/pika_repl_client.cc
Additional comments not posted (3)
src/pika_admin.cc (3)
685-693: Correct handling of `kill` command with type argument.
The changes correctly handle the additional "type" argument for the "kill" command, ensuring that only recognized types are processed.
743-752: Correct execution of client kill operations based on type.
The changes correctly implement the logic to execute different client kill operations based on the `kill_type_` variable, with appropriate error handling for unknown types.
Line range hint 1222-1240: Accurate calculation of `slave_repl_offset`.
The changes correctly accumulate binlog offsets to compute the `slave_repl_offset`, ensuring accurate tracking of replication progress.
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- include/pika_admin.h (2 hunks)
- src/pika_admin.cc (5 hunks)
- src/pika_server.cc (9 hunks)
Files skipped from review as they are similar to previous changes (3)
- include/pika_admin.h
- src/pika_admin.cc
- src/pika_server.cc
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- include/pika_repl_bgworker.h (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- include/pika_repl_bgworker.h
* 1. make pika support redis sentinel 2. support client kill type pubsub/normal 3. ensure the fd is removed from epoll if the server wants to close the fd
* fix exit process: 1. ensure the network thread (Dispatcher) can be stopped in time 2. ensure all queued async WriteDB tasks are done before exit
---------
Co-authored-by: chejinge <945997690@qq.com>
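fix issue #2695 (Pika 3.5.x does not support redis-sentinel)
fix issue #2286 (Pika's resource release procedure is incorrect)
This PR lets Pika use redis-sentinel for failover.
(Tested with a redis-sentinel built from the Redis unstable source as of 2024-08-07. The closest release to that build is 7.4.0, but in practice some older redis-sentinel versions were also found to be compatible.)
Since version 3.5.0 last year, Pika has been unable to work with Redis-Sentinel mainly for the following reasons:
Transaction conflict: after the redis-sentinel election, the failover sends the `slaveof no one` command to the slave, but it is actually sent as a transaction containing a batch of commands:
Pika does not actually support the `client kill type pubsub` and `client kill type normal` commands (DoInitial fails for both), so the whole transaction aborts, the slaveof command itself is never executed, and the failover cannot complete.
Metric incompatibility: redis-sentinel uses the `slave_repl_offset` metric when choosing a new master. The metric reflects how many bytes of replication data the slave instance has received in total (with multiple DBs, it is the sum over all DBs). Pika's replication is independent per DB: each DB has its own binlog position made up of two numbers, (filenum, offset), so Pika had no `slave_repl_offset` metric for redis-sentinel to use (the earlier 3.3.6 version was also compatible with redis-sentinel but likewise did not provide this metric, which raises some doubt about whether earlier elections were reliable).
Specifically, take a 64-bit number, place the binlog filenum in its high 32 bits and the offset in its low 32 bits (a single binlog file is at most 2GB and 32 bits can express values close to 4GB, so it will not overflow), and output this combined 64-bit number as `slave_repl_offset` for redis-sentinel to consult during the election. How the multi-DB case is handled: first sum the filenum of every DB and also sum the offsets, giving a total filenum and a total offset. The total offset may now exceed the size of a single binlog file, so two further steps in the code move the part of the offset that exceeds the binlog file size into the filenum, giving a combined (filenum, offset) that reflects the replication progress of all DBs. Finally, the bit operation described above merges them into a single `slave_repl_offset`.
The final calculation of `slave_repl_offset` is: iterate over every DB, add binlog_filenum * binlog_file_size to `slave_repl_offset`, and also add the binlog offset (the internal offset within the newest binlog file) to `slave_repl_offset`. The result is the replication progress of the whole instance, in bytes. Note that binlog filenum starts at 0 and increases, so using binlog_filenum directly is equivalent to the true historical count of binlog files minus 1, which is exactly what is needed, because the binlog offset already reflects the size of the last binlog file.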
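The two calculations described in this PR description can be sketched as below. This is a hedged illustration, not the code from `InfoCmd::InfoReplication`: the struct `BinlogPos` and the function names `SlaveReplOffsetBytes` and `SlaveReplOffsetPacked` are hypothetical, and the carry step in the packed variant (integer division and remainder by the binlog file size) is an assumption about what "moving the excess offset into filenum" means.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-DB binlog position: (filenum, offset).
struct BinlogPos {
  uint32_t filenum;  // starts at 0 for the first binlog file
  uint64_t offset;   // byte offset inside the newest binlog file
};

// Final method from the description: total replicated bytes, summed over
// all DBs as filenum * binlog_file_size + offset.
uint64_t SlaveReplOffsetBytes(const std::vector<BinlogPos>& dbs,
                              uint64_t binlog_file_size) {
  uint64_t total = 0;
  for (const BinlogPos& p : dbs) {
    total += static_cast<uint64_t>(p.filenum) * binlog_file_size + p.offset;
  }
  return total;
}

// Earlier packed variant from the description: sum filenums and offsets,
// carry whole files out of the offset, then pack filenum into the high
// 32 bits and offset into the low 32 bits.
uint64_t SlaveReplOffsetPacked(const std::vector<BinlogPos>& dbs,
                               uint64_t binlog_file_size) {
  // The remainder must fit in 32 bits for the packing to be lossless.
  assert(binlog_file_size > 0 && binlog_file_size <= UINT32_MAX);
  uint64_t total_filenum = 0;
  uint64_t total_offset = 0;
  for (const BinlogPos& p : dbs) {
    total_filenum += p.filenum;
    total_offset += p.offset;
  }
  total_filenum += total_offset / binlog_file_size;  // assumed carry step
  total_offset %= binlog_file_size;
  return (total_filenum << 32) | total_offset;
}
```

With a file size of 100 and positions (1, 30) and (2, 90), the byte-count method gives 130 + 290 = 420, while the packed method carries one file out of the summed offset 120 and yields filenum 4 with offset 20. The byte-count form is the one that matches what redis-sentinel expects from `slave_repl_offset`, namely a number of bytes.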
This PR enables Pika to support using redis-sentinel for failover.
Since version 3.5.0 last year, Pika has been incompatible with Redis-Sentinel mainly due to the following reasons:
Transaction conflicts: After the redis-sentinel election, the failover operation sends the `slaveof no one` command to the slave. However, it actually sends a batch of commands in the form of a transaction:
Pika does not support the `client kill type pubsub` and `client kill type normal` commands (DoInitial will fail), causing the entire transaction to abort. As a result, the slaveof command is not executed, and the failover cannot be completed.
Incompatible metrics: redis-sentinel uses the `slave_repl_offset` metric to determine the master during an election. This metric reflects the total number of bytes received by the slave instance through replication (if multiple databases are involved, this value is the sum for all databases). Pika's replication system is independent for each database, with each DB having its own binlog offset, consisting of two parts: (filenum, offset). Therefore, Pika does not provide a `slave_repl_offset` metric for redis-sentinel to use (earlier versions like 3.3.6 also supported redis-sentinel, but did not provide this metric, raising doubts about the reliability of previous elections).
Specifically, a 64-bit number is formed with the binlog filenum in the high 32 bits and the offset in the low 32 bits (a single binlog file is at most 2GB and 32 bits can represent nearly 4GB, so it cannot overflow), and this combined value is output as `slave_repl_offset` for redis-sentinel to use during elections.
Handling multiple DB scenarios: First, the filenum and offset of each DB are summed to get a total filenum and total offset. The offset may exceed the size of a single binlog file, so the following code transfers the excess part of the offset to the filenum, resulting in a combined filenum and offset that reflects the replication progress of multiple DBs. Finally, the bitwise operation mentioned above is used to merge them into a single `slave_repl_offset`.
`slave-priority` provided by Pika Info was changed to 0. The default should be 100, so redis-sentinel can consider the instance as a candidate.
`dispatcher_thread_` (including dispatcher and worker threads) is now stopped earlier.
Summary by CodeRabbit
New Features
- [...] `ClientKillPubSub`, `ClientKillAllNormal`).
- [...] `ClientCmd` class to support different kill types.
- [...] `PikaReplBgWorker` class to retrieve the size of the task queue, aiding performance monitoring.
Bug Fixes