When activate() is called in {A,B,C}, then all members connect to the UpgradeServer, register their view and set active=true. This may cause the following issue:
A and B are done, active is true
C is delayed, active is still false and registerView() has not yet been called
C sends a message to B. This succeeds because the message is sent via the JGroups stack (not via UPGRADE, as active==false), and B receives it via the JGroups stack. However, B cannot get a response back to C: B is already active and therefore sends the response via UPGRADE, but C has not yet called registerView(), so the UpgradeServer does not yet know about C and cannot forward B's response to it.
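The scenario above can be reproduced with a tiny model. This is only a sketch under assumptions: `RoutingModel`, `delivered()` and `roundTrip()` are hypothetical names for illustration and are not part of the JGroups or UPGRADE API. It assumes a sender uses UPGRADE exactly when it is active, and that the UpgradeServer can only reach registered members.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the routing decision; not the real UPGRADE protocol code.
class RoutingModel {
    // Members that have called registerView() and are reachable via the UpgradeServer.
    final Set<String> registered = new HashSet<>();
    // Members whose active flag is true, i.e. that send via UPGRADE.
    final Set<String> active = new HashSet<>();

    /** Returns true if a message from sender reaches receiver. */
    boolean delivered(String sender, String receiver) {
        if (active.contains(sender)) {
            // Sent via the UpgradeServer: only registered members can be reached.
            return registered.contains(receiver);
        }
        // Sent via the regular JGroups stack, to which every member is still connected.
        return true;
    }

    /** A request/response round trip succeeds only if both legs are delivered. */
    boolean roundTrip(String requester, String responder) {
        return delivered(requester, responder) && delivered(responder, requester);
    }
}
```

With A and B both registered and active, and C neither, `delivered("C", "B")` is true but `roundTrip("C", "B")` is false: the request arrives, the response is lost. Once C is registered (even if not yet active), the round trip succeeds.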
We therefore need to ensure that everyone is registered with the UpgradeServer, before switching to use of UPGRADE:
In a first phase, registerView() in all members makes sure that everyone can send/receive messages to/from the UpgradeServer.
Only when everyone has successfully registered, we can switch to using UPGRADE by setting active=true. If the first phase doesn't complete successfully, an exception will be thrown and the second phase will not be started, which means that the switch to UPGRADE will not be made.
The second phase does not need to be synchronous: since everyone is connected to both the UpgradeServer and JGroups, messages can be sent via JGroups or UPGRADE and will be received all the same. For example, a member might not yet be active, so it sends a message via JGroups. The recipient receives it via JGroups, but might send the response via UPGRADE, as it is already active. The original sender will then receive the response via UPGRADE, as it registered with the UpgradeServer in the first phase.
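The two phases might be sketched as follows. This is a minimal, single-process illustration, not the actual implementation: `Member`, `SimpleMember` and `TwoPhaseActivation` are hypothetical names, and in a real cluster phase 1 would be a remote call with confirmations rather than a local loop.

```java
import java.util.List;

// Hypothetical sketch of the two-phase activation; names are illustrative only.
class TwoPhaseActivation {
    interface Member {
        void registerView() throws Exception; // phase 1: register with the UpgradeServer
        void setActive(boolean active);       // phase 2: switch sending to UPGRADE
    }

    /** Trivial in-memory member used to exercise the sketch. */
    static class SimpleMember implements Member {
        final boolean failOnRegister;
        boolean registered;
        boolean active;

        SimpleMember(boolean failOnRegister) {
            this.failOnRegister = failOnRegister;
        }

        @Override
        public void registerView() throws Exception {
            if (failOnRegister)
                throw new Exception("registration with UpgradeServer failed");
            registered = true;
        }

        @Override
        public void setActive(boolean active) {
            this.active = active;
        }
    }

    static void activate(List<? extends Member> members) throws Exception {
        // Phase 1: everyone must be registered before anyone is activated.
        // An exception here propagates, so phase 2 is never started.
        for (Member m : members)
            m.registerView();
        // Phase 2: order and timing do not matter, since every member can
        // already receive via both JGroups and the UpgradeServer.
        for (Member m : members)
            m.setActive(true);
    }
}
```

If any `registerView()` call throws, `activate()` aborts and no member has `active` set to true, which matches the requirement that the switch to UPGRADE is not made unless the first phase completes.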
This issue would not cause incorrect behavior in Infinispan, as an RPC would simply time out (e.g. in the above example). However, the two-phase approach reduces the number of such failures, which is important when we do a rolling upgrade during heavy traffic.
The same is true for deactivate(): because it is not synchronous (i.e., not received by all members at the same time), the following can happen:
All members (A,B,C) are active
deactivate() is called
A is the coordinator of the global view and would install the new MergeView locally
However, C receives deactivate() first and disconnects
Because A is still active, it gets a new view {A,B} from the UpgradeServer!
-> We therefore have to activate/deactivate in 2 phases:
Solution for activate():
All members register with UpgradeServer. Now, they can receive messages either via UpgradeServer or still locally
When this is done (and confirmed): all members switch active to true
Because this phase is not synchronous, some members might activate before others. However, this is not an issue, as members can temporarily receive messages through the local channel before switching to UpgradeServer
Solution for deactivate():
All members set active to false. This means that members send messages via the local channel, but are still able to receive messages via UpgradeServer. However, view changes from UpgradeServer are ignored.
When this is done, members disconnect from UpgradeServer. TBD: we need to make sure that a member has no pending messages sent via UpgradeServer. TBD: perhaps don't disconnect a member at all; when restarted without UPGRADE, the connection to UpgradeServer will be torn down anyway
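The deactivation order could look roughly like this. Again a sketch under assumptions: `TwoPhaseDeactivation`, `Member`, `phaseOne()`, `phaseTwo()` and `onUpgradeServerView()` are hypothetical names, views are modeled as plain strings, and the open TBD items (pending messages, whether to disconnect at all) are not addressed.

```java
import java.util.List;

// Hypothetical sketch of the two-phase deactivation; not the real UPGRADE code.
class TwoPhaseDeactivation {
    static class Member {
        boolean active = true;     // true: send via UpgradeServer; false: send via local channel
        boolean connected = true;  // still connected to (and able to receive from) the UpgradeServer
        String view = "{A,B,C}";   // currently installed view, modeled as a string

        /** View changes pushed by the UpgradeServer are ignored once the member is inactive. */
        void onUpgradeServerView(String newView) {
            if (active)
                view = newView;
        }
    }

    /** Phase 1: everyone stops sending via UpgradeServer but stays connected. */
    static void phaseOne(List<Member> members) {
        for (Member m : members)
            m.active = false;
    }

    /** Phase 2: only after phase 1 is done everywhere do members disconnect. */
    static void phaseTwo(List<Member> members) {
        for (Member m : members)
            m.connected = false;
    }
}
```

Because every member is already inactive when the first disconnect happens, a view such as {A,B} sent by the UpgradeServer after C leaves is dropped by A instead of being installed.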