-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LeafNode: (no)interest propagation issues #6148
Labels
defect
Suspected defect such as a bug or regression
Comments
kozlovic
added a commit
that referenced
this issue
Nov 19, 2024
There were multiple issues, but basically the fact that we would not store the routed subscriptions with the origin of the LEAF they came from made the server unable to differentiate those compare to "local" routed subscriptions, which in some cases (like a server restart and the resend of subscriptions) could lead to servers sending incorrectly subscription interest to leaf connections. We are now storing the subscriptions with an origin with that origin as part of the key. This allows to differentiate "regular" routed subs versus the ones on behalf of a leafnode. An INFO boolean is added `LNOCU` to indicate support for origin in the `LS-` protocol, which is required to properly handle the removal. Therefore, if a route does not have `LNOCU`, the server will behave like an old server, and store with the key that does not contain the origin, so that it can be removed when getting an LS- without the origin. Note that in the case of a mix of servers in the same cluster, some of the issues this PR is trying to fix will be present (since the server will basically behave like a server without the fix). Having a different routed subs for leaf connections allow to revisit the fix #5982 that was done for issue #5972, which was about a more fair queue distribution to a cluster of leaf connections. That fix actually introduced a change in that we always wanted to favor queue subscriptions of the cluster where the message is produced, which that fix possibly changed. With this current PR, the server can now know if a remote queue sub is for a "local" queue sub there or on behalf of a leaf and therefore will not favor that route compared to a leaf subscription that it may have directly attached. Resolves #5972 Resolves #6148 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic
added a commit
that referenced
this issue
Nov 20, 2024
There were multiple issues, but basically the fact that we would not store the routed subscriptions with the origin of the LEAF they came from made the server unable to differentiate those compare to "local" routed subscriptions, which in some cases (like a server restart and the resend of subscriptions) could lead to servers sending incorrectly subscription interest to leaf connections. We are now storing the subscriptions with an origin with that origin as part of the key. This allows to differentiate "regular" routed subs versus the ones on behalf of a leafnode. An INFO boolean is added `LNOCU` to indicate support for origin in the `LS-` protocol, which is required to properly handle the removal. Therefore, if a route does not have `LNOCU`, the server will behave like an old server, and store with the key that does not contain the origin, so that it can be removed when getting an LS- without the origin. Note that in the case of a mix of servers in the same cluster, some of the issues this PR is trying to fix will be present (since the server will basically behave like a server without the fix). Having a different routed subs for leaf connections allow to revisit the fix #5982 that was done for issue #5972, which was about a more fair queue distribution to a cluster of leaf connections. That fix actually introduced a change in that we always wanted to favor queue subscriptions of the cluster where the message is produced, which that fix possibly changed. With this current PR, the server can now know if a remote queue sub is for a "local" queue sub there or on behalf of a leaf and therefore will not favor that route compared to a leaf subscription that it may have directly attached. Resolves #5972 Resolves #6148 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
kozlovic
added a commit
that referenced
this issue
Nov 22, 2024
There were multiple issues, but basically the fact that we would not store the routed subscriptions with the origin of the LEAF they came from made the server unable to differentiate those compared to "local" routed subscriptions, which in some cases (like a server restart and the resend of subscriptions) could lead to servers sending incorrectly subscription interest to leaf connections. We are now storing the subscriptions with a sub type indicator and the origin (for leaf subscriptions) as part of the key. This allows to differentiate "regular" routed subs versus the ones on behalf of a leafnode. An INFO boolean is added `LNOCU` to indicate support for origin in the `LS-` protocol, which is required to properly handle the removal. Therefore, if a route does not have `LNOCU`, the server will behave like an old server, and store with the key that does not contain the origin, so that it can be removed when getting an LS- without the origin. Note that in the case of a mix of servers in the same cluster, some of the issues this PR is trying to fix will be present (since the server will basically behave like a server without the fix). Having a different routed subs for leaf connections allow to revisit the fix #5982 that was done for issue #5972, which was about a more fair queue distribution to a cluster of leaf connections. That fix actually introduced a change in that we always wanted to favor queue subscriptions of the cluster where the message is produced, which that fix possibly changed. With this current PR, the server can now know if a remote queue sub is for a "local" queue sub there or on behalf of a leaf and therefore will not favor that route compared to a leaf subscription that it may have directly attached. Resolves #5972 Resolves #6148 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
derekcollison
added a commit
that referenced
this issue
Nov 22, 2024
There were multiple issues, but basically the fact that we would not store the routed subscriptions with the origin of the LEAF they came from made the server unable to differentiate those compared to "local" routed subscriptions, which in some cases (like a server restart and the resend of subscriptions) could lead to servers sending incorrectly subscription interest to leaf connections. We are now storing the subscriptions with a sub type indicator and the origin (for leaf subscriptions) as part of the key. This allows to differentiate "regular" routed subs versus the ones on behalf of a leafnode. An INFO boolean is added `LNOCU` to indicate support for origin in the `LS-` protocol, which is required to properly handle the removal. Therefore, if a route does not have `LNOCU`, the server will behave like an old server, and store with the key that does not contain the origin, so that it can be removed when getting an LS- without the origin. Note that in the case of a mix of servers in the same cluster, some of the issues this PR is trying to fix will be present (since the server will basically behave like a server without the fix). Having a different routed subs for leaf connections allow to revisit the fix #5982 that was done for issue #5972, which was about a more fair queue distribution to a cluster of leaf connections. That fix actually introduced a change in that we always wanted to favor queue subscriptions of the cluster where the message is produced, which that fix possibly changed. With this current PR, the server can now know if a remote queue sub is for a "local" queue sub there or on behalf of a leaf and therefore will not favor that route compared to a leaf subscription that it may have directly attached. Resolves #5972 Resolves #6148 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
neilalexander
pushed a commit
that referenced
this issue
Nov 22, 2024
There were multiple issues, but basically the fact that we would not store the routed subscriptions with the origin of the LEAF they came from made the server unable to differentiate those compared to "local" routed subscriptions, which in some cases (like a server restart and the resend of subscriptions) could lead to servers sending incorrectly subscription interest to leaf connections. We are now storing the subscriptions with a sub type indicator and the origin (for leaf subscriptions) as part of the key. This allows to differentiate "regular" routed subs versus the ones on behalf of a leafnode. An INFO boolean is added `LNOCU` to indicate support for origin in the `LS-` protocol, which is required to properly handle the removal. Therefore, if a route does not have `LNOCU`, the server will behave like an old server, and store with the key that does not contain the origin, so that it can be removed when getting an LS- without the origin. Note that in the case of a mix of servers in the same cluster, some of the issues this PR is trying to fix will be present (since the server will basically behave like a server without the fix). Having a different routed subs for leaf connections allow to revisit the fix #5982 that was done for issue #5972, which was about a more fair queue distribution to a cluster of leaf connections. That fix actually introduced a change in that we always wanted to favor queue subscriptions of the cluster where the message is produced, which that fix possibly changed. With this current PR, the server can now know if a remote queue sub is for a "local" queue sub there or on behalf of a leaf and therefore will not favor that route compared to a leaf subscription that it may have directly attached. Resolves #5972 Resolves #6148 Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Observed behavior
Probably more cases than described here, but one would be that in a cluster hub/spoke setup, it was possible for interest registered in the leaf cluster to not be fully unregistered when the subscriptions were going away (depending if subscription was removed one by one or if a connection having multiple was closed), and more over, the hub may have incorrectly sent to the leaf cluster an interest on the subject, making publish from the leaf go to the hub where no interest was really present. This would happen if one of the hub server was restarted while subscription was still active, but then the subscriptions were removed.
Expected behavior
Subscription interest properly handled when subscriptions go away, either individually or as a whole when their connection is closed, or when servers are restarted.
Server and client version
Latest
Host environment
N/A
Steps to reproduce
Have queue subs connecting to A2, restart B1, unsubscribe one of the qsub from A2. When that happens, B1 may have sent a subscription interest to A1 (A1's Leafz would show a subscription interest). Removing the remaining subscriptions would not remove the subscription interest from A1->B1.
Test can be repeated where a single connection has all 3 subs and connection is closed.
The text was updated successfully, but these errors were encountered: