Spring Websockets Broker relay supporting active/standby broker #26169


Closed
kmandalas opened this issue Nov 27, 2020 · 13 comments
Labels
in: messaging Issues in messaging modules (jms, messaging)


kmandalas commented Nov 27, 2020

To support broker cluster setups, and according to the Connecting to a Broker section of the reference docs, we need to configure a supplier of addresses instead of a fixed host and port.

However, is this approach still applicable for an active/standby setup, in other words a High Availability (HA) and failover cluster (example: https://activemq.apache.org/components/artemis/documentation/1.0.0/ha.html) where only one broker instance is active at any time? How will the ReactorNettyTcpClient behave in the event of a broker cluster node failure in general? The docs mention that the STOMP broker relay always connects, and reconnects as needed if connectivity is lost, to the same host and port. With multiple addresses or an active/passive setup, if the TCP connection fails, is an attempt made against the next available node in the list?

@rstoyanchev Adding to the above questions, note that AbstractWebSocketMessageBrokerConfigurer has been deprecated since 5.0, so perhaps the reference docs need an update, along with a more thorough explanation of what is supported out of the box vs. the custom implementation needed in production environments, where connecting to a cluster is usually expected.


rstoyanchev commented Nov 30, 2020

@kmandalas the sample in the docs is accurate but could use an update to take into account deprecations:

@Configuration
@EnableWebSocketMessageBroker
public class WebSocketConfig implements WebSocketMessageBrokerConfigurer {

	// ...

	@Override
	public void configureMessageBroker(MessageBrokerRegistry registry) {
		registry.enableStompBrokerRelay("/queue/", "/topic/").setTcpClient(createTcpClient());
		registry.setApplicationDestinationPrefixes("/app");
	}

	private ReactorNettyTcpClient<byte[]> createTcpClient() {
		return new ReactorNettyTcpClient<>(
				client -> client.remoteAddress(() -> {
					// Return the SocketAddress to connect to; this Supplier
					// is invoked on every connect and reconnect. The address
					// below is just a placeholder example.
					return new InetSocketAddress("127.0.0.1", 61613);
				}),
				new StompReactorNettyCodec());
	}
}

I'm not sure how this relates to an ActiveMQ HA and failover cluster, but in the above example you configure the client with a Supplier<SocketAddress> that is used on every connect. You have to decide somehow which address to provide on each call to Supplier#get().

@rstoyanchev rstoyanchev added the status: waiting-for-feedback We need additional information before we can continue label Nov 30, 2020
@kmandalas

@rstoyanchev thanks for the feedback. Regarding ArtemisMQ HA, there is a "master" node and there are "backup" nodes. Let's assume one backup node for simplicity. Only if the master node becomes unavailable should a connection to the backup node be made. If the master becomes available again, it takes over. I think it's clear that a custom implementation is needed, so this answers my question.

@rstoyanchev rstoyanchev added in: messaging Issues in messaging modules (jms, messaging) and removed status: waiting-for-feedback We need additional information before we can continue status: waiting-for-triage An issue we've not yet triaged or decided on labels Nov 30, 2020

kmandalas commented Dec 1, 2020

@jbertram would you mind advising on what a proper approach would be, based on your experience with ActiveMQ Artemis? I need to implement this for production use in my current project, and I am checking whether it would later be meaningful to contribute something as a more generic feature (although that is difficult, since in this context we need to be broker-independent). However, maybe some generic policies would be meaningful if they could be provided, for example a RoundRobinPolicy or an ActivePassivePolicy?


jbertram commented Dec 1, 2020

I'm not sure you'd need something specific to ActiveMQ Artemis. Ideally you'd use a "sticky" first-available implementation where you supply a list of potential addresses and then you "stick" to the first one that works and keep using that until it no longer works. Once you get a failure then you go back and start from scratch trying each address in the list until you find one that works. This kind of implementation should work for any active/passive broker setup. I think implementing a handful of generic policies (this being one of them) is a good idea.
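The "sticky" first-available idea described above can be sketched with plain JDK types (the class and method names here are made up for illustration; in the Spring setup such a Supplier would be passed to TcpClient#remoteAddress, and markFailed() would be called from connection-failure handling):

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Sticks to the current address as long as it works; after a
// failure is reported, the next get() advances to the next
// candidate. Since every failed connect attempt should report
// a failure, repeated failures walk through the whole list.
class StickyAddressSupplier implements Supplier<SocketAddress> {

	private final List<InetSocketAddress> addresses;
	private final AtomicInteger current = new AtomicInteger(0);
	private final AtomicBoolean failed = new AtomicBoolean(false);

	StickyAddressSupplier(List<InetSocketAddress> addresses) {
		this.addresses = List.copyOf(addresses);
	}

	@Override
	public SocketAddress get() {
		if (failed.compareAndSet(true, false)) {
			// The last address failed: advance to the next candidate.
			current.updateAndGet(i -> (i + 1) % addresses.size());
		}
		return addresses.get(current.get());
	}

	// To be called from connection-failure handling, e.g. a
	// reconnect callback or an availability-event listener.
	void markFailed() {
		failed.set(true);
	}
}
```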

In any event, the STOMP specification doesn't define semantics for any kind of failover or clustering, so it's really up to the client and broker implementations to sort this stuff out.


kmandalas commented Dec 1, 2020

Thanks for your input @jbertram

@rstoyanchev do you think there is room to open a discussion for such an enhancement?

@rstoyanchev

We can continue the discussion here. What enhancement do you have in mind?

@kmandalas

@rstoyanchev to provide default implementations for a set of "policies" such as RoundRobin and ActivePassive (and perhaps a couple more generic ones), where a list of broker URLs/ports is provided in properties, along with the selected policy. Configuring a client with a Supplier would remain as a fallback if none of the out-of-the-box policies fit.
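As an illustration of what such a RoundRobin policy could look like (the class name is hypothetical; nothing like this exists in the framework), a Supplier<SocketAddress> that cycles through a fixed list could be as simple as:

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Cycles through the configured addresses: each (re)connect
// attempt goes to the next broker in the list.
class RoundRobinAddressSupplier implements Supplier<SocketAddress> {

	private final List<InetSocketAddress> addresses;
	private final AtomicInteger next = new AtomicInteger(0);

	RoundRobinAddressSupplier(List<InetSocketAddress> addresses) {
		this.addresses = List.copyOf(addresses);
	}

	@Override
	public SocketAddress get() {
		return addresses.get(next.getAndUpdate(i -> (i + 1) % addresses.size()));
	}
}
```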


rstoyanchev commented Dec 3, 2020

@kmandalas I think this could be helpful. If you come up with some useful hierarchy like that, feel free to suggest it with a PR. I would add that this will all be at the level of Reactor Netty APIs, and as such might belong better in Reactor Netty.


kmandalas commented Dec 21, 2020

@rstoyanchev From my point of view, it would be good to provide a couple of production-ready options when using WebSockets with an external message broker. Production-ready in the sense that in production there is usually no standalone broker but a cluster. This is why I did not open the issue under Reactor Netty. It's in my plans to proceed with a custom approach for the current project I am working on, based on @jbertram's suggestion above. As soon as we are ready, I will try to resume the discussion here, or ask for some guidance if there is a possibility for a PR.

@kmandalas

@rstoyanchev @jbertram @garyrussell As I am trying to explore Spring Boot support for message broker clusters in general (at least for RabbitMQ, ActiveMQ, and ArtemisMQ), I ran into this: spring-projects/spring-amqp#1303

Additionally, for ArtemisMQ, an enhancement was recently introduced (see spring-projects/spring-boot#10739) that in practice allows connecting by providing a connection string like:

spring.artemis.url=(tcp://localhost:61616,tcp://localhost:61716)?ha=true&retryInterval=1000&retryIntervalMultiplier=1.0&reconnectAttempts=-1

So my thinking is, could it be possible to connect in the same way to a broker when configuring it for WebSockets usage?


jbertram commented Feb 3, 2021

...could it be possible to connect in the same way to a broker when configuring it for WebSockets usage?

Are you asking in general or about the specific broker integrations?

WebSockets are a general purpose mechanism and as far as I'm aware there is no specification which defines a multi-endpoint URL like what you're after. I think you're just going to have to invent a new format or adapt an existing one and then your WebSocket client implementation will have to parse that format and deal with failures, reconnecting, etc. manually.
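For instance, if you adapted the Artemis-style multi-endpoint format shown earlier, the parsing side might look like this (a minimal sketch; the class name is made up, and a real client would also have to interpret the query parameters and handle failover itself):

```java
import java.net.InetSocketAddress;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Parses a multi-endpoint string such as
// "(tcp://localhost:61616,tcp://localhost:61716)?ha=true"
// into the list of socket addresses it names, ignoring the
// query parameters.
final class MultiEndpointUrl {

	static List<InetSocketAddress> parse(String url) {
		int open = url.indexOf('(');
		int close = url.indexOf(')');
		String endpoints = (open >= 0 && close > open)
				? url.substring(open + 1, close)
				: url;  // also accept a single bare URL
		List<InetSocketAddress> result = new ArrayList<>();
		for (String endpoint : endpoints.split(",")) {
			URI uri = URI.create(endpoint.trim());
			result.add(InetSocketAddress.createUnresolved(uri.getHost(), uri.getPort()));
		}
		return result;
	}
}
```

The resulting list could then feed one of the address suppliers discussed in this thread.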

Also, FWIW, the designation "ArtemisMQ" is incorrect. It should be "ActiveMQ Artemis." I don't mean to nit-pick, but naming is important.

@dileeproopreddy

We have a similar ActiveMQ setup in AWS (active/standby). The tech stack is ReactJS with Spring Boot (WebSockets) behind it. We were able to pass the two URLs of the active/standby pair, detect broker unavailability (via BrokerAvailabilityEvent), and toggle the URL for reconnection. This means existing connections (connected to the active node) are dropped, and a fresh WebSocket request has to be made (e.g. by a browser page refresh from the UI) to connect to the standby node. However, there is a limitation preventing this from happening seamlessly. Has anyone found a Netty code change that reconnects without closing existing connections? (Spring executing onComplete and running afterConnectionClosed() is the main caveat.)
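The URL toggle described above can be sketched broker-independently with JDK types; in a Spring app, the flip() call would typically live in an ApplicationListener for the availability event when the broker is reported down (the class and method names here are illustrative, not part of any framework):

```java
import java.net.InetSocketAddress;
import java.net.SocketAddress;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Active/standby toggle: get() hands out the current address;
// flip() switches between the two, to be called when the broker
// is reported unavailable (e.g. from a BrokerAvailabilityEvent
// listener seeing isBrokerAvailable() == false).
class ActiveStandbySupplier implements Supplier<SocketAddress> {

	private final InetSocketAddress active;
	private final InetSocketAddress standby;
	private final AtomicReference<InetSocketAddress> current;

	ActiveStandbySupplier(InetSocketAddress active, InetSocketAddress standby) {
		this.active = active;
		this.standby = standby;
		this.current = new AtomicReference<>(active);
	}

	@Override
	public SocketAddress get() {
		return current.get();
	}

	void flip() {
		current.updateAndGet(c -> c.equals(active) ? standby : active);
	}
}
```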


rstoyanchev commented Feb 6, 2024

@dileeproopreddy, the general issue of maintaining broker connectivity is non-trivial. Even with active/standby, on occasion the lack of connectivity may be longer than expected, and that becomes difficult to manage.

The system connection is the only one that originates from the server, and therefore must also be re-established by the server. All other connections to the broker are on behalf of WebSocket clients. In effect the server acts as a proxy between WebSocket clients and the message broker, also propagating heartbeats that are used by both sides to decide when the connection is lost. When the broker connections are lost, those proxy links are broken, and this is propagated to clients to decide what to do next.

Propagating the disconnect back to the origin distributes the effort of reconnecting. It also allows each client to decide independently when to reconnect and how many times to try, and to stop sending messages in the meantime. It would be non-trivial for the proxy server to buffer messages while the broker is unavailable, especially if the broker remains unavailable for an extended period. There are no good options for what to do with such accumulated messages.
