Unexpected TIMEOUT WAITING FOR ACK with CachingConnectionFactory and correlated publisher-confirms #2907

Closed
kjastrzebski opened this issue Nov 18, 2024 · 2 comments

@kjastrzebski

In what version(s) of Spring AMQP are you seeing this issue?

It started happening after the upgrade to Spring Boot 3 (SB3).

Describe the bug

I am using CachingConnectionFactory with separate connections for publishing and consuming. Only a single thread publishes to RabbitMQ. CORRELATED publisher confirms are enabled.

Most of the time it works well. However, when the broker is under increased load and responds more slowly, I sometimes get a TIMEOUT WAITING FOR ACK exception from the CorrelationData future, always ~10s after the publish operation completes.

To Reproduce

The only way I managed to reproduce the problem is to "spam" the broker with publish operations so that it takes longer than 10s to confirm a message. I do not set a timeout on the CorrelationData future.

Expected behavior

I expect the CorrelationData future to wait for the ack for as long as I configure in my code.

Sample

I do not have a sample, but I can share my findings.
I believe that the maintenance of cached channels changed with SB3.

Since SB3, after a publish() operation, when the channel is returned to the cache, logicalClose() is executed on the channel's proxy.

If correlated publisher confirms are enabled and acks are not received immediately, the Spring AMQP code calls `publisherCallbackChannel.waitForConfirms(ASYNC_CLOSE_TIMEOUT);`, where `private static final int ASYNC_CLOSE_TIMEOUT = 5_000;`.

When this call times out, physicalClose() is executed. With correlated publisher confirms enabled, it calls asyncClose(), which in turn calls `channel.waitForConfirmsOrDie(ASYNC_CLOSE_TIMEOUT);` on the channel.

When the ack does not arrive within the next 5s (10s in total), the channel is forcefully closed with a TIMEOUT WAITING FOR ACK error, raised from `waitForConfirmsOrDie(ASYNC_CLOSE_TIMEOUT)`.
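
The timeline described above can be sketched as a small stand-alone model. This is not Spring AMQP code: the class `ConfirmCloseTimeline`, its `classify` method, and the outcome names are hypothetical, and the two wait durations are plain parameters (5 000 ms each in the behaviour described here).

```java
/** Hypothetical model of the two-stage wait described in this issue; not Spring AMQP code. */
final class ConfirmCloseTimeline {

    enum Outcome { CONFIRMED_DURING_FIRST_WAIT, CONFIRMED_DURING_SECOND_WAIT, FORCED_CLOSE }

    /**
     * Classifies a pending confirm that arrives confirmLatencyMs after the
     * channel is returned to the cache.
     *
     * firstWaitMs  models the waitForConfirms(...) call made on logical close.
     * secondWaitMs models the waitForConfirmsOrDie(...) call made on physical close.
     */
    static Outcome classify(long confirmLatencyMs, long firstWaitMs, long secondWaitMs) {
        if (confirmLatencyMs <= firstWaitMs) {
            return Outcome.CONFIRMED_DURING_FIRST_WAIT;
        }
        if (confirmLatencyMs <= firstWaitMs + secondWaitMs) {
            return Outcome.CONFIRMED_DURING_SECOND_WAIT;
        }
        // Neither wait saw the confirm: the channel is closed and the
        // pending confirm fails, regardless of what the caller wanted.
        return Outcome.FORCED_CLOSE;
    }

    public static void main(String[] args) {
        long asyncCloseTimeoutMs = 5_000; // the hard-coded value quoted above
        for (long latencyMs : new long[] {3_000, 7_000, 12_000}) {
            System.out.println(latencyMs + " ms -> "
                    + classify(latencyMs, asyncCloseTimeoutMs, asyncCloseTimeoutMs));
        }
    }
}
```

With both waits at 5 000 ms, any confirm slower than 10 000 ms ends in `FORCED_CLOSE`, which matches the ~10s delay I observe.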

I understand that some timeout is required to avoid leaking cached channels. However, IMO it should not be hard-coded but made configurable, with a reasonable default for most users.

I checked the source code of the SB2 spring-amqp module and do not see this channel close logic there; it waits for acks as long as necessary.

It would be nice to extend this timeout and/or at least make it configurable, so that users can decide how long they want to wait for the publish confirmation by calling .get(timeout) on the CorrelationData future.

For the time being, I am going to switch to ThreadChannelConnectionFactory and the simple publisher confirms approach.
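
To illustrate the caller-controlled wait I have in mind, here is a minimal sketch using only the JDK. A plain `CompletableFuture<Boolean>` stands in for the CorrelationData future (true for ack, false for nack); the class `ConfirmWaitSketch`, the method `awaitConfirm`, and the returned strings are hypothetical names for this illustration only.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; a CompletableFuture<Boolean> stands in for the confirm future. */
final class ConfirmWaitSketch {

    /** Waits up to timeoutMs for the confirm and reports what happened. */
    static String awaitConfirm(CompletableFuture<Boolean> confirm, long timeoutMs) {
        try {
            // The caller, not the connection factory, decides how long to wait.
            return confirm.get(timeoutMs, TimeUnit.MILLISECONDS) ? "ACK" : "NACK";
        } catch (TimeoutException e) {
            // No confirm yet; the future is still pending and can be awaited again.
            return "NO_CONFIRM_WITHIN_TIMEOUT";
        } catch (ExecutionException e) {
            return "FAILED";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return "INTERRUPTED";
        }
    }
}
```

The point of the sketch is that a timeout on `get` leaves the future pending, whereas the forced channel close described above fails it after a fixed delay that the caller cannot influence.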

@artembilan artembilan added this to the 3.2.0 milestone Nov 18, 2024
spring-builds pushed a commit that referenced this issue Nov 18, 2024
Fixes: #2907

Issue link: #2907

The current hard-coded `5 seconds` is not enough in real applications under heavy load

* Fix `CachingConnectionFactory` to use `getCloseTimeout()` for `publisherCallbackChannel.waitForConfirms()`
which is `30 seconds` by default, but can be modified via `CachingConnectionFactory.setCloseTimeout()`

(cherry picked from commit 562bc77)
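
A configuration along the following lines should pick up the new behaviour, assuming a Spring AMQP version that contains this fix. It is a sketch, not an official recommendation: the class name `RabbitPublisherConfig`, the host `localhost`, and the 60-second value are illustrative assumptions.

```java
import org.springframework.amqp.rabbit.connection.CachingConnectionFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class RabbitPublisherConfig { // hypothetical class name

    @Bean
    CachingConnectionFactory connectionFactory() {
        // "localhost" is an assumed broker host for illustration.
        CachingConnectionFactory factory = new CachingConnectionFactory("localhost");

        // Correlated publisher confirms, as used in this issue.
        factory.setPublisherConfirmType(CachingConnectionFactory.ConfirmType.CORRELATED);

        // Close timeout in milliseconds; per the commit message it defaults to
        // 30 seconds and is now used for publisherCallbackChannel.waitForConfirms().
        factory.setCloseTimeout(60_000); // illustrative value

        return factory;
    }
}
```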
@OLibutzki

In addition to being able to adjust the timeout, this change increases the default value from 5 to 30 seconds, right?

@artembilan
Member

That’s correct.
