Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add (optional) retry logic to topology recovery #387

Closed
acogoluegnes opened this issue Aug 3, 2018 · 3 comments
Closed

Add (optional) retry logic to topology recovery #387

acogoluegnes opened this issue Aug 3, 2018 · 3 comments
Assignees
Milestone

Comments

@acogoluegnes
Copy link
Contributor

Topology recovery could benefit from retry logic on failed operations for some specific cases, e.g. failed binding because auto-delete queues has been deleted during the topology recovery.

The retry logic would be:

  • optional
  • configurable to some extent: retry condition (which error, which recoverable entity), retry action, number of retries, backoff policy
@acogoluegnes acogoluegnes self-assigned this Aug 3, 2018
@acogoluegnes acogoluegnes added this to the 4.8.0 milestone Aug 3, 2018
acogoluegnes added a commit that referenced this issue Aug 7, 2018
There's no topology recovery retry by default. The default
implementation is composable: not all have the recoverable entities have
to retry and the retry operations don't have to be only the
corresponding entity recovery, but also other operations, like
recovering the corresponding channel.

Fixes #387
acogoluegnes added a commit that referenced this issue Aug 10, 2018
Instead of 1 by default.

References #387
acogoluegnes added a commit that referenced this issue Aug 10, 2018
There's no topology recovery retry by default. The default
implementation is composable: not all have the recoverable entities have
to retry and the retry operations don't have to be only the
corresponding entity recovery, but also other operations, like
recovering the corresponding channel.

Fixes #387

(cherry picked from commit 34e33ea)
acogoluegnes added a commit that referenced this issue Aug 10, 2018
Instead of 1 by default.

References #387

(cherry picked from commit 9176062)
acogoluegnes added a commit that referenced this issue Aug 10, 2018
There's no topology recovery retry by default. The default
implementation is composable: not all have the recoverable entities have
to retry and the retry operations don't have to be only the
corresponding entity recovery, but also other operations, like
recovering the corresponding channel.

Fixes #387

(cherry picked from commit 34e33ea)

Conflicts:
	src/main/java/com/rabbitmq/client/ConnectionFactory.java
	src/main/java/com/rabbitmq/client/impl/ConnectionParams.java
	src/main/java/com/rabbitmq/client/impl/recovery/AutorecoveringConnection.java
	src/test/java/com/rabbitmq/client/test/TestUtils.java
	src/test/java/com/rabbitmq/client/test/functional/TopologyRecoveryFiltering.java
acogoluegnes added a commit that referenced this issue Aug 10, 2018
References #387

(cherry picked from commit 9711406)
acogoluegnes added a commit that referenced this issue Aug 10, 2018
acogoluegnes added a commit that referenced this issue Aug 10, 2018
@vikinghawk
Copy link
Contributor

vikinghawk commented Aug 10, 2018

I like the approach here and gave the latest milestone a try. Looks like the default of 2 retries was enough to fix my previous tests recovery errors.

Some feedback:

  • It would be nice to have visibility of the entity and exception in the logs when a failure has occurred and its going to retry
    • Log either in DefaultRecoveryHandler or TopologyRecoveryRetryLogic predicates?
    • Or maybe delegate to the connection's ExceptionHandler and let users decide what to log?
  • When the queue not found error happens during consumer recovery, it needs to recover all the bindings on the newly recovered queue as well.
  • Its not possible for users to write their own DefaultRetryHandler.RetryOperation implementations without using reflections to get at the package protected recoverBinding/recoverQueue/etc. methods on AutorecoveringConnection.
  • It would be nice to have a helper method (perhaps on TopologyRecoveryRetryLogic) that creates common scenarios such as the queue not found [1] for users.
    • So all users had to do was something like connectionFactory.setTopologyRecoveryRetryHandler(TopologyRecoveryRetryLogic.retryOnQueueNotFound());
    • This would make it easier and less error prone to consume
    • It also makes it easier to keep code portable between the 4.x and 5.x releases as it would hide the compile time differences such as RetryCondition in 4.x vs the functional predicates used in 5.x. We have apps that need to run on Java 1.7 but others that run on 1.8 and use the micrometer metrics. So currently i compile my code once to 1.7 with the 4.x version of amqp-client but then use 5.x at runtime when running on java 8. The 4.x and 5.x versions have been compatible for everything I am using so far.

[1] https://github.com/rabbitmq/rabbitmq-java-client/blob/master/src/test/java/com/rabbitmq/client/test/functional/TopologyRecoveryRetry.java#L61

acogoluegnes added a commit that referenced this issue Aug 13, 2018
Add log in default retry handler, add operation to recover all the
bindings of a queue (useful when the recovery of a consumer fails
because isn't found), make AutorecoveringConnection#recoverConsumer and
AutorecoveringConnection#recoverQueue public as they contain useful
logic that some client code should be able to use, and declared a
pre-configured retry handler for the deleted queue case.

References #387
acogoluegnes added a commit that referenced this issue Aug 13, 2018
Add log in default retry handler, add operation to recover all the
bindings of a queue (useful when the recovery of a consumer fails
because isn't found), make AutorecoveringConnection#recoverConsumer and
AutorecoveringConnection#recoverQueue public as they contain useful
logic that some client code should be able to use, and declared a
pre-configured retry handler for the deleted queue case.

References #387

(cherry picked from commit 2b8d257)
acogoluegnes added a commit that referenced this issue Aug 13, 2018
Add log in default retry handler, add operation to recover all the
bindings of a queue (useful when the recovery of a consumer fails
because isn't found), make AutorecoveringConnection#recoverConsumer and
AutorecoveringConnection#recoverQueue public as they contain useful
logic that some client code should be able to use, and declared a
pre-configured retry handler for the deleted queue case.

References #387

(cherry picked from commit 2b8d257)

Conflicts:
	src/main/java/com/rabbitmq/client/impl/recovery/DefaultRetryHandler.java
	src/main/java/com/rabbitmq/client/impl/recovery/TopologyRecoveryRetryLogic.java
@acogoluegnes
Copy link
Contributor Author

@vikinghawk Thanks for the feedback. I pushed a commit to address your remarks. Snapshots for 5.x and 4.x are available.

I made only recoverConsumer and recoverQueue public, as they contain logic that is worth re-using. For the other entities, calling RecordedEntity#recover should be enough.

@vikinghawk
Copy link
Contributor

+1, this looks good. I'll try to test again today and let you know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants