AMQP session timeout #109
Comments
Here is the list of behaviors enforced in the service side related to connection management for Azure Service Bus (afaik).
To handle this, in https://github.com/Azure/azure-service-bus-go we recover when sending and receiving if we run into an error we'd classify as recoverable. There are some event types which cause Just a thought, but if there were to be a "recovery policy" that could be provided to a connection or some other entity, it might be generic enough to be useful across broker implementations. @amarzavery I know you've been digging into recovery details lately. Please correct me if the details I've provided above are incorrect. |
The details you provided are correct. It would be nice if the protocol library could provide a hook/policy that helps users do custom things for recovery. For example, in Azure Event Hubs, if a receiver link goes down for some reason and we decide to recover, we would like to re-establish the link by setting the offset of the last received message as a filter on the link. This lets us avoid receiving messages again from the start, or from whatever offset/time was provided when the link was initially created. Having a callback/hook to do something custom like this during recovery would be extremely beneficial. |
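To make the "resume from the last received offset" idea concrete, here is a minimal sketch. It does not use the real library: `Message`, `Link`, `OpenFunc`, `ResumingReceiver` and `fakeLink` are all hypothetical names invented for illustration, and the "filter" is modelled as a plain offset argument to the open hook.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a stand-in for a received event; Offset is a hypothetical field.
type Message struct {
	Offset int64
	Body   string
}

// Link is a hypothetical receiver link.
type Link interface {
	Receive() (Message, error)
}

// OpenFunc is the assumed user hook: open a link that starts after the given offset.
type OpenFunc func(afterOffset int64) (Link, error)

// ResumingReceiver remembers the last offset it delivered and reopens from there.
type ResumingReceiver struct {
	open       OpenFunc
	link       Link
	lastOffset int64
}

// Receive returns the next message, making one recovery attempt if the link fails.
func (r *ResumingReceiver) Receive() (Message, error) {
	for attempt := 0; attempt < 2; attempt++ {
		if r.link == nil {
			l, err := r.open(r.lastOffset)
			if err != nil {
				return Message{}, err
			}
			r.link = l
		}
		msg, err := r.link.Receive()
		if err == nil {
			r.lastOffset = msg.Offset
			return msg, nil
		}
		r.link = nil // drop the failed link; the next pass reopens after lastOffset
	}
	return Message{}, errors.New("receive failed after one recovery attempt")
}

// fakeLink simulates a link that detaches after a number of successful reads.
type fakeLink struct {
	pending   []Message
	failAfter int // successful reads left before detaching; negative means never
}

func (f *fakeLink) Receive() (Message, error) {
	if f.failAfter == 0 || len(f.pending) == 0 {
		return Message{}, errors.New("link detached")
	}
	f.failAfter--
	m := f.pending[0]
	f.pending = f.pending[1:]
	return m, nil
}

// demo reads four messages across one simulated detach and returns their offsets.
func demo() []int64 {
	log := []Message{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}}
	opens := 0
	open := func(after int64) (Link, error) {
		opens++
		l := &fakeLink{failAfter: -1}
		for _, m := range log {
			if m.Offset > after {
				l.pending = append(l.pending, m)
			}
		}
		if opens == 1 {
			l.failAfter = 2 // the first link drops after two messages
		}
		return l, nil
	}
	r := &ResumingReceiver{open: open}
	var got []int64
	for i := 0; i < 4; i++ {
		m, err := r.Receive()
		if err != nil {
			break
		}
		got = append(got, m.Offset)
	}
	return got
}

func main() {
	fmt.Println(demo())
}
```

The point of the sketch is only that the recovery hook needs access to state (the last offset) that the library itself may not track, which is why a user-supplied callback is attractive here.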
Providing a way to be notified on link detach/allow recovery seems like a good idea. Does someone want to propose an API? I would prefer it to use a callback approach rather than channels. |
Sure I'll take a shot at an API, very rough thinking at the moment but as you have the LinkOption already in the API my preference would be to use this as it will prevent a breaking change and keep things simple.
Here are some very rough (not compiled) thoughts as a first draft. Do you think this is along the right lines? Not set on this approach just getting the ball rolling.
    // LinkRecoveryFunc is invoked when a link error occurs and
    // allows you to create a new link using the newLink func or return an error
    // which will be propagated to the sender/receiver next time they are used
    type LinkRecoveryFunc func(linkError error, newLink func() (*link, error)) (*link, error)
    func LinkRecoveryOption(recoveryFunc LinkRecoveryFunc) LinkOption {
        return func(l *link) error {
            l.recoveryFunc = func(linkError error) (*link, error) {
                return recoveryFunc(linkError, func() (*link, error) { return newLink(l.session, l.receiver, l.options) })
            }
            return nil
        }
    }
When the link experiences an error then it can invoke l.recoveryFunc with that error and either start using the new link returned or propagate the error.
This could then be used like this when creating a sender:
    // CreateAmqpSender makes a sender which reconnects when a link detaches
    func (l *AmqpConnection) CreateAmqpSender(topic string) (*amqp.Sender, error) {
        if l.Session == nil {
            log.WithField("currentListener", l).Panic("Cannot create amqp listener without a session already configured")
        }
        return l.Session.NewSender(
            amqp.LinkTargetAddress("/" + topic),
            amqp.LinkRecoveryOption(func(linkError error, newLink func() (*link, error)) (*link, error) {
                if _, isDetachError := linkError.(amqp.DetachError); isDetachError {
                    return newLink()
                }
                return nil, linkError
            }),
        )
    }
|
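For anyone who wants to play with the shape of this proposal, the following is a self-contained toy model that compiles on its own. It is not the library's code: the `link` struct, `newLink`, `Sender` and `errDetach` here are simplified stand-ins, and only the `LinkRecoveryFunc` / `LinkRecoveryOption` shape is taken from the draft above.

```go
package main

import (
	"errors"
	"fmt"
)

// errDetach is a stand-in for whatever error a detached link would report.
var errDetach = errors.New("link detached")

// link is a toy model; its fields do not mirror the real library.
type link struct {
	id           int
	broken       bool
	recoveryFunc func(linkErr error) (*link, error)
}

// LinkOption configures a link at creation time.
type LinkOption func(*link) error

// LinkRecoveryFunc follows the shape proposed in the thread.
type LinkRecoveryFunc func(linkErr error, open func() (*link, error)) (*link, error)

var nextID = 1

// newLink builds a toy link and applies the options to it.
func newLink(opts ...LinkOption) (*link, error) {
	l := &link{id: nextID}
	nextID++
	for _, o := range opts {
		if err := o(l); err != nil {
			return nil, err
		}
	}
	return l, nil
}

// LinkRecoveryOption installs f as the link's recovery policy.
func LinkRecoveryOption(f LinkRecoveryFunc) LinkOption {
	return func(l *link) error {
		l.recoveryFunc = func(linkErr error) (*link, error) {
			// The replacement link is created with the same recovery policy.
			return f(linkErr, func() (*link, error) { return newLink(LinkRecoveryOption(f)) })
		}
		return nil
	}
}

// Sender holds the current link and swaps it when recovery succeeds.
type Sender struct{ l *link }

// Send pretends to send; on a broken link it consults the recovery policy first.
func (s *Sender) Send(msg string) error {
	if !s.l.broken {
		return nil
	}
	if s.l.recoveryFunc == nil {
		return errDetach
	}
	nl, err := s.l.recoveryFunc(errDetach)
	if err != nil {
		return err
	}
	s.l = nl
	return nil // toy model: assume the send succeeds on the fresh link
}

// demo breaks a link, sends once, and reports the link ids before and after.
func demo() (oldID, newID int, err error) {
	l, err := newLink(LinkRecoveryOption(func(linkErr error, open func() (*link, error)) (*link, error) {
		if errors.Is(linkErr, errDetach) {
			return open()
		}
		return nil, linkErr
	}))
	if err != nil {
		return 0, 0, err
	}
	s := &Sender{l: l}
	oldID = s.l.id
	s.l.broken = true // simulate the service detaching an idle link
	err = s.Send("hello")
	return oldID, s.l.id, err
}

func main() {
	oldID, newID, err := demo()
	fmt.Println(oldID != newID, err)
}
```

One design point the toy makes visible: the replacement link has to carry the recovery option forward itself, otherwise only the first failure would be recoverable.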
The electron API (an alternative Go API for AMQP) provides a channel:
https://github.com/alanconway/qpid-proton/blob/cpp-null/go/src/qpid.apache.org/electron/endpoint.go#L58
and an Error() method to get error details - this is consistent for all
endpoints (Connection, Link, Session)
Having written Go APIs that do both, I'd recommend a channel over a
callback. There's some detailed thoughts about why here:
https://github.com/alanconway/qpid-proton/blob/cpp-null/go/examples/README.md#a-tale-of-two-brokers
The short story: callbacks are a return to inversion-of-control, a style of
programming that Go finally allows us to escape from. A single function
that loops and selects over multiple channels is easier to read and
maintain than a scattering of itty-bitty callback functions, each handling
a fraction of the problem, each lacking the overall context of the loop,
and ultimately each needing to communicate results and problems to some
other code via a mechanism such as ... a channel.
|
Gahh, sorry - sent links to my devel branch! Here are some more stable
links:
https://github.com/apache/qpid-proton/blob/master/go/examples/README.md#a-tale-of-two-brokers
https://godoc.org/qpid.apache.org/electron#Endpoint
|
What @alanconway suggested is eerily similar to what we've implemented in the Azure Service Bus and Event Hubs libraries for message receivers.
    // ListenerHandle provides the ability to close or listen to the close of a Receiver
    type ListenerHandle struct {
        r   *receiver
        ctx context.Context
    }
    // Close will close the listener
    func (lc *ListenerHandle) Close(ctx context.Context) error {
        return lc.r.Close(ctx)
    }
    // Done will close the channel when the listener has stopped
    func (lc *ListenerHandle) Done() <-chan struct{} {
        return lc.ctx.Done()
    }
    // Err will return the last error encountered
    func (lc *ListenerHandle) Err() error {
        if lc.r.lastError != nil {
            return lc.r.lastError
        }
        return lc.ctx.Err()
    }
https://github.com/Azure/azure-service-bus-go/blob/607999369044f648929f37d3c925c086b21a5a06/receiver.go#L339-L355
This pattern is helpful for consumers to understand when something has failed, but I don't know that this pattern deals with recovery. This would still leave recovery in the hands of the consumer of the API. If recovery is to be handled effectively, I'd imagine the consumer of the API would need to provide at least two things.
1. filter criteria to identify the errors that are recoverable
2. a backoff policy for recovery (linear, exponential...)
|
On Tue, Jul 3, 2018 at 11:46 AM, David Justice ***@***.***> wrote:
> What @alanconway suggested is eerily similar to what we've implemented
> in the Azure Service Bus and Event Hubs libraries for message receivers.
I've seen this pattern in the standard libraries, but I haven't been using Go long enough to say confidently that it's a "well known pattern". It works nicely to tie related "waiting" together, e.g. suppose you're reading a channel and sending messages using the data from the channel. You want to know immediately if the link closes or fails, even if you're not actively sending when it fails, and you have a timeout to respect on top of it all:
    // NB: This is not vcabbage code! https://godoc.org/qpid.apache.org
    select {
    case data, ok := <-channelWithData:
        if ok {
            err := sender.Send(MessageMadeOf(data))
            if err != nil { /* handle err */ }
        } else {
            // no more data
        }
    case <-sender.Done():
        if sender.Error() == nil {
            // orderly close
        } else {
            // link failed, handle sender.Error()
        }
    case <-time.After(n * time.Millisecond):
        // timeout expired
    }
Very neat!
> https://github.com/Azure/azure-service-bus-go/blob/607999369044f648929f37d3c925c086b21a5a06/receiver.go#L339-L355
> This pattern is helpful for consumers to understand when something has
> failed, but I don't know that this pattern deals with recovery. This would
> still leave recovery in the hands of the consumer of the API.
> If recovery is to be handled effectively, I'd imagine the consumer of the
> API would need to provide at least two things.
> 1. a filter criteria to identify the errors that are recoverable
> 2. a backoff policy for recovery (linear, exponential...)
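As a rough illustration of those two inputs (an error filter and a backoff policy), here is a small sketch. `RecoveryPolicy`, `ExponentialBackoff`, `errDetached` and `errUnauthorized` are names made up for this example and are not part of any of the libraries discussed; the durations are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// RecoveryPolicy bundles the two inputs: an error filter and a backoff schedule.
type RecoveryPolicy struct {
	IsRecoverable func(error) bool
	Backoff       func(attempt int) time.Duration
	MaxAttempts   int
}

// ExponentialBackoff doubles the delay on each attempt, capped at max.
func ExponentialBackoff(base, max time.Duration) func(int) time.Duration {
	return func(attempt int) time.Duration {
		d := base
		for i := 0; i < attempt && d < max; i++ {
			d *= 2
		}
		if d > max {
			d = max
		}
		return d
	}
}

// Do runs op, retrying only errors the policy classifies as recoverable.
// It waits between attempts, never before the first or after the last.
func (p RecoveryPolicy) Do(op func() error) error {
	var err error
	for attempt := 0; attempt < p.MaxAttempts; attempt++ {
		if attempt > 0 {
			time.Sleep(p.Backoff(attempt - 1))
		}
		if err = op(); err == nil || !p.IsRecoverable(err) {
			return err
		}
	}
	return err
}

// Hypothetical error values standing in for broker errors.
var (
	errDetached     = errors.New("link detached")
	errUnauthorized = errors.New("unauthorized")
)

func newPolicy() RecoveryPolicy {
	return RecoveryPolicy{
		IsRecoverable: func(err error) bool { return errors.Is(err, errDetached) },
		Backoff:       ExponentialBackoff(time.Millisecond, 8*time.Millisecond),
		MaxAttempts:   5,
	}
}

// demoRecoverable fails twice with a recoverable error, then succeeds.
func demoRecoverable() (calls int, err error) {
	err = newPolicy().Do(func() error {
		calls++
		if calls < 3 {
			return errDetached
		}
		return nil
	})
	return calls, err
}

// demoFatal fails with an error the filter rejects, so there is no retry.
func demoFatal() (calls int, err error) {
	err = newPolicy().Do(func() error {
		calls++
		return errUnauthorized
	})
	return calls, err
}

func main() {
	fmt.Println(demoRecoverable())
	fmt.Println(demoFatal())
}
```

A linear policy would just be another `func(attempt int) time.Duration`, so the filter and the schedule stay independent of each other.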
|
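Exposing channels in a public API can get rather complex. Not so much for a simple "done" channel, since it can be closed and that's it. For other uses it's not so straightforward. (There's a post detailing some examples and guidelines: https://inconshreveable.com/07-08-2014/principles-of-designing-go-apis-with-channels/)
Referring to @alanconway's example, what is expected of the consumer when both channelWithData and sender.Done() can be received from? When multiple select cases can proceed one is chosen at random (uniformly pseudo-random, to be precise). If channelWithData is buffered, the client may still want to process any pending messages, making correct usage more involved. In fact, the internal implementation of Receiver.Receive has exactly this problem, which needs to be dealt with.
In summary, I prefer to keep these details internal to the package, exposing a synchronous API where possible. In this case, a synchronous API doesn't make much sense, so a callback is the next best option in my opinion. Neither of these approaches prevents the user from creating and using channels themselves, if that's more convenient for them. I'm not saying that this lib can never expose channels, but I'd want to see some reasons as to why a channel is significantly better than the alternatives. |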
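To illustrate the concern about a data channel and a done channel being ready at the same time: Go's select picks among ready cases at random, so a consumer that sees "done" may still have buffered items waiting. The sketch below shows one way a consumer could cope, by draining without blocking once done fires. The `consume` function is purely illustrative and not taken from any library.

```go
package main

import "fmt"

// consume reads from data until done is closed, then drains whatever is still
// buffered so that no pending item is lost.
func consume(data <-chan int, done <-chan struct{}) []int {
	var got []int
	for {
		select {
		case v, ok := <-data:
			if !ok {
				return got // data channel closed: nothing more will arrive
			}
			got = append(got, v)
		case <-done:
			// done can win the select even while data still holds buffered
			// items, so drain them with a non-blocking loop before returning.
			for {
				select {
				case v, ok := <-data:
					if !ok {
						return got
					}
					got = append(got, v)
				default:
					return got
				}
			}
		}
	}
}

func main() {
	data := make(chan int, 3)
	done := make(chan struct{})
	data <- 1
	data <- 2
	data <- 3
	close(done) // both channels are now ready at the same time

	fmt.Println(consume(data, done))
}
```

Whichever case the runtime happens to pick first, all three buffered values are returned in order. The extra drain loop is exactly the kind of detail every consumer would have to get right if the channel were part of the public API.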
@lawrencegripper I think creating a new Perhaps modifying the type LinkRecoveryFunc func(linkError error, newLink func(opts ...LinkOption) (*link, error)) (*link, error) When |
On Wed, Jul 4, 2018 at 12:20 PM, Kale Blankenship ***@***.***> wrote:
> Exposing channels in a public API can get rather complex. Not so much for
> a simple "done" channel, since it can be closed and that's it. For other
> uses it's not so straightforward. (There's a post
> <https://inconshreveable.com/07-08-2014/principles-of-designing-go-apis-with-channels/>
> detailing some examples and guidelines.)
> Referring to @alanconway's example, what
> is expected of the consumer when both channelWithData and sender.Done()
> can be received from? When multiple select cases can proceed one is chosen
> at random (uniformly pseudo-random, to be precise). If channelWithData is
> buffered, the client may still want to process any pending messages, making
> correct usage more involved. In fact, the internal implementation of
> Receiver.Receive has exactly this problem, which needs to be dealt with.
There is sufficient state to deal with that in the Sender. Regardless of
why you wake up you'll get an error from Sender.Send() if it has been
closed. Indeed multiple wake-ups isn't the only way this can happen - it's
perfectly possible for another goroutine to close the Sender after you wake
from select and before you try to Send(). Whether you use select or not you
always need to handle errors in the final Send() call.
Agreed with risks of channels for general use but I think the Done case is
not a bad idea. Of course you can use a callback to close a channel so a
callback approach doesn't preclude waiting concurrently but it makes it
more complex.
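The remark that a callback can be used to close a channel could look roughly like this. `Notifier` and its methods are hypothetical names for this sketch; the only assumption is that the library would call some user-supplied function once (or possibly more than once) when the link closes or fails.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Notifier adapts a close/error callback into a Done channel plus an Err accessor.
type Notifier struct {
	once sync.Once
	done chan struct{}
	mu   sync.Mutex
	err  error
}

// NewNotifier returns a Notifier whose Done channel is still open.
func NewNotifier() *Notifier {
	return &Notifier{done: make(chan struct{})}
}

// OnClosed is the function you would register as the callback.
// Only the first call has any effect, so repeated callbacks are harmless.
func (n *Notifier) OnClosed(err error) {
	n.once.Do(func() {
		n.mu.Lock()
		n.err = err
		n.mu.Unlock()
		close(n.done)
	})
}

// Done is closed once OnClosed has been called.
func (n *Notifier) Done() <-chan struct{} { return n.done }

// Err returns the error passed to the first OnClosed call (nil for an orderly close).
func (n *Notifier) Err() error {
	n.mu.Lock()
	defer n.mu.Unlock()
	return n.err
}

// demo fires the callback from another goroutine and waits on Done.
func demo() string {
	n := NewNotifier()
	go func() {
		n.OnClosed(errors.New("link detached"))
		n.OnClosed(nil) // ignored: the first call wins
	}()
	<-n.Done()
	return n.Err().Error()
}

func main() {
	fmt.Println(demo())
}
```

So a callback-based API does not stop a caller from selecting on a done channel; it just moves this small adapter into user code.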
|
@vcabbage Ok makes sense to me, I'll take a stab at building a PR out of the proposal, hopefully get some time tomorrow if things go well. |
So it's taken me longer than I'd have liked to wrap my head around the changes needed. Before I dive any further into this I wanted to come up for air and sanity check some stuff @vcabbage. Here are the cases where I think the library should handle a failure gracefully using this method:
The first feels like it should be handled with a separate For the rest I think the existing In terms of where the
I'm a bit unsure how to handle the link change within the Sorry if I'm way off track with this, initially looked at adding the recovery into the mux functions but it looks like sitting them on the higher level session object will be cleaner - any advice thoughts very welcome |
@lawrencegripper There are at least a few places this lib has grown to be a bit unwieldy, sorry about that. I need to do another round of cleanup at some point. I'm not sure I'm willing to introduce a connection recovery mechanism. That seems much more involved than recovering individual links. Though I'm always willing to be convinced that it's the right thing to do if there's a real need. That looks roughly correct for the Sender. A couple notes:
For the Receiver, I think the recovery should happen if |
Hi,
I'm seeing the following when using the library with Azure Service Bus. The program creates a session and sender and then holds them open, sometimes for more than 10 minutes, without sending anything (it's a web app which sends a message on receiving a request). After 10 minutes idle, sending a message fails because the link has been detached for being idle.
I want to add a check which allows me to reconnect when a link is detached, but I can't find a channel I can watch for this event on either the sender or session. @devigned I wondered if you saw something similar in your work?
Would this be a useful thing to add, as I'd also like to reconnect on any linkdetached events? I could take a look at making a PR which adds a channel to the session; does that sound like a good idea?
P.S. I'm still relatively new to AMQP, so hopefully I've got the terminology right, but apologies if I've misunderstood something and this can be worked around with the existing library.
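Until something like that exists in the library, one possible user-side workaround is to wrap the sender and rebuild it when a send fails with a detach error. The sketch below is only a model of that idea: the `sender` interface, `ReconnectingSender`, the `dial` function and `errLinkDetached` are all hypothetical stand-ins rather than real library types.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errLinkDetached is a stand-in for the library's detach error.
var errLinkDetached = errors.New("link detached")

// sender is the minimal behaviour the wrapper needs (hypothetical interface).
type sender interface {
	Send(msg string) error
}

// ReconnectingSender recreates its sender when a send fails with a detach error.
// The mutex makes it safe to share between concurrent request handlers.
type ReconnectingSender struct {
	mu      sync.Mutex
	current sender
	dial    func() (sender, error) // creates a fresh sender (and session, if needed)
}

// Send delivers msg, rebuilding the sender and retrying once if the link was detached.
func (r *ReconnectingSender) Send(msg string) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.current == nil {
		s, err := r.dial()
		if err != nil {
			return err
		}
		r.current = s
	}
	err := r.current.Send(msg)
	if !errors.Is(err, errLinkDetached) {
		return err // success, or an error we do not try to recover from
	}
	// The link went away (for example after an idle timeout): rebuild and retry once.
	s, dialErr := r.dial()
	if dialErr != nil {
		return dialErr
	}
	r.current = s
	return r.current.Send(msg)
}

// flakySender simulates a sender whose link may already be detached.
type flakySender struct {
	detached bool
	sent     *[]string
}

func (f *flakySender) Send(msg string) error {
	if f.detached {
		return errLinkDetached
	}
	*f.sent = append(*f.sent, msg)
	return nil
}

// demo sends one message through a wrapper whose first sender is detached.
func demo() (dials int, sent []string, err error) {
	r := &ReconnectingSender{dial: func() (sender, error) {
		dials++
		// Only the first sender is detached, as if it had idled too long.
		return &flakySender{detached: dials == 1, sent: &sent}, nil
	}}
	err = r.Send("hello")
	return dials, sent, err
}

func main() {
	fmt.Println(demo())
}
```

This only retries once and treats every detach as recoverable, which may be too optimistic for real use; it is meant to show where the reconnect logic would sit, not how it should be tuned.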
Error message: