Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[event-hubs] processEvents ignoring maxWaitTime after retryable disconnect event #12278

Closed
chradek opened this issue Nov 4, 2020 · 0 comments · Fixed by #12280
Closed

[event-hubs] processEvents ignoring maxWaitTime after retryable disconnect event #12278

chradek opened this issue Nov 4, 2020 · 0 comments · Fixed by #12280
Assignees
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. Event Hubs

Comments

@chradek
Copy link
Contributor

chradek commented Nov 4, 2020

Summary of issue

The subscription created by EventHubConsumerClient.subscribe() can get into a bad state where the processEvents handler is invoked rapidly with 0 events after a connection is disconnected due to a retryable error. When in this state, the maxWaitTimeInSeconds parameter is ignored and instead the processEvents handler is called as quickly as possible.

Steps to reproduce

The following script reproduces the issue by forcing the underlying connection's idle to fire which triggers a retryable disconnected event:

const { EventHubConsumerClient } = require('@azure/event-hubs');

const connectionString = "";
const eventHubName = "";

const client = new EventHubConsumerClient(EventHubConsumerClient.defaultConsumerGroupName, connectionString, eventHubName);

const clientConnectionContext = client["_context"];

let shouldDisconnect = true;

const subscription = client.subscribe("0", {
  async processError(error, context) {
  },
  async processEvents(events, context) {
    if (shouldDisconnect) {
      clientConnectionContext.connection["_connection"].idle();
    }
    shouldDisconnect = false;
  }
}, {
  maxBatchSize: 100,
  maxWaitTimeInSeconds: 5
});

Root cause

When a connection receives a disconnected event, the event-hubs library creates a new amqp connection and closes the links on the existing connection:

for (const receiverName of Object.keys(connectionContext.receivers)) {
const receiver = connectionContext.receivers[receiverName];
await receiver.close().catch(() => {
/* error already logged, swallow it here */
});

Inside an already running subscription, receiveBatch is called on a loop until either a non-retryable error is encountered, or the subscription is closed:

while (this._isReceiving) {
try {
const receivedEvents = await this._receiver.receiveBatch(
this._processorOptions.maxBatchSize,
this._processorOptions.maxWaitTimeInSeconds,
this._abortController.signal
);

The receiveBatch call does a check to see if the receiver has been closed, and returns any events that have been collected thus far:

if (this._isClosed || this._context.wasConnectionCloseCalled) {
return resolve(receivedEvents);
}

Thus, when a disconnected event is triggered with a retryable error, the loop that calls receiveBatch doesn't exit, and receiveBatch returns immediately each time it is invoked.

Why didn't existing tests catch this?

We do have a test that is supposed to ensure that processEvents is called after a disconnected event occurs:

it("should receive after a disconnect", async () => {

However, it doesn't check that any events are returned on subsequent calls. It also doesn't verify that maxWaitTimeInSeconds is honored. The existing test should be updated to check both of these conditions.

@chradek chradek added bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. Event Hubs labels Nov 4, 2020
@chradek chradek added this to the [2020] December milestone Nov 4, 2020
@chradek chradek self-assigned this Nov 4, 2020
@ghost ghost closed this as completed in #12280 Nov 5, 2020
ghost pushed a commit that referenced this issue Nov 5, 2020
…r retryable disconnect event (#12280)

Fixes #12278 

## Description
This PR adds a check within the `PartitionPump`'s receive events loop to determine whether the receiver has been closed. If the receiver has been closed, it will create a new receiver using an event start position that matches the last event seen by the pump.

The receiver is explicitly closed when a disconnected event is received on the underlying AMQP connection, which causes calls to `receiveBatch` to immediately return any events it had collected up to this point. Note that once the receiver is closed, the receiver's onAmqpMessage handler is removed so it won't receive any additional events.

## Updates to testing
I manually tested the changes in this PR against the sample code in the linked issue.

I also updated the existing `disconnect` test to confirm that
- the `maxWaitTimeInSeconds` is honoured after a disconnected event is encountered.
- new events can be received on subsequent `processEvents` invocations.
openapi-sdkautomation bot pushed a commit to AzureSDKAutomation/azure-sdk-for-js that referenced this issue Jan 6, 2021
Network august release (Azure#12278)

* Adds base for updating Microsoft.Network from version stable/2020-07-01 to version 2020-08-01

* Updates readme

* Updates API version in new specs and examples

* add patch operation for express route gateway (Azure#11553)

* add patch

* fix example

* Added new cloud service NIC and PIP APIs (Azure#11650)

Co-authored-by: Richa Jain <ricjain@microsoft.com>

* Adding support for Vpn Link Connection Mode (Azure#11574)

Co-authored-by: Abhishek Shah <shabhis@microsoft.com>

* Reverting the changes made for address space update as the changes in service code are not in yet (Azure#11754)

Co-authored-by: Hari Prasad Perabattula <haperaba@microsoft.com>

* VPN NAT for Virtual WAN feature changes (Azure#11815)

* VPN NAT for Virtual WAN feature changes

* PrettierCheck fixes

* Incorporate review comments and update examples

* Add edge zone parameters for networking resources and add extendedLocation property to customIpPrefix (Azure#11933)

* Add extendedLocation property to customIpPrefix

* Fix the directory

* Address linting errors

* Fix another linting error

* Add edge zone parameter for network interfaces

* Looks like edgeZone parameter is working when creating network interfaces

* EdgeZone parameter for load balancer

* Add edge zone parameter for public IP address

* Add edge zone parameter for public IP prefix

* Add edgeZone parameter for virtual networks

* Add edge zone parameter for custom IP prefix

Co-authored-by: Will Ehrich <william.ehrich@microsoft.com>

* Add location parameter to Loadbalancer Backend Address Pool Properties Format (Azure#11919)

* adding location parameter to backendaddresspoolpropertiesformat

* ran prettier

* Support for Listing IKE Security Associations for Virtual Network Gateway Connections (Azure#11572)

* Support to List IKE SAs on VNG Connection

* Updating GetIkeSas

* Update virtualNetworkGateway.json

* Added location headers

* Update virtualNetworkGateway.json

* Prettier fix

* Update custom-words.txt

* Update virtualNetworkGateway.json

* Update custom-words.txt

* Update virtualNetworkGateway.json

* Update virtualNetworkGateway.json

* Update virtualNetworkGateway.json

Co-authored-by: Abhishek Shah <shabhis@microsoft.com>

* [Fix] GetIkeSas returns result as string (Azure#12225)

* Removing IkeSaParameters

* Update custom-words.txt

* Update virtualNetworkGateway.json

* Update virtualNetworkGateway.json

* Update VirtualNetworkGatewayConnectionGetIkeSas.json

* Update virtualNetworkGateway.json

* Update VirtualNetworkGatewayConnectionGetIkeSas.json

Co-authored-by: Abhishek Shah <shabhis@microsoft.com>

* Add extended location properties for private link service and private endpoints and remove edge zone properties (Azure#12039)

* Remove edge zone parameter

* Add extended location for private endpoint and private link service

* Add examples

* Capitalization

* Prettier

Co-authored-by: Will Ehrich <william.ehrich@microsoft.com>

* Add missing properties of SecurityRule, Route and RouteTable (Azure#12215)

* Add missing properties of SecurityRule Route and RouteTable

* Set resourceGuid field to be read only

Co-authored-by: Xu Wang <wax@microsoft.com>

* Added placeholder instead of password (Azure#12299)

Co-authored-by: nimaller <71352534+nimaller@users.noreply.github.com>
Co-authored-by: Richa Jain <richa.jain1912@gmail.com>
Co-authored-by: Richa Jain <ricjain@microsoft.com>
Co-authored-by: Abhishek Shah <shah.abhi7860@gmail.com>
Co-authored-by: Abhishek Shah <shabhis@microsoft.com>
Co-authored-by: Hari Prasad Perabattula <harics24@users.noreply.github.com>
Co-authored-by: Hari Prasad Perabattula <haperaba@microsoft.com>
Co-authored-by: Nilambari <nilamd@microsoft.com>
Co-authored-by: William Ehrich <wdehrich@gmail.com>
Co-authored-by: Will Ehrich <william.ehrich@microsoft.com>
Co-authored-by: Kayden Wilkinson <69224099+Kawilki-M@users.noreply.github.com>
Co-authored-by: Xu Wang <wangxu724@gmail.com>
Co-authored-by: Xu Wang <wax@microsoft.com>
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023
This issue was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug This issue requires a change to an existing behavior in the product in order to be resolved. Client This issue points to a problem in the data-plane of the library. Event Hubs
Projects
None yet
1 participant