The topic receiver silently failed. isClosed does not reflect #14535

Closed
bdelville opened this issue Mar 26, 2021 · 7 comments · Fixed by #15098
Labels: Client, customer-reported, needs-team-attention, question, Service Bus

Comments

@bdelville

bdelville commented Mar 26, 2021

  • @azure/service-bus: 7.0.3
  • Operating system: Alpine Linux
  • nodejs: 12.16.3

Describe the bug
The topic receiver failed silently. Calling isClosed afterwards does not report the receiver as disconnected.

To Reproduce
Steps to reproduce the behavior:

Hard to reproduce as it happened without anything special after running for 16 days on Kubernetes.
Actually running with full Azure logs activated to provide extra traces.

Expected behavior

If the receiver gets disconnected, either call MessageHandlers.processError or return true for receiver.isClosed.
This issue was not happening with version 1.x.
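
Until the library reports the failure itself, one possible workaround is an application-level watchdog. The sketch below is only an illustration, not part of @azure/service-bus: `ReceiverLike`, `checkReceiverHealth`, and `maxIdleMs` are hypothetical names, and the only library behaviour it relies on is that a receiver exposes an `isClosed` property. It combines that flag with the time since the last handler callback, so a receiver that has stopped delivering while still reporting `isClosed === false` can be detected.

```typescript
// Hypothetical minimal shape, so the sketch does not depend on the SDK.
// The real receiver exposes an `isClosed` property with the same meaning.
interface ReceiverLike {
  readonly isClosed: boolean;
}

type Health = "ok" | "closed" | "stalled";

/**
 * Classifies receiver health from two signals: the isClosed flag and the
 * time elapsed since a message or error handler last ran.
 */
function checkReceiverHealth(
  receiver: ReceiverLike,
  lastActivityMs: number, // timestamp recorded by the handlers
  nowMs: number,
  maxIdleMs: number // assumption: longest silence the application tolerates
): Health {
  if (receiver.isClosed) {
    return "closed";
  }
  if (nowMs - lastActivityMs > maxIdleMs) {
    return "stalled";
  }
  return "ok";
}
```

In use, the application would record `Date.now()` inside both processMessage and processError, call `checkReceiverHealth` on a timer, and close and recreate the receiver when the result is not "ok". A quiet topic also looks stalled, so `maxIdleMs` has to be chosen from the expected traffic.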

@ghost ghost added needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. customer-reported Issues that are reported by GitHub users external to the Azure organization. question The issue doesn't require a change to the product in order to be resolved. Most issues start as that labels Mar 26, 2021
@ramya-rao-a ramya-rao-a added Client This issue points to a problem in the data-plane of the library. Service Bus labels Mar 26, 2021
@ghost ghost removed the needs-triage Workflow: This is a new issue that needs to be triaged to the appropriate team. label Mar 26, 2021
@ghost ghost added the needs-team-attention Workflow: This issue needs attention from Azure service team or SDK team label Mar 26, 2021
@ramya-rao-a
Contributor

Hey @bdelville

Apologies for the late response here.

Actually running with full Azure logs activated to provide extra traces.

Do you mean to say that you are now running with all logs activated?

Without client side logs, there is not much we can do.

If you have the timestamp on when this happened, you can log an Azure support case where the support folks can then look at service side logs to see if there was anything unexpected that happened during that time on your Service Bus instance.

For now we will close this issue.
Please feel free to open a new one if you have client-side logs the next time this happens, or open an Azure support case to investigate the last occurrence.

@bdelville
Author

I was saying that we are running the instance with debug logs activated for "rhea" and "azure".
However this is still happening. I will re-open once we can give you more traces.

@ramya-rao-a
Contributor

Thanks @bdelville

When you get the logs, please open an Azure support ticket with the logs, the timestamp for when you see the issue, your service bus instance details, topic and subscription name. This will allow the support folks to query the service side logs to look for any abnormalities. Also ask them to loop in the Azure SDK team to look at the client side logs that you would be sharing. Feel free to use this issue as a reference.

@bdelville
Author

Hello, we reproduced and now have the logs.
However this is something that never occurred with the library version 1.x, so the issue is certainly linked in some way with the library.
topic-stream-error.log

@ramya-rao-a
Contributor

Thanks for the logs @bdelville

We can start looking into this for any client-side issues, but I highly recommend opening an Azure support ticket with the details listed in my previous comment #14535 (comment), both to rule out service-side issues and to support you better.

@ramya-rao-a ramya-rao-a reopened this Apr 30, 2021
@bdelville
Author

Thanks for looking into it,
I will also open a ticket with Azure and link it to this issue.

richardpark-msft added a commit that referenced this issue May 8, 2021
This PR has a few changes in it, primarily to improve our robustness and our reporting:

General reliability improvements:
- Migrates to a workflow that treats subscription start as a retryable entity, rather than just link creation (which is what we had previously).
- It checks and throws exceptions on much more granular conditions, particularly in addCredit
- Error checking and handling has been migrated to be in far fewer spots and to be more unconditional, which should hopefully eliminate any areas where an exception or error could occur but it never gets forwarded or seen.
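
As a rough illustration of the first and third points (this is not the code from the PR), the sketch below retries a whole startup routine rather than a single step inside it, and reports every failure through one callback so that no error is swallowed. `retryStartup`, `start`, and `onError` are hypothetical names, and the fixed delay is an assumption; a real implementation would likely use a retry policy with backoff.

```typescript
/**
 * Retries an entire multi-step startup routine. Every failure is passed to
 * onError, and the last error is rethrown once the attempts are exhausted.
 */
async function retryStartup<T>(
  start: () => Promise<T>, // hypothetical: e.g. create link, init, add credit
  maxAttempts: number,
  delayMs: number,
  onError: (err: unknown, attempt: number) => void
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await start();
    } catch (err) {
      lastError = err;
      onError(err, attempt); // single reporting point for all failures
      if (attempt < maxAttempts) {
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError;
}
```

Because `start` wraps every step, a failure in a later step (such as adding credit) triggers the same retry path as a failure in link creation.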

SDK debugging:
- Adds a new SDK-only flag (forwardInternalErrors) which makes areas that used to eat errors forward them to processError instead. Prior to this the errors were only logged, which meant they could be missed. Most of these would probably be considered cosmetic by customers, so this is primarily for debugging purposes within the SDK itself.
- The internal `processInitialize` handler has been split into two (still internal) handlers - preInitialize and postInitialize. preInitialize runs before init(), and postInitialize runs after init() but before addCredit. This lets us write more reliable tests. These are not exposed to customers.

Fixes #14535
@richardpark-msft
Member

Hi @bdelville , we've recently released a fix that should address the error message you saw by being more aggressive about retries:

https://www.npmjs.com/package/@azure/service-bus/v/7.1.0

@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023