[Service Bus] receiveMessages in queue receiver in peekLock mode will cause message lost(lock) near error msg "Received transfer when credit was 0". #15606
Hello @enoeden, I apologize for the late reply on this issue. That error message is a big tell and it's given me a good area to look at, so I appreciate you providing so much information here. I will be looking at this more tomorrow morning. Just a couple of observations/questions:
Hi @richardpark-msft, for question 1: our use case here is a task dispatch server that stores tasks in a message queue. We built a thin wrapper around Service Bus and AWS SQS (for cross-platform support) and expose a task-retrieval API with features like long polling (maxWaitTime) and batch receive. Creating one receiver per request and closing it after a single receive was just meant to simplify the implementation on the server side; when it comes to receiver reuse in our case, we have concerns about fault isolation and receive performance. With the patch below to ext-store.js we do reuse the receiver, but some odd behavior stopped us; correct me if I've misunderstood anything. @@ -43,10 +43,10 @@
}
const release = await this.mutex.acquire();
try {
- let queueReceiver = this.serviceBusClient.createReceiver(`${queueName}`, { maxAutoLockRenewalDurationInMs: 0});
- //let queueReceiver = this.getReceiver(queueName);
+ //let queueReceiver = this.serviceBusClient.createReceiver(`${queueName}`, { maxAutoLockRenewalDurationInMs: 0});
+ let queueReceiver = this.getReceiver(queueName);
let msgs = await queueReceiver.receiveMessages(parseInt(num), {maxWaitTimeInMs: pullDuration * 1000});
- await queueReceiver.close();
+ //await queueReceiver.close();
return msgs;
}catch (e) {
this.logger.error(`${e}`);
@@ -59,10 +59,10 @@
const release = await this.mutex.acquire();
try{
this.logger.debug(`azure delMessage ${msg} from ${queueName}`);
- let queueReceiver = this.serviceBusClient.createReceiver(`${queueName}`, { maxAutoLockRenewalDurationInMs: 0});
- //let queueReceiver = this.getReceiver(queueName);
+ //let queueReceiver = this.serviceBusClient.createReceiver(`${queueName}`, { maxAutoLockRenewalDurationInMs: 0});
+ let queueReceiver = this.getReceiver(queueName);
await queueReceiver.completeMessage(msg);
- await queueReceiver.close();
+ //await queueReceiver.close();
}catch(error){
this.logger.info(error);
} finally {
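The reuse pattern in the patch above could be sketched as a small per-queue receiver cache. This is a hypothetical stand-in for the `getReceiver` helper referenced in the patch, assuming only a client object with the same `createReceiver(queueName, options)` shape as `ServiceBusClient`:

```javascript
// Hypothetical sketch of a per-queue receiver cache (not the actual ext-store.js
// implementation): create each receiver once, reuse it across receive calls,
// and close everything on shutdown instead of after every receive.
class ReceiverCache {
  constructor(client) {
    this.client = client;           // anything exposing createReceiver(name, opts)
    this.receivers = new Map();     // queueName -> receiver
  }

  getReceiver(queueName) {
    if (!this.receivers.has(queueName)) {
      this.receivers.set(
        queueName,
        this.client.createReceiver(queueName, { maxAutoLockRenewalDurationInMs: 0 })
      );
    }
    return this.receivers.get(queueName);
  }

  async closeAll() {
    for (const receiver of this.receivers.values()) {
      await receiver.close();
    }
    this.receivers.clear();
  }
}
```

One design note: caching the receiver avoids paying the AMQP link setup/teardown cost on every request, at the price of shared fault state per queue, which is exactly the fault-isolation trade-off mentioned above.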
@enoeden - now that I understand your use case, your model makes a lot of sense. As a status update: I'm still trying to narrow down the bug somewhat. I was able to reproduce the condition that leads to the `Received transfer when credit was 0` error.
…ete (#15989)

Fixing an issue where we could lose messages or provoke an alarming message from rhea (`Received transfer when credit was 0`).

The message-loss issue is related to how we trigger 'drain' using `addCredit(1)`. Our `receiver.drain; receiver.addCredit(1)` pattern actually does add a credit, which shows up in the flow frame that gets sent for our drain. This has led to occasionally receiving more messages than we intended. The second part of this was that we were masking the error, because we had code that specifically threw out messages if more arrived than were requested. If the message was being auto-renewed it's possible for the message to appear to be missing, and if we were in receiveAndDelete the message is effectively lost at that point. That code is now removed (we defer to just allowing the extra message, should a bug arise that causes that) and we log an error indicating it did happen.

The rhea error message appeared to be triggered by our accidentally allowing multiple overlapping drains (finalAction did not check whether we were _already_ draining and would allow it to happen multiple times). Removing the concurrent drains fixed this issue, but I didn't fully investigate why.

Fixes #15606, #15115
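The overlapping-drain part of the fix can be sketched roughly as follows. All names here are hypothetical, not the SDK's actual internals: the idea is simply to track whether a drain is already in flight and refuse to start a second one, mirroring the missing `finalAction` check described above.

```javascript
// Hypothetical sketch of guarding against concurrent drains on a rhea-style
// link ({ drain, addCredit(n) }). The bug described above was the absence of
// the `this.draining` check, which let a second drain start while the first
// one's flow frame was still outstanding.
class DrainGuard {
  constructor(link) {
    this.link = link;
    this.draining = false;
  }

  // onDrained: a promise-returning callback that resolves once the peer
  // acknowledges the drain (e.g. via rhea's receiver_drained event).
  async drain(onDrained) {
    if (this.draining) {
      return false; // a drain is already in flight; don't start another
    }
    this.draining = true;
    try {
      this.link.drain = true;
      this.link.addCredit(1); // sends the flow frame that carries drain=true
      await onDrained();
      return true;
    } finally {
      this.link.drain = false;
      this.draining = false;
    }
  }
}
```

Note that `addCredit(1)` here is only the trigger for the flow frame; per the commit message above, that extra credit is also what could let one unrequested message slip through, which is why the fix stops discarding such messages and logs instead.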
Hi @enoeden, a fix for this has been released as part of https://www.npmjs.com/package/@azure/service-bus/v/7.3.0
Describe the bug
Similar to the #12711 test scenario, we found messages lost (locked) when calling receiveMessages on a queue receiver in peekLock mode, and the wording "Received transfer when credit was 0" appears around the loss point, which seems related.
To Reproduce
Steps to reproduce the behavior:
package.json
ext-store.js
testqueue.js
sender.js
We have a thin wrapper around the SDK for multi-platform support, which we think is simple enough.
In the test case, we send 2000 messages via sender.js and then call testqueue.js to receive them. The queue is configured with a 2-minute lock duration and a max delivery count of 1 to make it easier to reproduce.
In our test case we found:
cat 1623131203.new.log | grep -e === -e sum
It seems 2 messages were lost in this round; we mark every message from 0 to 2000, and set DEBUG=* for a detailed log.
After a brief log analysis:
messages 189 and 190 are lost from the application's view
we can still find the raw info for message 189 in the SDK's detailed log
there is a "Received transfer when credit was 0" message before message 191 is received
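The gap analysis above (messages numbered 0 to 1999, two found missing) can be sketched as a small check over the received IDs. `findMissing` is a hypothetical helper for illustration, not part of the test scripts:

```javascript
// Hypothetical helper: given the IDs actually received and the total count
// sent, report which sequence numbers never arrived.
function findMissing(receivedIds, total) {
  const seen = new Set(receivedIds);
  const missing = [];
  for (let i = 0; i < total; i++) {
    if (!seen.has(i)) {
      missing.push(i);
    }
  }
  return missing;
}

// Example: findMissing([0, 1, 3], 4) returns [2]
```

In the reproduction above, running a check like this over the 2000 numbered messages is what surfaced 189 and 190 as lost.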
Expected behavior
We expect no message loss (lock) in such a case.
Additional context
1623131203.new.zip