-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix underflow possibility on ColumnFamilyDbStore #5975
Conversation
edge-util/src/Microsoft.Azure.Devices.Edge.Storage.RocksDb/ColumnFamilyDbStore.cs
Outdated
Show resolved
Hide resolved
@@ -103,5 +104,53 @@ public async Task MessageCountTest() | |||
Assert.Equal(0ul, await columnFamilyDbStore.Count()); | |||
} | |||
} | |||
|
|||
[Fact] | |||
public async Task MessageCountUnderflowTest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a separate test altogether
edge-util/test/Microsoft.Azure.Devices.Edge.Storage.RocksDb.Test/ColumnFamilyStoreTest.cs
Outdated
Show resolved
Hide resolved
Will fix styling and other stuff once method of fix is approved above :) |
@@ -75,7 +75,20 @@ public async Task Remove(byte[] key, CancellationToken cancellationToken) | |||
|
|||
Action operation = () => this.db.Remove(key, this.Handle); | |||
await operation.ExecuteUntilCancelled(cancellationToken); | |||
Interlocked.Decrement(ref this.count); | |||
// We don't want to wrap around the counter. | |||
var currentCount = Interlocked.Read(ref this.count); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You will need to lock the entire block - if 3 threads read a value (say 2) and all 3 call decrement you will end up with -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I am really trying to avoid the lock.... perhaps the better thing to do is just let it be long and as spoke about offline to make the Metrics count for whatever it really is and just make sure the metric count is checked to make sure we avoid this problem
edge-util/test/Microsoft.Azure.Devices.Edge.Storage.RocksDb.Test/ColumnFamilyStoreTest.cs
Outdated
Show resolved
Hide resolved
Keeping this open until I have result from running for several days |
* Fix underflow * Use compare exchange and make separate test * Use Max for count instead of trying to make atomic check
* Fix underflow * Use compare exchange and make separate test * Use Max for count instead of trying to make atomic check
* Fix underflow * Use compare exchange and make separate test * Use Max for count instead of trying to make atomic check
…6025) * Fix underflow * Use compare exchange and make separate test * Use Max for count instead of trying to make atomic check
* Fix underflow * Use compare exchange and make separate test * Use Max for count instead of trying to make atomic check
This fixes an issue where in certain conditions (EH has disparate info from Checkpointer and MessageStore and can happen when crash occurs before complete updates) it may be possible to remove from the store and go below 0.
Updated existing test to make sure underflow will be caught.
Azure IoT Edge PR checklist:
This checklist is used to make sure that common guidelines for a pull request are followed.
General Guidelines and Best Practices
Testing Guidelines
Draft PRs
Draft
mode if it is:Note: We use the kodiakhq bot to merge PRs once the necessary checks and approvals are in place. When it merges a PR, kodiakhq converts the PR title to the commit title, PR description to the commit description, and squashes all the commits in the PR to a single commit. The net effect is that entire PR becomes a single commit. Please follow the best practices mentioned here for the PR title and description