Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[neighorch] Replace hasNextHop() asserts with .at() #1894

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

abanu-ms
Copy link
Contributor

@abanu-ms abanu-ms commented Sep 1, 2021

  • Replaced occurences of assert(hasNextHop()) with .at() calls to avoid
    bugs in release

Signed-off-by: Alexandru Banu v-albanu@microsoft.com

What I did
I've changed assert(hasNextHop()) calls with map::at() calls.

Why I did it
When the Debian package is being built, the asserts will be disabled and map::[] calls will create a new element in the map if it doesn't already exist. To prevent such bugs, it's better if we use map::at() calls which will throw an exception, providing the assert() benefits in production releases.

How I verified it
Enforced such a bug in the code, making sure the UTs error out because of the thrown exception, then removed that piece of code.

Details if related

* Replaced occurences of assert(hasNextHop()) with .at() calls to avoid
bugs in release

Signed-off-by: Alexandru Banu <v-albanu@microsoft.com>
@abanu-ms abanu-ms requested a review from prsunny as a code owner September 1, 2021 10:14
@prsunny
Copy link
Collaborator

prsunny commented Sep 3, 2021

How to simulate this issue?. We don't want to crash the process if the entry does not exist. Agree that the assert is a no-op, but if you prefer, you check for hasnexthop and return? What do you think?

@abanu-ms
Copy link
Contributor Author

abanu-ms commented Sep 3, 2021

That would work too, along with an error log at least. The way I see it is that as long as there isn't an external factor that would cause the code to misbehave, I'd rather have the daemon crashing as it means there's some faulty code, as otherwise it may hide the problem.

That's pretty much how I came across this no-op assert issue - I was writing some new code and while testing the code it misbehaved without any obvious reasons. It took a while to figure out that my code was doing something wrong, which would have been easier to diagnose if it crashed straight away.

If you do not want the daemon to that non-fault tolerant, I can change the code as you suggest, along with an error log for easier diagnose.

@prsunny
Copy link
Collaborator

prsunny commented Sep 3, 2021

That would work too, along with an error log at least. The way I see it is that as long as there isn't an external factor that would cause the code to misbehave, I'd rather have the daemon crashing as it means there's some faulty code, as otherwise it may hide the problem.

That's pretty much how I came across this no-op assert issue - I was writing some new code and while testing the code it misbehaved without any obvious reasons. It took a while to figure out that my code was doing something wrong, which would have been easier to diagnose if it crashed straight away.

If you do not want the daemon to that non-fault tolerant, I can change the code as you suggest, along with an error log for easier diagnose.

This is a legacy code and has been working so far. I'm still not sure how you simulated in the normal flow. Prefer to keep it as it is. If you've a sequence to create the failure, let me know.

@abanu-ms
Copy link
Contributor Author

abanu-ms commented Sep 6, 2021

Didn't encounter it in a normal flow. I encountered it when writing some new code (for MPLS) where I was adding/removing new next hops but there was a bug in the code which was trying to remove MPLS next hops that weren't added yet.

I'll change the code as you suggested.

@abanu-ms
Copy link
Contributor Author

abanu-ms commented Sep 6, 2021

@prsunny Can you please check the updates and let me know if it's okay with you?

@abanu-ms
Copy link
Contributor Author

@prsunny Could you re-review please?

@TACappleman
Copy link
Contributor

@prsunny could you have another look at this, following Alex's changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants