Reuse NF instance IDs #92
Conversation
CI Message: Your results will arrive shortly.
CI Message: Run successful, see results: Linter Passed.
Tested by setting MAX_NFS to 4 and running multiple NFs + terminating. NF instance IDs wrap around as expected.
Edit: Bug occurs when attempting to terminate the first NF initialized.
- Sanity checks: NF and onvm compile and run properly.
- Code design: Code is clean and concise. The functionality of assigning previous instance IDs is easily understandable from the code.
- Documentation: Documentation within the code is concise and understandable.
This is indeed a weird issue; I'll check later. Any ideas why this happens?
I am looking into the code - I will keep you updated if I find something funky.
Edit: I've been trying to replicate this bug again, but I'm having trouble. I followed @kevindweb's steps and verified that the ID we are trying to kill is correct by inserting a print statement. However, I can easily kill the first NF running, and I am confirming that the new reuse-ID function is being used. Additionally, I am making sure to test with MAX_NFS set to 4 right away, instead of setting it to 2 first and then attempting to replicate. Exact steps:
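For illustration, a check along the lines of the print statement mentioned above could look like the following minimal sketch; the struct and field names here are assumptions for the example, not the actual onvm code.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical stand-in for the NF bookkeeping struct; the real onvm
 * field names may differ. */
struct nf_info_stub {
        uint16_t instance_id;
        uint16_t service_id;
};

/* Print the instance ID we are about to shut down, so we can confirm
 * the manager is targeting the slot we expect before terminating it. */
void
debug_print_kill_target(const struct nf_info_stub *nf) {
        printf("Terminating NF: instance_id=%u, service_id=%u\n",
               nf->instance_id, nf->service_id);
}
```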
Agree with @dennisafa, I can't seem to reproduce the behavior. @kevindweb, are there any key steps we're missing here?
Dennis told me he was able to reproduce it the same way I explained. First, I changed MAX_NFS to 2 and started 2 speed_testers; this created an error with the first initialized speed_tester. Then I did the same thing with MAX_NFS=4. I don't know exactly why this happens, though.
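As a sketch of that setup: MAX_NFS is assumed here to be a compile-time constant in an onvm header (the exact header and its default value may differ). Shrinking it means the second speed_tester already sits in the last usable slot, so the reuse/wrap-around path is hit almost immediately.

```c
/* Reproduction setup (assumed location: an onvm_common-style header).
 * With only two slots, terminating and restarting NFs exercises the
 * instance-ID reuse path right away. */
#define MAX_NFS 2
```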
Did you do a clean rebuild before testing?
I just tested it the same way with cleaning and making /examples and /onvm again |
Not sure why I can't reproduce; I'll take a look at the lab.
Strangely, this issue pops up only every once in a while. In a few cases, it'll occur on the second or third NF initialized. Could this be an issue unrelated to this specific PR? I would understand if the instance_id variable for the first NF was somehow corrupted when getting the error code (because then we couldn't clean up properly), but it's valid every time I'm testing it.
Finally, my first successful test of this PR!! I stress tested this a lot because I was being cautious. For a few minutes I consistently added and dropped new speed_testers, and I was correctly not allowed to start more than MAX_NFS. The instance IDs wrapped around perfectly, even after multiple iterations of stopping and starting NFs. Success A#1!
@dennisafa @kevindweb maybe you guys were onto something. I found a nasty bug regarding this PR. When running different NFs and reusing the old instance_id, they would segfault on pthread_join. After some gdb debugging and being very confused, I found the reason: we never cleaned up old function pointers in the
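A minimal sketch of the kind of cleanup being described, assuming a hypothetical slot layout; the real onvm struct lives in shared memory and its callback fields are almost certainly named differently.

```c
#include <string.h>
#include <stdint.h>

/* Hypothetical NF slot layout; field names are illustrative only. */
struct nf_slot_stub {
        uint16_t instance_id;
        void (*setup_fn)(void *ctx);             /* callbacks registered by the old NF */
        int (*pkt_handler_fn)(void *pkt, void *ctx);
        void *thread_ctx;
};

/* When an NF terminates, scrub its slot so a later NF that reuses this
 * instance ID does not inherit stale function pointers (the kind of
 * leftover state that could make the reused NF crash around pthread_join). */
void
release_nf_slot(struct nf_slot_stub *slot) {
        uint16_t id = slot->instance_id;

        memset(slot, 0, sizeof(*slot));  /* clears setup_fn, pkt_handler_fn, thread_ctx */
        slot->instance_id = id;          /* keep the ID so it can be handed out again */
}
```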
@onvm go for it
CI Message: Your results will arrive shortly.
CI Message: Run successful, see results: Linter Failed
examples/aes_decrypt/aes.h:176: #endif line should be "#endif // AES_H" [build/header_guard] [5]
Instead of stopping when we reach MAX_NFS, wrap back to the initial instance ID starting value.
Summary:
Reuse instance IDs of old NFs that have terminated.
I initially implemented an inline function for the while loop so we wouldn't duplicate the code, but I revised this because I think the two loops with comments just look cleaner.
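A minimal sketch of the two-loop, wrap-around search described above, under assumed names (next_instance_id, nfs[], in_use, MAX_NFS) that stand in for the manager's actual state rather than this PR's exact code; in the real manager the free check would consult the shared NF table instead of a local array.

```c
#include <stdint.h>
#include <stdbool.h>

#define MAX_NFS 16                       /* table size; onvm's real default differs */

struct nf_stub { bool in_use; };         /* stand-in for a manager NF table entry */

static struct nf_stub nfs[MAX_NFS];
static uint16_t next_instance_id = 1;    /* initial starting value (assumed 1 here) */

/* Return the next free instance ID, wrapping back to the starting value
 * instead of stopping once the counter reaches MAX_NFS. Returns MAX_NFS
 * if every slot is occupied. */
uint16_t
next_free_instance_id(void) {
        uint16_t id;

        /* First loop: scan from the current position up to the end of the table. */
        for (id = next_instance_id; id < MAX_NFS; id++) {
                if (!nfs[id].in_use) {
                        next_instance_id = id + 1;
                        return id;
                }
        }

        /* Second loop: wrap around and reuse slots freed by terminated NFs. */
        for (id = 1; id < next_instance_id && id < MAX_NFS; id++) {
                if (!nfs[id].in_use) {
                        next_instance_id = id + 1;
                        return id;
                }
        }

        return MAX_NFS;  /* table full */
}
```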
Usage:
I tested by decreasing MAX_NFS to 4; it seems to work, but I haven't tested all the small details yet.
Merging notes:
TODO before merging:
Test Plan:
Try to break this.
Review:
TBA