NF core reassignment on termination #87

Merged
merged 14 commits into from
Aug 3, 2019

Member

@koolzz koolzz commented Mar 21, 2019

Adds functionality to onvm_mgr to reassign cores on NF shutdown.

Summary:

Reallocates an NF to a different core when another NF shuts down and a better core allocation becomes possible.
This is part of a series of autoscaling and core-management improvements for ONVM; I am working with @ratnadeepb on these.

Usage:
Run onvm_mgr with 2 cores for NFs, e.g.:
./go.sh 0,1,2 0 0x60
Run basic monitor (or any other NF):
./go.sh 2
Run basic monitor in shared mode (or any other NF):
./go.sh -l 0 -- -r 1 -s --
Run another basic monitor in shared mode (or any other NF):
./go.sh -l 0 -- -r 1 -s --
Then Ctrl-C the first NF (the one not running in shared mode).
The second NF should now be reallocated to the core that the first NF was running on.
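
The selection step behind this behavior might look roughly like the sketch below. It is not the code from this PR: the `sketch_` structs, their fields, and the size limits are assumptions made for illustration, and only the overall idea (find the most used core, then pick an NF on it to move to the freed core) follows the description above.

```c
#include <stdint.h>

#define SKETCH_MAX_CORES 8 /* hypothetical limit, not an ONVM constant */
#define SKETCH_MAX_NFS 16  /* hypothetical limit, not an ONVM constant */

/* Hypothetical per-core bookkeeping; the real struct core_status may differ. */
struct sketch_core_status {
        int enabled;  /* core is available to NFs */
        int nf_count; /* number of NFs currently pinned to this core */
};

/* Hypothetical per-NF bookkeeping: which core each NF instance runs on. */
struct sketch_nf {
        int active;
        uint16_t core;
};

/*
 * Returns the id of an NF that should move to candidate_core,
 * or -1 if no move would improve the allocation.
 */
int
sketch_find_nf_to_reassign(uint16_t candidate_core, const struct sketch_core_status *cores,
                           const struct sketch_nf *nfs) {
        int i, candidate_nf_id = -1, most_used_core = -1, max_nfs_per_core = 0;

        /* Find the enabled core that currently hosts the most NFs. */
        for (i = 0; i < SKETCH_MAX_CORES; i++) {
                if (cores[i].enabled && cores[i].nf_count > max_nfs_per_core) {
                        max_nfs_per_core = cores[i].nf_count;
                        most_used_core = i;
                }
        }

        /* A move only helps if the busiest core holds at least two more NFs than the candidate. */
        if (most_used_core < 0 || most_used_core == candidate_core ||
            max_nfs_per_core <= cores[candidate_core].nf_count + 1)
                return -1;

        /* Pick the first active NF found on the busiest core. */
        for (i = 0; i < SKETCH_MAX_NFS; i++) {
                if (nfs[i].active && nfs[i].core == most_used_core) {
                        candidate_nf_id = i;
                        break;
                }
        }
        return candidate_nf_id;
}
```

In the usage scenario above, the freed core holds no NFs and the shared core holds two, so one of the shared NFs would be selected.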

This PR includes
Resolves issues
Breaking API changes
Internal API changes
Usability improvements
Bug fixes
New functionality 👍
New NF/onvm_mgr args
Changes to starting NFs
Dependency updates
Web stats updates

Merging notes:

  • Dependencies: None

TODO before merging :

  • PR is ready for review

Test Plan:

Figure out if this can break anything.

Review:

Review checklist:

  • Sanity checks, assigned to @dennisafa @rskennedy

    • Run make in /onvm and /examples directories
    • Test the new functionality
  • Code style, assigned to @dennisafa @rskennedy

    • Run linter
    • Check everything style related
  • Code design, assigned to @dennisafa @rskennedy

    • Check for memory leaks
    • Check that code is reusable
    • Code is clean, functions are concise
    • Verify that edge cases are properly handled
  • Performance, assigned to @dennisafa @rskennedy

    • Run Speed Tester NF, report performance.
    • Run pktgen, report performance (not really required)
  • Documentation, assigned to @dennisafa @rskennedy

    • Check if the new changes are well documented, in both code and READMEs

@onvm

onvm commented Mar 21, 2019

In response to PR creation

CI Message

Your results will arrive shortly

@onvm

This comment has been minimized.

@koolzz
Member Author

koolzz commented Mar 21, 2019

PR review assigned to @dennisafa @rskennedy

@koolzz
Member Author

koolzz commented Mar 23, 2019

@onvm hello my old friend

@onvm

onvm commented Mar 23, 2019

@onvm hello my old friend

CI Message

Your results will arrive shortly

@onvm

onvm commented Mar 23, 2019

@onvm hello my old friend

CI Message

Run successful see results:
[Results from nimbnode30]
Median TX pps for Speed Tester: 34822438

Linter Passed

Member

@dennisafa dennisafa left a comment

just some style nits

@@ -585,7 +585,14 @@ onvm_nflib_handle_msg(struct onvm_nf_msg *msg, __attribute__((unused)) struct on
RTE_LOG(INFO, APP, "Received scale message...\n");
onvm_nflib_scale((struct onvm_nf_scale_info *)msg->msg_data);
break;
case MSG_NOOP:
case MSG_CHANGE_CORE:
RTE_LOG(INFO, APP, "Recieved relocation message...\n");
Member

typo: Received

onvm_threading_find_nf_to_reassign_core(uint16_t candidate_core, struct core_status *cores) {
int i;
int candidate_nf_id, most_used_core;
int max_nfs_per_core;
Member

maybe combine all these into one?
int i, candidate_nf_id, most_used_core, max_nfs_per_core;

int
onvm_threading_find_nf_to_reassign_core(uint16_t candidate_core, struct core_status *cores);


Member

style nit: remove line

Member

@dennisafa dennisafa left a comment

  • Sanity checks

    • Running make in onvm and examples works properly. The core is reassigned when the NF exits, as the test plan describes.
  • Code style

    • Minor style nits
  • Code design

    • Variables are understandably named, design is clean and concise. No recommended design changes.
  • Performance, assigned to @dennisafa @rskennedy

    • Speed Tester runs properly at an average speed of 56M pps.
  • Documentation

    • onvm_threading.h is documented properly.

* Output : an error code
*
*/
inline int
Member

95: Line ends in whitespace.


inline int onvm_nf_relocate_nf(uint16_t dest, uint16_t new_core) {
uint16_t *msg_data;

Member

314: Line ends in whitespace.

@@ -585,7 +585,14 @@ onvm_nflib_handle_msg(struct onvm_nf_msg *msg, __attribute__((unused)) struct on
RTE_LOG(INFO, APP, "Received scale message...\n");
onvm_nflib_scale((struct onvm_nf_scale_info *)msg->msg_data);
break;
case MSG_NOOP:
case MSG_CHANGE_CORE:
RTE_LOG(INFO, APP, "Received relocation message...\n");
Member

589: Line ends in whitespace.

case MSG_NOOP:
case MSG_CHANGE_CORE:
RTE_LOG(INFO, APP, "Received relocation message...\n");
RTE_LOG(INFO, APP, "Moving NF to core %d\n", *(uint16_t *)msg->msg_data);
Member

590: Line ends in whitespace.
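
The relocation message reviewed in the snippets above could be modeled roughly as follows. This is a self-contained sketch under stated assumptions, not the ONVM implementation: the `sketch_` types and functions are hypothetical, and `malloc`/`free` stand in for whatever message allocation the manager actually uses. The only detail taken from the diff is that the message data carries the destination core as a `uint16_t`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical message types; the real values live in onvm_msg_common.h. */
enum sketch_msg_type { SKETCH_MSG_NOOP, SKETCH_MSG_CHANGE_CORE };

/* Hypothetical stand-in for struct onvm_nf_msg. */
struct sketch_nf_msg {
        enum sketch_msg_type msg_type;
        void *msg_data;
};

/*
 * Manager side: fill in a relocation message carrying the destination core.
 * Returns 0 on success, -1 if the payload could not be allocated.
 */
int
sketch_build_relocate_msg(struct sketch_nf_msg *msg, uint16_t new_core) {
        uint16_t *msg_data = malloc(sizeof(*msg_data));

        if (msg_data == NULL)
                return -1;
        *msg_data = new_core;
        msg->msg_type = SKETCH_MSG_CHANGE_CORE;
        msg->msg_data = msg_data;
        return 0;
}

/*
 * NF side: returns the core the NF is asked to move to,
 * or -1 if the message is not a relocation request.
 */
int
sketch_handle_msg(struct sketch_nf_msg *msg) {
        int new_core = -1;

        switch (msg->msg_type) {
        case SKETCH_MSG_CHANGE_CORE:
                /* The payload is a single uint16_t holding the destination core. */
                new_core = *(uint16_t *)msg->msg_data;
                free(msg->msg_data);
                msg->msg_data = NULL;
                break;
        case SKETCH_MSG_NOOP:
        default:
                break;
        }
        return new_core;
}
```

How the NF then re-pins its thread to the new core is outside the scope of this sketch.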

@koolzz
Member Author

koolzz commented Mar 28, 2019

@dennisafa implemented requested changes, approve if everything else is good.

@koolzz
Member Author

koolzz commented Mar 28, 2019

@rskennedy give this a review when you can

dennisafa previously approved these changes Mar 28, 2019
@koolzz
Member Author

koolzz commented Apr 1, 2019

The more I thought about this, the more I think we should create a macro for it. This is an optimization that might not always be wanted.

…core_reassignment

Conflicts:
	onvm/onvm_mgr/onvm_nf.c
	onvm/onvm_nflib/onvm_msg_common.h
@koolzz
Member Author

koolzz commented Apr 1, 2019

@onvm updated.

@onvm

onvm commented Apr 1, 2019

@onvm updated.

CI Message

Your results will arrive shortly

@onvm

onvm commented Apr 1, 2019

@onvm updated.

CI Message

Run successful see results:
[Results from nimbnode30]
Median TX pps for Speed Tester: 35204966

Linter Passed

@koolzz
Member Author

koolzz commented Apr 1, 2019

@dennisafa does introducing the macro make sense?

@dennisafa
Member

dennisafa commented Apr 1, 2019

@dennisafa does introducing the macro make sense?

Yea, I'm a fan of the macros. It makes sense to be able to enable or disable them (like ONVM_NF_SHUTDOWN_CORE_REASSIGNMENT) based on what your purposes may be.

Edit: I'll give it another test for sanity check
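
A compile-time switch of the kind discussed here could be sketched as below. The macro name `ONVM_NF_SHUTDOWN_CORE_REASSIGNMENT` comes from this thread; where it is defined (a header versus a build flag) and the `sketch_on_nf_shutdown` function are assumptions made for illustration.

```c
#include <stdio.h>

/* Comment out this define to compile the reassignment step out entirely. */
#define ONVM_NF_SHUTDOWN_CORE_REASSIGNMENT

/*
 * Hypothetical stand-in for the manager's NF cleanup path.
 * Returns 1 if a core reassignment was attempted, 0 if the feature is disabled.
 */
int
sketch_on_nf_shutdown(unsigned freed_core) {
#ifdef ONVM_NF_SHUTDOWN_CORE_REASSIGNMENT
        /* Feature enabled: look for an NF that would benefit from the freed core. */
        printf("Core %u freed, checking for an NF to reassign\n", freed_core);
        return 1;
#else
        /* Feature disabled: leave the remaining NFs where they are. */
        (void)freed_core;
        return 0;
#endif
}
```

Guarding the call site rather than the whole feature keeps the default shutdown path unchanged when the macro is not defined.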

@koolzz koolzz mentioned this pull request Apr 2, 2019
15 tasks
@koolzz
Member Author

koolzz commented May 11, 2019

Will update with the new config approach when shared cpu changes are merged

koolzz added 2 commits May 24, 2019 11:28
…core_reassignment

Conflicts:
	onvm/onvm_mgr/onvm_nf.c
	onvm/onvm_nflib/onvm_msg_common.h
	onvm/onvm_nflib/onvm_nflib.c
	onvm/onvm_nflib/onvm_threading.h
@kevindweb
Contributor

@onvm should be put in queue

@onvm

onvm commented Jun 24, 2019

@onvm should be put in queue

CI Message

Another CI run in progress, adding request to the end of the list

@kevindweb
Contributor

@onvm hopefully this doesn't run

@onvm

onvm commented Jun 24, 2019

@onvm hopefully this doesn't run

CI Message

Duplicate request already waiting, ignoring message

@onvm

onvm commented Jun 24, 2019

@onvm should be put in queue

CI Message

Your results will arrive shortly


@onvm onvm left a comment

@onvm should be put in queue

CI Message

Run successful see results:
✔️ PR submitted to develop branch
✔️ Speed tester performance check passed
✔️ Linter passed

[Results from nimbnode30]

  • Median TX pps for Speed Tester: 39265985
  • Performance rating - 112.19% (compared to 35000000 average)

@kevindweb
Contributor

@onvm not sure why nfd isn't working

@onvm

onvm commented Jul 26, 2019

@onvm not sure why nfd isn't working

CI Message

Your results will arrive shortly

@onvm

onvm commented Jul 26, 2019

@onvm not sure why nfd isn't working

CI Message

Error: ERROR: Failed to post results to GitHub

@kevindweb
Contributor

@onvm this should be better now

@onvm

onvm commented Jul 26, 2019

@onvm this should be better now

CI Message

Your results will arrive shortly

@onvm

onvm commented Jul 26, 2019

@onvm this should be better now

CI Message

Error: ERROR: Failed to post results to GitHub

@kevindweb
Contributor

@onvm test

@kevindweb
Contributor

@onvm now really test

@onvm

onvm commented Jul 26, 2019

@onvm now really test

CI Message

Your results will arrive shortly

@onvm

onvm commented Jul 26, 2019

@onvm now really test

CI Message

Error: ERROR: Failed to fetch results from nimbnode17

@kevindweb
Contributor

@onvm please send 0

@onvm

onvm commented Jul 27, 2019

@onvm please send 0

CI Message

Your results will arrive shortly


@onvm onvm left a comment

@onvm please send 0

CI Message

Run successful see results:
✔️ PR submitted to develop branch
❌ PR drops Pktgen performance below minimum requirement
✔️ Speed Test performance check passed
✔️ mTCP performance check passed
✔️ Linter passed

[Results from nimbnode17]

  • Median TX pps for Speed Tester: 41231422
    Performance rating - 103.08% (compared to 40000000 average)

  • Median TX pps for Pktgen: 6420024
    Performance rating - 64.20% (compared to 10000000 average)

  • Time (ms) per request for mTCP and Apache Benchmark: 0.235000
    Performance rating - 102.17% (compared to 0.230000 average)

@koolzz koolzz added this to the ONVM v19.07 milestone Jul 31, 2019
@koolzz
Member Author

koolzz commented Aug 2, 2019

@onvm
Updated with the latest version

@onvm

onvm commented Aug 2, 2019

Testing

CI Message

Your results will arrive shortly


@onvm onvm left a comment

Let us see

CI Message

Run successful see results:
✔️ PR submitted to develop branch
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
❌ PR drops mTCP performance below minimum requirement
✔️ Linter passed

[Results from nimbnode17]

  • Median TX pps for Speed Tester: 40206604
    Performance rating - 100.52% (compared to 40000000 average)

  • Median TX pps for Pktgen: 10184821
    Performance rating - 101.85% (compared to 10000000 average)

  • Time (ms) per request for mTCP and Apache Benchmark: 0.075000
    Performance rating - 32.61% (compared to 0.230000 average)

@kevindweb
Contributor

I'll fix that mTCP thing (not a bug, Apache Benchmark just went faster); I'm just trying to make sure pktgen is working
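
The performance ratings in the CI messages above appear to be the measured value divided by the reference average, times 100. This is inferred from the reported numbers, not taken from the CI scripts, and `sketch_performance_rating` is a hypothetical name. Under that assumption, a lower time per request for mTCP produces a lower rating even though it is an improvement, which is consistent with the comment above.

```c
/*
 * Rating as a percentage of the reference average.
 * The formula is inferred from the numbers reported in this thread.
 */
double
sketch_performance_rating(double measured, double reference_average) {
        return measured / reference_average * 100.0;
}
```

For example, 6420024 pps against a 10000000 average gives roughly 64.20, and 0.075 ms against a 0.230 ms average gives roughly 32.61, matching the reported ratings.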

@koolzz koolzz merged commit e217d71 into sdnfv:develop Aug 3, 2019
@onvm

onvm commented Aug 6, 2019

Testing

CI Message

Error: ERROR: Failed to copy ONVM files to nimbnode17

@onvm

onvm commented Aug 6, 2019

Testing

CI Message

Error: ERROR: Script failed on nimbnode17

5 participants