Load gen segmentation fault fix #173

Merged
merged 8 commits into from
Dec 21, 2019

Member

@dennisafa dennisafa commented Dec 17, 2019

We didn't call nf_setup before onvm_nflib_run, so segfault chaos ensued. See #164

Summary:

Adding an nf_setup call before the onvm_nflib_run call fixes the segfault. To test, run the manager with ./go.sh 0,1,2,3 3 0xF0 -a 0x7f000000000 -s stdout and load_generator with ./go.sh 1 -d 1.
The output should look like this:
[Screenshot: Screen Shot 2019-12-16 at 10 09 13 PM]
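
The ordering problem can be illustrated with a minimal, self-contained sketch. It does not use the real openNetVM API: mock_ctx, mock_setup, mock_run, and mock_teardown are hypothetical stand-ins for the NF context, nf_setup, and onvm_nflib_run, and the assumption that the run step reads state prepared by setup is inferred from the description above, not taken from the actual diff.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for per-NF state; not an openNetVM type. */
struct mock_ctx {
        int *pkt_budget; /* allocated by mock_setup(), read by mock_run() */
};

/* Stand-in for nf_setup: prepares the state the run step depends on. */
static int
mock_setup(struct mock_ctx *ctx) {
        ctx->pkt_budget = malloc(sizeof(*ctx->pkt_budget));
        if (ctx->pkt_budget == NULL)
                return -1;
        *ctx->pkt_budget = 32;
        return 0;
}

/* Stand-in for onvm_nflib_run. Without a prior mock_setup() call the
 * pointer is still NULL; reading through it is where a segfault would
 * occur. The mock returns -1 instead so the failure can be observed. */
static int
mock_run(const struct mock_ctx *ctx) {
        if (ctx->pkt_budget == NULL)
                return -1;
        return *ctx->pkt_budget;
}

/* Releases what mock_setup() allocated. */
static void
mock_teardown(struct mock_ctx *ctx) {
        free(ctx->pkt_budget);
        ctx->pkt_budget = NULL;
}
```

With a zero-initialized struct mock_ctx, calling mock_run first returns -1, while calling mock_setup and then mock_run returns the prepared value. That mirrors the before and after of this change under the stated assumptions.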

Usage:

This PR includes
Resolves issues X
Breaking API changes
Internal API changes
Usability improvements
Bug fixes
New functionality
New NF/onvm_mgr args
Changes to starting NFs
Dependency updates
Web stats updates

Merging notes:

  • Dependencies: None

TODO before merging :

  • PR is ready for review

Test Plan:

Run the manager and NF as described above

Review:

@kevindweb tiny change, just need a sanity check

@onvm

onvm commented Dec 17, 2019

In response to PR creation

CI Message

Your results will arrive shortly

@dennisafa dennisafa changed the title Load gen bug Load gen segmentation fault fix Dec 17, 2019

@onvm onvm left a comment

In response to PR creation

CI Message

Run successful see results:
✔️ PR submitted to develop branch
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
❌ Linter Failed (please fix style errors)

[Results from nimbnode17]

  • Median TX pps for Pktgen: 10746272
    Performance rating - 107.46% (compared to 10000000 average)

  • Median TX pps for Speed Tester: 40098629
    Performance rating - 100.25% (compared to 40000000 average)

Linter Output

examples/load_generator/load_generator.c:368: Line ends in whitespace. Consider deleting these extra spaces. [whitespace/end_of_line] [4]
examples/load_generator/load_generator.c:369: Line ends in whitespace. Consider deleting these extra spaces. [whitespace/end_of_line] [4]
Total errors found: 2

@kevindweb
Contributor

@onvm will you play nice?

@kevindweb
Contributor

@onvm maybe with the config?

@onvm

onvm commented Dec 17, 2019

@onvm maybe with the config?

CI Message

Your results will arrive shortly

@kevindweb
Contributor

@onvm again

@onvm

onvm commented Dec 17, 2019

@onvm again

CI Message

Your results will arrive shortly

@kevindweb
Contributor

@onvm ubuntu 18! nimbnode1 let's go!

@onvm

onvm commented Dec 17, 2019

@onvm ubuntu 18! nimbnode1 let's go!

CI Message

Your results will arrive shortly

@onvm

onvm commented Dec 17, 2019

@onvm ubuntu 18! nimbnode1 let's go!

CI Message

Error: ERROR: Failed to copy ONVM files to nimbnode17

@onvm

onvm commented Dec 17, 2019

pasta

CI Message

Your results will arrive shortly

@onvm

onvm commented Dec 17, 2019

pasta 2

CI Message

Your results will arrive shortly

@onvm

onvm commented Dec 17, 2019

pasta 2

CI Message

Error: ERROR: Failed to fetch results from nimbnode17

@kevindweb
Contributor

@onvm please grab the files this time

@onvm

onvm commented Dec 17, 2019

@onvm please grab the files this time

CI Message

Your results will arrive shortly

@onvm

onvm commented Dec 17, 2019

@onvm please grab the files this time

CI Message

Error: ERROR: Script failed on nimbnode17

@kevindweb
Contributor

@onvm can you be kind

@onvm

onvm commented Dec 17, 2019

@onvm can you be kind

CI Message

Your results will arrive shortly


@onvm onvm left a comment

@onvm can you be kind

CI Message

Run successful see results:
✔️ PR submitted to develop branch
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
✔️ Linter passed

[Results from nimbnode17]

  • Median TX pps for Pktgen: 10746272
    Performance rating - 107.46% (compared to 10000000 average)

  • Median TX pps for Speed Tester: 40125988
    Performance rating - 100.31% (compared to 40000000 average)

Contributor

@kevindweb kevindweb left a comment

@dennisafa Thanks for fixing this! Here's a screenshot of my runs. I ran load_generator and, to test more than CI does, also ran two speed_testers talking to each other. I started and stopped them in random orders, which worked fine, so cleanup works. The fix seems to work well, and it makes sense that we just forgot the context setup from a few months ago.
[Screenshot: Screen Shot 2019-12-20 at 5 56 23 PM]

@koolzz
Member

koolzz commented Dec 21, 2019

@dennisafa good find. This seems easily detectable if we set some flags during the init process. Can you create a task for that, or submit a PR if you have time?
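
One way to read the suggestion above is to record a flag when setup completes and have the run entry point refuse to start without it. The sketch below is only an assumption about how such a check could look. It is not openNetVM code, and demo_nf, demo_nf_setup, demo_nf_run, and DEMO_FLAG_SETUP_DONE are hypothetical names.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical flag bit recording that setup has completed. */
#define DEMO_FLAG_SETUP_DONE (1u << 0)

/* Hypothetical NF handle; not an openNetVM type. */
struct demo_nf {
        uint32_t init_flags;
};

/* Performs NF-specific setup (elided here) and records that it ran. */
static void
demo_nf_setup(struct demo_nf *nf) {
        nf->init_flags |= DEMO_FLAG_SETUP_DONE;
}

/* Refuses to start when setup was skipped, turning a would-be crash
 * into an explicit error. Returns 0 on success, -1 on a missing setup. */
static int
demo_nf_run(const struct demo_nf *nf) {
        if ((nf->init_flags & DEMO_FLAG_SETUP_DONE) == 0) {
                fprintf(stderr, "demo_nf_run: setup was not called before run\n");
                return -1;
        }
        return 0; /* the packet loop would start here */
}
```

Calling demo_nf_run on a zero-initialized struct demo_nf returns -1 with a diagnostic on stderr; after demo_nf_setup it returns 0. A bitmask leaves room for further init stages to be tracked the same way.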

@koolzz koolzz merged commit dd4cb21 into sdnfv:develop Dec 21, 2019
@dennisafa
Member Author

thanks @kevindweb and @koolzz! yes, I will look into modifying init to check for this.

@dennisafa dennisafa deleted the load_gen_bug branch December 21, 2019 04:17
@Yongeverhuang

May I describe a problem? With the manager started as ./go.sh 0,1,2,3 3 0xF0 -s stdout, the load_generator runs well. But when I test in shared-core mode with ./go.sh 0,1,2,3 3 0xF0 -s stdout -c, the load_generator does not run. This is on OpenNetVM v19.07.

@dennisafa
Member Author

@Yongeverhuang Thanks for your comment. The only way I could replicate this issue was by having no active igb_uio devices loaded. If an active port is not detected, then mempools are not properly set up leading to that issue. Please verify that the manager + load_generator look like this when active:
[Screenshot: Screen Shot 2019-12-23 at 11 17 30 AM]

As a side note, a warning message in the NF should pop up indicating that no port is present.

@Yongeverhuang

Yongeverhuang commented Dec 24, 2019

@dennisafa Thanks for the help. Let me give a full description. In non-shared-core mode, with the manager started as ./go.sh 0,1,2,3 0x1 0xF0 -s stdout, the load_generator works well, like this:
[Screenshot: Untitled picture]
But in shared-core mode it doesn't work, and no warning message pops up. The load_generator does not run, and the screen prints this output only after I stop it with Ctrl-C:
[Screenshot: Untitled picture 2]
Other NFs, such as speed_tester, appear to run as usual in shared-core mode:
[Screenshot: Untitled picture 3]

@dennisafa
Member Author

Thank you for the screenshots. What branch of openNetVM are you running? Also, what command are you running to initiate the manager in shared core mode?
It also seems like you are using the -t argument. This will shut down the NF after a period of time.

@Yongeverhuang

Yongeverhuang commented Dec 25, 2019

@dennisafa
Thanks! In my test, I use the latest OpenNetVM.
In non-shared mode:
manager: ./go.sh 0,1,2,3 0x1 0xF0 -s stdout
load_generator: ./start_nf.sh load_generator 1 -d 1 -t 400000
In shared mode:
manager: ./go.sh 0,1,2,3 0x1 0xF0 -s stdout -c
load_generator: ./start_nf.sh load_generator 1 -d 1 -t 400000

The -t argument just defines how many packets are sent. In non-shared mode the NF does not shut down on its own; it keeps running until I stop it with Ctrl-C.
But in shared mode the load_generator does not run and nothing is printed, like this:
[Screenshot: Capture]
In the manager, the load_generator NF does appear:
[Screenshot: Capture 1]

I also tried these commands: manager with ./go.sh 0,1,2,3 0x1 0xF0 -a 0x7f000000000 -s stdout -c and load_generator with ./go.sh 1 -d 1. It does not run that way either.

@dennisafa
Member Author

Thanks! What branch are you using? We recently fixed the load_generator network function, but have not yet integrated the changes into the latest version of ONVM (master).

@Yongeverhuang

Thanks! I have solved the problem. The load_generator network function now works well.

@twood02 twood02 added this to the ONVM 20.05 milestone Apr 9, 2020
@twood02 twood02 added NeedsReleaseNote Needs updated release note info and removed NeedsReleaseNote Needs updated release note info labels May 22, 2020
6 participants