Memory #216
Conversation
CI Message: Your results will arrive shortly.
In response to PR creation
CI Message
Run successful, see results:
Test took: 5 minutes, 29 seconds
✔️ Pktgen performance check passed
❌ PR drops Speed Test performance below minimum requirement
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 40347308
  Performance rating: 96.07% (compared to 42000000 average)
@onvm Test please.
CI Message: Your results will arrive shortly.
@onvm Test please.
CI Message
Run successful, see results:
Test took: 5 minutes, 26 seconds
✔️ Pktgen performance check passed
❌ PR drops Speed Test performance below minimum requirement
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 40462067
  Performance rating: 96.34% (compared to 42000000 average)
@onvm Test again please.
CI Message: Your results will arrive shortly.
@onvm Test again please.
CI Message
Run successful, see results:
Test took: 5 minutes, 27 seconds
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 42008679
  Performance rating: 100.02% (compared to 42000000 average)
@onvm Test whether rte_calloc caused rate drop.
CI Message: Your results will arrive shortly.
@onvm Test whether rte_calloc caused rate drop.
CI Message
Run successful, see results:
Test took: 5 minutes, 25 seconds
✔️ Pktgen performance check passed
❌ PR drops Speed Test performance below minimum requirement
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 40410498
  Performance rating: 96.22% (compared to 42000000 average)
@onvm Test with first solution without rte_calloc please.
CI Message: Your results will arrive shortly.
@onvm Test with first solution without rte_calloc please.
CI Message
Run successful, see results:
Test took: 5 minutes, 28 seconds
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 41983113
  Performance rating: 99.96% (compared to 42000000 average)
@bdevierno1 This is awesome! What did CI not like about your first commit? I haven't seen it that slow in a while. Also, sorry I didn't see this before, but I don't think we should be changing …
Cool, that's fine. It took a few tries to figure out, but it really does not like rte_calloc in main.c. I'm going to test whether there is any impact on performance when I call the function in other areas.
@onvm Test whether removing rte_calloc on libc as well impacts performance.
CI Message: Your results will arrive shortly.
@onvm Test whether removing rte_calloc on libc as well impacts performance.
CI Message
Run successful, see results:
Test took: 5 minutes, 25 seconds
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 42220573
  Performance rating: 100.53% (compared to 42000000 average)
CI Message: Your results will arrive shortly.
CI Message: Error: Failed to post results to GitHub.
@onvm Please post results.
CI Message: Your results will arrive shortly.
CI Message: Error: Failed to post results to GitHub.
CI Message: Your results will arrive shortly.
CI Message
Run successful, see results:
Test took: 5 minutes, 23 seconds
✔️ Pktgen performance check passed
✔️ Speed Test performance check passed
[Results from nimbnode23]
- Median TX pps for Pktgen: 7721721
  Performance rating: 100.28% (compared to 7700000 average)
- Median TX pps for Speed Tester: 42266681
  Performance rating: 100.63% (compared to 42000000 average)
Okay, looks like using rte_zmalloc made a difference. Ready for review @catherinemeadows @dennisafa
onvm/onvm_mgr/main.c (Outdated)
tx_mgr[i]->nf_rx_bufs = rte_calloc(NULL, MAX_NFS, sizeof(struct packet_buf), RTE_CACHE_LINE_SIZE);
if (tx_mgr[i]->nf_rx_bufs == NULL) {
        RTE_LOG(ERR, APP, "Can't allocate packet_buf struct\n");
        onvm_main_free(tx_lcores, rx_lcores, tx_mgr, rx_mgr, wakeup_ctx);
I think that calling onvm_main_free after every check may be redundant; could we maybe set a flag and move that call out of the loop so that it's called only once? We should fail on any unsuccessful allocation anyway.
I think calling onvm_main_free followed by RTE_EXIT would be best? If not, I could also see using a flag or a goto label as another option.
@bdevierno1 you probably want to just set a flag and break out of the loop. Then just log that we couldn't allocate the structs required to continue.
^
I just made a change as per your suggestions. I ended up using a goto label, since a break would only exit the current for loop: if I used a break in the first loop, execution would still enter the next loop.
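For illustration, a minimal self-contained sketch of that goto-based cleanup pattern (the names setup, tx_bufs, and rx_bufs are hypothetical stand-ins; the real code allocates with rte_calloc and frees through onvm_main_free):

#include <stdio.h>
#include <stdlib.h>

#define N_TX 4
#define N_RX 4

static int
setup(void)
{
        int *tx_bufs[N_TX] = { NULL };
        int *rx_bufs[N_RX] = { NULL };
        int i;

        for (i = 0; i < N_TX; i++) {
                tx_bufs[i] = calloc(128, sizeof(int));
                if (tx_bufs[i] == NULL)
                        goto cleanup;   /* a break would only exit this loop */
        }
        for (i = 0; i < N_RX; i++) {
                rx_bufs[i] = calloc(128, sizeof(int));
                if (rx_bufs[i] == NULL)
                        goto cleanup;   /* the second loop jumps to the same label */
        }
        return 0;

cleanup:
        /* One exit path frees everything allocated so far;
           free(NULL) is a safe no-op for the untouched slots. */
        for (i = 0; i < N_TX; i++)
                free(tx_bufs[i]);
        for (i = 0; i < N_RX; i++)
                free(rx_bufs[i]);
        fprintf(stderr, "allocation failed\n");
        return -1;
}

int
main(void)
{
        return setup() == 0 ? 0 : 1;
}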
Can someone explain the difference between zmalloc and calloc? The DPDK doc entries looked quite similar, so I don't know why zmalloc wouldn't cause the problems calloc did.
They're equivalent, but …
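The truncated reply likely points at the documented relationship: per the DPDK docs, rte_calloc(name, num, size, align) is equivalent to rte_zmalloc(name, num * size, align), and both return zeroed, aligned memory. A sketch of the two forms side by side (MAX_NFS and struct packet_buf are stand-in definitions here, not the project's):

#include <rte_malloc.h>
#include <rte_memory.h>

#define MAX_NFS 128                       /* hypothetical value */
struct packet_buf { void *buffer[32]; };  /* stand-in definition */

static void
alloc_both_ways(void)
{
        /* rte_calloc: num * size bytes, zeroed, cache-line aligned */
        struct packet_buf *a = rte_calloc(NULL, MAX_NFS,
                        sizeof(struct packet_buf), RTE_CACHE_LINE_SIZE);
        /* rte_zmalloc: the same request, size computed by the caller */
        struct packet_buf *b = rte_zmalloc(NULL,
                        MAX_NFS * sizeof(struct packet_buf),
                        RTE_CACHE_LINE_SIZE);
        rte_free(a);
        rte_free(b);
}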
Good work. We could also add a doc that guides devs on using DPDK correctly in the system, maybe in our wiki.
Also, I don't think this has dependencies on the other DPDK PR (refactoring to the new version), but we should double-check.
@catherinemeadows will mark this for approval since she reviewed it previously.
Summary:
Related to issues #200 and #201.
Fixed a few memory leaks. Replaced some libc function calls with the corresponding recommended DPDK function calls. Created a new function to free allocated memory upon manager termination. Added if statements to error-check after every allocation.
Here are DPDK's recommendations.
DPDK highlights that replacing the libc functions provides a performance benefit. In the tests that I ran, I did not see any significant changes in packet speed. This is probably because the function calls are made only once during NF setup and are not continuously called in the NF's packet handler.
There are some libc allocations remaining, noticeably in onvm config. Those functions parse the JSON config when booting with a JSON file, and they are called before DPDK is initialized in onvm_nflib_init(); using rte_malloc there is therefore not possible, nor would it have an impact on performance.
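To make that ordering constraint concrete, a minimal sketch (the buffer size and its use are hypothetical): the rte_malloc family only works once rte_eal_init() has set up the DPDK heap, so anything that runs earlier, like the JSON config parsing above, must stay on libc.

#include <stdlib.h>
#include <rte_eal.h>
#include <rte_malloc.h>
#include <rte_memory.h>

int
main(int argc, char *argv[])
{
        /* Pre-EAL: libc only. The DPDK heap does not exist yet,
           so rte_malloc/rte_zmalloc cannot be used here. */
        char *json_buf = malloc(4096);
        if (json_buf == NULL)
                return 1;
        /* ... parse the JSON config into json_buf ... */

        if (rte_eal_init(argc, argv) < 0) {
                free(json_buf);
                return 1;
        }

        /* Post-EAL: DPDK allocators are now available. */
        int *counters = rte_zmalloc(NULL, 64 * sizeof(*counters),
                        RTE_CACHE_LINE_SIZE);
        if (counters == NULL) {
                free(json_buf);
                return 1;
        }

        rte_free(counters);
        free(json_buf);
        return 0;
}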
Usage:
Run the manager as per usual. To test with Valgrind, follow the steps highlighted in issue #200. Note that it may take up to 20 minutes to finish, possibly much longer than that if you compile with line numbers.
Merging notes:
TODO before merging:
- Test whether the changes to onvm/onvm_nflib/onvm_nflib.c that use rte_calloc are causing possible packet rate loss.
Test Plan:
Before:
After: Memory definitely lost is now zero.
Note that I don't believe Valgrind is optimized very well for DPDK. Many of the existing 'errors' it picks up occur while initializing the Environment Abstraction Layer (EAL), in DPDK function calls such as rte_eal_init(); I believe these can be ignored. Note that DPDK recommends using their cleanup function; however, doing so caused a compile-time error: "is deprecated: Symbol is not yet part of stable ABI [-Werror=deprecated-declarations]".
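For what it's worth, that compile error is the message DPDK attaches to symbols marked __rte_experimental, so a likely workaround, assuming that is the cause here, is opting in to the experimental API before including the DPDK headers (or passing -DALLOW_EXPERIMENTAL_API in CFLAGS):

#define ALLOW_EXPERIMENTAL_API  /* must come before the DPDK headers */
#include <rte_eal.h>

static void
shutdown_eal(void)
{
        /* rte_eal_cleanup() releases EAL-held resources (e.g. hugepage
           memory), which also trims the "still reachable" noise that
           Valgrind reports at exit. */
        rte_eal_cleanup();
}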
Review:
@kevindweb