Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zebra: avoid race between FPM pthread and zebra main pthread in netlink encode/decode #17581

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

mjstapp
Copy link
Contributor

@mjstapp mjstapp commented Dec 4, 2024

The current FPM dataplane plugin uses some zebra netlink message encode and decode apis. That seems to have been done to avoid duplication - but there are races present because FPM uses its own pthread, the netlink code touches zebra internal datastructures, and that is not thread-safe. This PR attempts to re-work the netlink apis so that they become safe for FPM use.

Add several additional route attribute data and accessors
to the dplane module.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Mark Stapp added 5 commits December 5, 2024 09:23
Separate core netlink route message parsing into a new api that
uses a dplane ctx to hold the parsed attribute data. Use the new
api in two paths: the normal netlink update message parsing path,
and in the FPM plugin, which also uses netlink encoding. The FPM
route-notificatin code runs in its own pthread, and only needs
a subset of the route info that zebra ordinarily develops. This
change stops that pthread from accessing zebra's internal
data, such as vrfs and ifps, that are not thread-safe.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Remove a route api that's no longer used after refactoring the
netlink and FPM code.

Signed-off-by: Mark Stapp <mjs@cisco.com>
The netlink route message encode function accessed and used
zebra vrfs to produce debug output. That's not thread-safe, and
that encode code needs to run in (at least) the dplane pthread
and the FPM plugin pthread.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Remove use of vrf name, which isn't thread-safe.

Signed-off-by: Mark Stapp <mjs@cisco.com>
Improve a debug when we create a new rib_dest by calling the debug
after setting up the dest.

Signed-off-by: Mark Stapp <mjs@cisco.com>
@mjstapp
Copy link
Contributor Author

mjstapp commented Dec 5, 2024

pushed a fix for CI issues

@mjstapp
Copy link
Contributor Author

mjstapp commented Dec 5, 2024

CI:rerun

@mjstapp
Copy link
Contributor Author

mjstapp commented Dec 5, 2024

doing another full CI run

@donaldsharp
Copy link
Member

lot's of formatting issues :(

@mjstapp
Copy link
Contributor Author

mjstapp commented Dec 9, 2024

I think all of the clang reports are about making lines long; we've agreed that we'll permit longer lines in new code for folks who like them, but we won't mandate them from folks who don't?

lot's of formatting issues :(

Copy link
Member

@rzalamena rzalamena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the code and tested it. Looks good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants