-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lib: Fix startup problem with namespaces and interface names #17650
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should always receive "ifname" in case of vrflite backend, and always receive "netns:ifname" in case of netns backend. If you see that it works differently, then there should be some kind of a race condition, when a daemon receives configuration before it knows the backend type.
The change itself is correct, but it won't fix the actual underlying problem. And I worry it will make the problem even more hidden and difficult to debug. If I would change something in this code, I would rather add some asserts for vrflite case, to make sure that we never receive an ifname with a colon in its name.
A bigger problem is that it's not the only place where netns backend affects daemons. If you grep for vrf_is_backend_netns
/vrf_get_backend
you'll see that it's used in several daemons and also in if/vrf lib functions. So if a daemon starts processing configuration before it knows the backend type, it will end up in a bad state anyway.
To fix all potential race conditions, I propose to make -n
/-vrfwnetns
option global and require users to specify it for all daemons. Making it a startup option for all daemons makes sense, because we don't support changing the backend in runtime anyway. It's also easy to do using frr_global_options
variable in frr/daemons file.
What do you think?
static void netns_ifname_split(const char *xpath, char *ifname, char *vrfname) | ||
/* | ||
* sometimes the interface name passed is just the interface name `ifname` | ||
* sometimes it's `<vrf name>:<interface>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is true. At least it shouldn't be. For vrf-lite it should always be just ifname, and for netns it should always be netns:ifname.
Currently we have a startup problem with interface yang data passed through the system. The sequence of events is this: a) mgmtd is sending yang data to daemons before the daemon is connected to zebra. b) the upper level daemon connects to zebra and gets if zebra is configured to use namespace based vrfs. Since a) happens before b), lib/if.c is using operational data about the system to decide if the yang passed in is valid. Thus making it invalid in the above sequence of events. If b) happens before a) then things just work. During debugging I noticed that the interface name passed in during yang was one of three things: a) `<namespace name>:<interface name>` c) `<interface name>` Remove the dependancy on knowing the backend to parse the interface name. If there is a delimiter grab the interface name as appropriate and put in the namespace name. If there is no delimeter just return a empty vrf name and set the interface name as appropriate. With this change we just need to look at the vrf namespace name to decide if it is necessary to treat as namespace or not. Signed-off-by: Donald Sharp <sharpd@nvidia.com>
9bd8ef1
to
6c7d696
Compare
Currently we have a startup problem with interface yang data passed through the system. The sequence of events is this:
a) mgmtd is sending yang data to daemons before the daemon is connected to zebra.
b) the upper level daemon connects to zebra and gets if zebra is configured to use namespace based vrfs.
Since a) happens before b), lib/if.c is using operational data about the system to decide if the yang passed in is valid. Thus making it invalid in the above sequence of events. If b) happens before a) then things just work.
During debugging I noticed that the interface name passed in during yang was one of three things:
a)
<namespace name>:<interface name>
b)
<vrf name>:<interface name>
c)
<interface name>
Remove the dependancy on knowing the backend to parse the interface name. If there is a delimiter grab the interface name as appropriate and put in the vrf name either the vrf name or the namespace name. If there is no delimeter just return a empty vrf name and set the interface name as appropriate.