-
Notifications
You must be signed in to change notification settings - Fork 662
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
autoconnect nbft systemd service #2078
Conversation
No objections. |
|
||
[Service] | ||
Type=oneshot | ||
ExecStartPre=/sbin/modprobe nvme-fabrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using ExecStartPre
to load the kernel module, I think that the preferred way is to do as follows under the [Unit]
section:
[Unit]
Wants=modprobe@nvme_fabrics.service
After=modprobe@nvme_fabrics.service
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds reasonable. Good catch! Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call me stupid, but where are the actual 'connect' calls?
Are they done by the 'nvmf-autoconnect.service'?
If so, shouldn't we add an explicit order like 'Before' here?
If not, how does autoconnect happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And, incidentally, we're using 'ExecStartPre' in 'nvmf-autoconnect.in', too. Maybe we should be changing that, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The connect call is there ExecStart=@SBINDIR@/nvme connect-nbft
. The github web UI is not really good for reviewing. See the last line is a below this comment.
Anyway, I think this should be:
Wants=modprobe@nvme_fabrics.service
After=modprobe@nvme_fabrics.service
After=network-online.target
Before=remote-fs-pre.target
Yep, makes sense to update nvmf-autoconnect.in
alongside.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Call me stupid, but where are the actual 'connect' calls? Are they done by the 'nvmf-autoconnect.service'? If so, shouldn't we add an explicit order like 'Before' here? If not, how does autoconnect happen?
The nvm connect-all --nbft call is currently done by dracut.
So maybe it's not a good idea to add it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Argl. I so hate the 'nbft' (and it's previous incarnation 'ibft') network device naming.
You always have the trouble to keep the udev rules in sync between initrd and the running system, not to mention the confusion arising when for some reason not booting from nbft
or nbft parsing fails after configuration.
Any ideas how to improve this then? I know with newer kernels a device can have alternative names. If we could make one of these 'stable' we don't have to do this udev rename dance. |
@hreinecke, we have been discussing this up and down in various places. My conclusion is that using a separate name space like See https://bugzilla.suse.com/show_bug.cgi?id=1194910#c45 and follow-up comments. Your point about "not booting from the NBFT" is valid. But TBH, if a system that's normally booted via NBFT now is booted differently, it's a different system. A common problem is the difference between installation time (possibly without NBFT) and runtime (with NBFT), but our NVMe-oF/TCP testing with the Note also that the udev rule in this PR is much simpler than the rule-generator approach that we currently have for iSCSI. Changing iSCSI to use the same approach is on my todo list. The current status quo, like it or not, is that dracut generates the |
Obviously, upstream nvme-cli should not enforce any interface naming policy. But the udev rule in this PR doesn't. It just ensures that names matching a certain pattern, which is distinct from any other interface name patterns, shouldn't be changed. It's debatable whether such a rule should be part of upstream nvme-cli, but don't see a big problem with it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it should work. NVMe/FC boot does rely on nvme-cli, but since the nvmfc-boot-connections.service runs first, and nvme connect-all doesn't currently default --nbft, this should work.
|
||
[Service] | ||
Type=oneshot | ||
ExecStart=@SBINDIR@/nvme connect-nbft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Has any one tested this with nvme-fc boot? We also support nvme-fc boot too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did some testing, but more can't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnmeneghini's comment makes me realize that this is the legacy syntax. It must be nvme connect-all --nbft
for upstream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the nvme invocation. Anything else to do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still have some concerns about how this will work together with dracut and network manager during nbft boot. I've asked @tbzatek to review this. Can we add Tomas as a reviewer?
https://github.com/tbzatek
I'd also like to test these changes in my nvmft boot testbed before we merge. I'll do that next week.
|
||
[Service] | ||
Type=oneshot | ||
ExecStart=@SBINDIR@/nvme connect-nbft |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@johnmeneghini's comment makes me realize that this is the legacy syntax. It must be nvme connect-all --nbft
for upstream.
Use Wants/After to let systemd handle loading the kernel module, which is the preferred way. Signed-off-by: Daniel Wagner <dwagner@suse.de>
Create a separate unit file for connecting to NBFT-defined subsystems. This unit is intended to be called in "post-up" scripts from network management software if an interface defined in the HFI section of the NBFT is brought up (L3-configured). In simple scenarios with just one HFI, this won't be necessary because the interface must be brought up in the initramfs already. But in multipath scenarios, the initramfs may choose not to wait for every HFI to come up, and thus it may be necessary to bring up the secondary connection(s) later on. Signed-off-by: Martin Wilck <mwilck@suse.com> [dwagner: use unit options instead of ExecStartPre update nvme command line] Signed-off-by: Daniel Wagner <dwagner@suse.de>
In the initramfs, the interface naming is taken care of by dracut. But there are various network-interface-naming policies in place which may attempt to rename the interface, causing confusion and possibly wrong interface parameters. Add an udev rule that avoids renaming any network interface that has been assigned a name nbft$N, which is by convention the naming scheme that is used for NBFT device in the initramfs. Note: The simpler 'NAME:="%k"' directive doesn't work because udev rejects it ('Ignoring NAME="%k", as it will take no effect.'). The ":=" syntax makes sure the interface isn't renamed any more by later rules. "INTERFACE" is set by the kernel in the "add" uevent for a network interface. Signed-off-by: Martin Wilck <mwilck@suse.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a brief look it looks fine, I'm not a systemd expert though. I'm going to test it on a real environment but I don't see anything that should block this pull request.
@johnmeneghini, wrt your concerns about NetworkManager: The autoconnect scripts are only concerned with the NVMe part of the connection establishment process. Making another attempt to connect after switching root makes sense if dracut didn't bring up all connections during initrd processing, which can happen
In case 1., making a second autoconnect attempt after switching root will only have an effect if the network management tool (NetworkManager) had succeeded in bringing up network routes that dracut had failed to bring up. Whether that has any chance to succeed depends on the configuration. In case 2., autoconnect may succeed even with no post-switch-root changes in the network configuration. Either way, this is orthogonal to any attempts to set up NBFT-defined network connections. The service starts after "network-online.target", and that's where NM comes into play. While working towards "network-online.target", NM might read the NBFT, see if all NBFT-defined network interfaces have been configured, and try to bring up any missing ones. I think we all understand that "network-online.target" is a rather clumsy all-or-nothing target which is a good fit for laptops that need a working internet connection, but less so for a server booting from NVMeoF in a complex networking environment. |
Okay, this appears to be working fine on my simple test setup. Uptime 19 hours without any glitch. |
Thanks! |
Add support to auto connect via NBFT.
@mwilck during updating our downstream packages I noticed these patches never made it upstream. I think they should also be upstream. Any objections?