Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nvmf / IPv6 related fixes for SLE15-SP5 #254

Merged
merged 7 commits into from
Mar 15, 2023

Conversation

mwilck
Copy link

@mwilck mwilck commented Feb 9, 2023

Fixes for IPv6 issues that were found in NVMeoF boot testing.

Checklist

  • I have tested it locally
  • I have reviewed and updated any documentation if relevant
  • I am providing new code and test(s) for it

This is equivalent to the update of the usptream PR dracutdevs#2184 that I did today (91e1ee9).

ibft_to_cmdline() formats a static IPv6 address like this
(without peer, gateway, hostname):

  ip="[fd09:9a46:b5c1:1fe:1::10/64]:::::ibft0:none"

This means that the "mask" field (4th) is left blank. When this is
parsed later by parse-ip-opts.sh, it bails out with the error
message "Sorry, automatic calculation of netmask is not yet supported".

parse-ip-opts.sh rather expects the prefix in the 4th field:

  ip="[fd09:9a46:b5c1:1fe:1::10]:::64::ibft0:none"

This syntax will be correctly transformed to the command

  ip addr add fd09:9a46:b5c1:1fe::10/64 dev ibft0

This patch fixes the formatting of the "ip=" line in ibft_to_cmdline().
Assuming a default prefix length of 64 by default if no explicit
prefix length is given is wrong and known to cause connectivity
problems in some networks. A prefix length of 128 should be assumed
in this case.

See https://www.rfc-editor.org/rfc/rfc5942 (specifically section 5)
    https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=684009
For static IPv4 configurations, we use arping to check for duplicate
IP addresses. arping requires the af_packet module to work, and if arping
fails, the IP address will not be set and booting will fail.
af_packet may not be loaded / required in the running system, for example if
the system had been booted in an IPv6 configuration, or if it had been
manually unloaded. Make sure it's included in initramfs in hostonly mode, too.
This patch reverts commit
c603419 ("wait for IPv6 RA if using none/static IPv6 assignment").
It's not generally correct wait for a default route to be established
for an interface, or to wait for "proto ra" routes in general.
For example, if the system is a router itself, it will receive no
RAs. In isolated networks, no gateway may be advertized, either.
This is similar in spirit to 76f6566 ("Revert "wait for IPv6 RA
if using none/static IPv6 assignment"")

Whatever c603419 ("wait for IPv6 RA if using none/static IPv6 assignment")
was supposed to achieve, it should be done differently.
Copy link
Collaborator

@tblume tblume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, calling the nvmf-autoconnect.sh script in multiple stages might cause quite a delay until the boot finally succeeds or fails.
@mwilck how long was the the delay in your tests in the worst case?

@mwilck
Copy link
Author

mwilck commented Feb 9, 2023

Hm, calling the nvmf-autoconnect.sh script in multiple stages might cause quite a delay until the boot finally succeeds or fails. @mwilck how long was the the delay in your tests in the worst case?

In my testing, nvme connect-nbft has always been very fast. What takes long is the network setup. But of course I can't prove that under certain condition the connect command itself might run into timeouts.

Note that (in the current code) I haven't installed a scipt in the main "initqueue" which is called every second.
The logic I implemented is the same that's used for iSCSI, using the "online", "settled", and "timeout" hooks. In my testing, 90% of the time the script run in the "online" hook succeeds immediately. "timeout" is just the last resort, and will never be reached except in already problematic situations. "settled" will be called a few times, but not very often.

@tblume
Copy link
Collaborator

tblume commented Feb 10, 2023

Ok, thanks Martin.
LGTM

@mwilck mwilck force-pushed the SLE15-SP5-nbft-ipv6 branch from f8e87c9 to c6d4d1d Compare February 10, 2023 10:27
@mwilck
Copy link
Author

mwilck commented Feb 10, 2023

Pushed shfmt fixes.

@mwilck mwilck force-pushed the SLE15-SP5-nbft-ipv6 branch from c6d4d1d to 637beae Compare February 10, 2023 10:37
@mwilck
Copy link
Author

mwilck commented Feb 10, 2023

Fixed commit message as remarked by @aafeijoo-suse on dracutdevs/pull/2184

Rather than trying to connect immediately from parse-nvmf-boot-connections.sh,
move the connect logic to an initqueue script which is called at various
stages (settle, online, and timeout). In the timeout case, just try every
possible connection. Otherwise, use the existing priority logic.

This retry logic is the same that the iscsi module uses.

The nvme connect-nbft command will do "the right thing" when the connections
specified in the NBFT are already established: Already existing connections
will be skipped. Currently nvme-cli will print an error message and exit
with error status even if all targets are already connected. This should be
handled more gracefully on the nvme-cli side.
@mwilck mwilck force-pushed the SLE15-SP5-nbft-ipv6 branch from 637beae to a414e54 Compare February 10, 2023 15:20
@mwilck
Copy link
Author

mwilck commented Mar 11, 2023

What's holding back this PR? It causes almost boot failure for NVMeoF / IPv6 in static configurations. There's practically no way to recover, the system just halts hard.

@aafeijoo-suse
Copy link
Collaborator

What's holding back this PR? It causes almost boot failure for NVMeoF / IPv6 in static configurations. There's practically no way to recover, the system just halts hard.

I thought we wanted to wait until the upstream code is accepted, because the basic tech preview feature is already there, but maybe I misunderstood. Do we need to include this in 15-SP5 GM?

@mwilck
Copy link
Author

mwilck commented Mar 13, 2023

Do we need to include this in 15-SP5 GM?

IMO, yes. These are fixes for the code we already have in beta.
True, it remains problematic that upstream acceptance is going slowly.
But that's not related to this PR.

@aafeijoo-suse aafeijoo-suse merged commit 5603b00 into openSUSE:SLE-15-SP5_GA Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants