Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(zfcp_rules): use the right wwpn and lun vars and support new device format #2631

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

aafeijoo-suse
Copy link
Member

Commit c8e5312 introduced a regression that causes the parsed _wwpn and _lun from SCSI device nodes to not be passed to the create_udev_rule function.

Also, the format of these udev-created SCSI device nodes has changed [1], now it looks like: ccw-<device_bus_id>-fc-<wwpn>-lun-<lun>-part<n>

Fixes c8e5312

[1] https://www.ibm.com/docs/en/linux-on-systems?topic=nodes-udev-created

Checklist

  • I have tested it locally
  • I have reviewed and updated any documentation if relevant
  • I am providing new code and test(s) for it

@aafeijoo-suse aafeijoo-suse requested a review from tblume February 28, 2024 16:04
@github-actions github-actions bot added modules Issue tracker for all modules zfcp_rules Issues related to the zfcp_rules module labels Feb 28, 2024
…ce format

Commit c8e5312 introduced a regression that
causes the parsed `_wwpn` and `_lun` from SCSI device nodes to not be passed
to the `create_udev_rule` function.

Also, the format of these udev-created SCSI device nodes has changed, now it
looks like: `ccw-<device_bus_id>-fc-<wwpn>-lun-<lun>-part<n>`

Fixes c8e5312
@aafeijoo-suse aafeijoo-suse force-pushed the zfcp-rules-parse-wwpn-lun-fix branch from fb0677e to 829aac6 Compare February 29, 2024 07:22
@aafeijoo-suse
Copy link
Member Author

Similar to #2630, this file will be removed by #2534. Now the zfcp rules would be created by https://github.com/ibm-s390-linux/s390-tools/blob/master/zdev/dracut/95zdev/parse-zfcp.sh, although I don't see support for root= there (only for rd.zfcp, but maybe I'm overlooking something).

Copy link
Collaborator

@tblume tblume left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@steffen-maier
Copy link

Similar to #2630, this file will be removed by #2534. Now the zfcp rules would be created by https://github.com/ibm-s390-linux/s390-tools/blob/master/zdev/dracut/95zdev/parse-zfcp.sh, although I don't see support for root= there (only for rd.zfcp, but maybe I'm overlooking something).

It could very well be that I'm overlooking something.

My reasoning for not carrying over the code regarding root=, resume=, and using information about the boot (ipl) device was:
Users need to use (dm) multipathing.
https://www.ibm.com/docs/en/linux-on-systems?topic=know-scsi-disk-device-nodes#scsi_nodes__title__2
https://public.dhe.ibm.com/software/dw/linux390/lvc/zFCP_Best_Practices-BB-Webcast_201805.pdf#page=15
If root= or resume= would contain a (persistent) storage device name that happens to represent one single (z)FCP path (such as a by-path symlink), it would be a wrong setup.
The initial boot (IPL) indeed does happen over one single path. However, I would expect users to expect redundancy even in early initrd, which means they need more than the one root-fs path activated; plus, in general, the boot record could be on a different disk than the root-fs so the boot device does not necessarily provide information about paths to the root-fs. So initrd needs a way to know all paths to all dependencies to mount the root-fs.

  • In hostonly mode this information is stored inside initrd (this seems the default for typical distros). ibm-s390-linux/s390-tools@f2c82bf does it in a distro-independent way (chzdev --export). This and 95zfcp_rules/module-setup.sh make use of the dracut helper fuction for_each_host_dev_and_slaves_all to walk the root-fs dependency graph. ibm-s390-linux/s390-tools@06a30ae extends the former commit further (note that the support for hostonly-cmdline is somewhat duplicate because the former commit already stores the device configuration with chzdev's persistent config mechanism (currently happens to be udev rules) in initrd, so rd.zfcp entries in a stored cmdline are not necessary, but don' hurt either because parsing of rd.zfcp also uses chzdev and so we won't have duplicate device configuration because chzdev merges input for the same devices).
  • Without hostonly mode, users could specify multiple rd.zfcp= on the boot cmdline.

What do you think?

@aafeijoo-suse
Copy link
Member Author

It could very well be that I'm overlooking something.

My reasoning for not carrying over the code regarding root=, resume=, and using information about the boot (ipl) device was: Users need to use (dm) multipathing. https://www.ibm.com/docs/en/linux-on-systems?topic=know-scsi-disk-device-nodes#scsi_nodes__title__2 https://public.dhe.ibm.com/software/dw/linux390/lvc/zFCP_Best_Practices-BB-Webcast_201805.pdf#page=15 If root= or resume= would contain a (persistent) storage device name that happens to represent one single (z)FCP path (such as a by-path symlink), it would be a wrong setup. The initial boot (IPL) indeed does happen over one single path. However, I would expect users to expect redundancy even in early initrd, which means they need more than the one root-fs path activated; plus, in general, the boot record could be on a different disk than the root-fs so the boot device does not necessarily provide information about paths to the root-fs. So initrd needs a way to know all paths to all dependencies to mount the root-fs.

* In hostonly mode this information is stored inside initrd (this seems the default for typical distros). [ibm-s390-linux/s390-tools@f2c82bf](https://github.com/ibm-s390-linux/s390-tools/commit/f2c82bf2b5043313174d13ca464ee670b9a19681) does it in a distro-independent way (`chzdev --export`). This and 95zfcp_rules/module-setup.sh make use of the dracut helper fuction `for_each_host_dev_and_slaves_all` to walk the root-fs dependency graph. [ibm-s390-linux/s390-tools@06a30ae](https://github.com/ibm-s390-linux/s390-tools/commit/06a30ae529a5d6ad2369ed81da056bf3a6147bb6) extends the former commit further (note that the support for hostonly-cmdline is somewhat duplicate because the former commit already stores the device configuration with chzdev's persistent config mechanism (currently happens to be udev rules) in initrd, so rd.zfcp entries in a stored cmdline are not necessary, but don' hurt either because parsing of rd.zfcp also uses chzdev and so we won't have duplicate device configuration because chzdev merges input for the same devices).

* Without hostonly mode, users could specify multiple rd.zfcp= on the boot cmdline.

What do you think?

Thanks for this extensive explanation, definitely it was me who was overlooking the required expertise with this kind of setups.

The bugs of this PR and #2630 appeared last week while we were testing our latest distro, but they are not fixing critical errors, because nowadays the dracut modules shipped with s390-tools are doing the most important job. The problem is these old upstream s390 modules are being included by default and show confusing errors in the system log.

Therefore, upstream needs to move to #2534 soon and remove obsolete s390 modules.

@steffen-maier
Copy link

The bugs of this PR and #2630 appeared last week while we were testing our latest distro, but they are not fixing critical errors, because nowadays the dracut modules shipped with s390-tools are doing the most important job. The problem is these old upstream s390 modules are being included by default and show confusing errors in the system log.

Therefore, upstream needs to move to #2534 soon and remove obsolete s390 modules.

Any idea how to make progress on #2534 and get it integrated/merged timely?

Currently, I see 136 open pull requests in dracut, including your #2630 and #2631.

I'm not familiar with the requirements to merge dracut PRs. For #2534, it states "At least 2 approving reviews are required by reviewers with write access". Harald, Jóhann, and Lukáš are required reviewers. Dan kindly provided review with approved flag. But I'm not sure if that counts towards the required 2 reviews, because he might not have write access. Github doc sounds like it won't count because Dan's review approval has grey but not green color.

Maybe it would help, if you (@aafeijoo-suse) and Thomas (@tblume) could also kindly provide a review (approval) for #2534?

Here are some examples of what's in it for you (see also SUSE bug 1210597 comment from 2023-10-13):

  • Support for distro-independent rd.znet=, which is very useful for distro installation on s390 where they are often network based. E.g. to pull a large installer ramdisk, such as a live system image, which does not fit into the installer initrd built by dracut and used for the initial boot of the installer. This includes debug level support during build of an initrd. 7cf164b a52b248
  • A distro-independent mechanism to transfer user choices at boot time (rd.dasd, rd.zfcp, rd.znet) from the volatile dracut initrd into the next root file system being switched to during initrd processing. This is very useful for debugging and for distro installers. The latter can import this information as persistent s390 device configuration into an installer ramdisk / live image, add additional interactive user choices from the installer UI or unattended installation answer file, and finally export the complete configuration to the installation target disk. For the import and export, the installer does not have to know anything about s390 device types or their properties, it's competely opaque and transparent by just calling the chzdev tool to perform the job. Admittedly, this mostly came with consolidated s390 device configuration ibm-s390-linux/s390-tools#158 and thus https://github.com/ibm-s390-linux/s390-tools/releases/tag/v2.31.0 for rd.dasd and rd.zfcp; but for rd.znet it requires above change in dracut to use chzdev as backend for rd.znet.
  • Fixed parsing of rd.zfcp=. Admittedly, this mostly came with consolidated s390 device configuration ibm-s390-linux/s390-tools#158 and thus https://github.com/ibm-s390-linux/s390-tools/releases/tag/v2.31.0 using a cmdline hook with priority 95. So that runs in addition to and after 95zfcp_rules with cmdline hook at prio 30.
  • S390 cio_ignore handling is completely and transparently handled by chzdev. No more need for /boot/zipl/active_devices.txt or 81cio_ignore (except for maybe sensing ignored devices, which might not be handled by chzdev). The description of 7cf164b contains how this works (inst_rules[_wildcard] does not work). [see also SUSE bug 1216257]
  • Better chances of a successful kdump in large production environments with many (disk) devices [dasd or zfcp] by not copying configuration of unnecessary devices into initrd (i.e. only copying the config of required devices for the kdump target).
  • Theoretically, you could even use the now distro-independent 80cms, e.g. in an installer environment running in a z/VM guest, to bootstrap a distro installation over the network and to reduce the amount of bootstrap information necessary on the size-limited kernel parameter line.

@aafeijoo-suse
Copy link
Member Author

The bugs of this PR and #2630 appeared last week while we were testing our latest distro, but they are not fixing critical errors, because nowadays the dracut modules shipped with s390-tools are doing the most important job. The problem is these old upstream s390 modules are being included by default and show confusing errors in the system log.
Therefore, upstream needs to move to #2534 soon and remove obsolete s390 modules.

Any idea how to make progress on #2534 and get it integrated/merged timely?

Currently, I see 136 open pull requests in dracut, including your #2630 and #2631.

Ah don't bother about these 2 PRs, both were submitted for "public knowledge" only. The most important and the one we want to address is #2534.

I'm not familiar with the requirements to merge dracut PRs. For #2534, it states "At least 2 approving reviews are required by reviewers with write access". Harald, Jóhann, and Lukáš are required reviewers. Dan kindly provided review with approved flag. But I'm not sure if that counts towards the required 2 reviews, because he might not have write access. Github doc sounds like it won't count because Dan's review approval has grey but not green color.

Maybe it would help, if you (@aafeijoo-suse) and Thomas (@tblume) could also kindly provide a review (approval) for #2534?

As you noticed, dracut upstream is just a "bulletin board" nowadays. The admins and and the distros have abandoned the project (sorry if this statement hurts any feelings, but it's what it's). The remaining non-admin people with write permissions are Thomas and me (both from SUSE). If upstream only consists of SUSE employees with limited rights, it's not a real upstream. So, we prefer to be focused on our downstream GitHub fork and only merge things there (if they add some value to us, of course).

Here are some examples of what's in it for you (see also SUSE bug 1210597 comment from 2023-10-13):

Definitely we will review your PR, both #2630 and #2631 are just an example of the bug reports we are receiving because dracut ships obsolete s390 modules (and documentation as well). Over the past few months, our data center moved to a new location and this caused a disruption to our available resources. As soon as I have an instance to test it on real hw along with the latest version of s390-tools, I'll give you a review. Thanks for your work here!

As a first feedback, it would be great if you could take a look at the dracut man pages and remove the obsolete documentation in your PR.

@steffen-maier
Copy link

steffen-maier commented Mar 6, 2024

Hi Antonio,

thanks for the information. Much appreciated. I wasn't aware.

As soon as I have an instance to test it on real hw along with the latest version of s390-tools, I'll give you a review.

FYI, I successfully function tested ibm-s390-linux/s390-tools#158 with #2534 as well as with openSUSE/kdump#40 on SLES15.

Definitely we will review your PR, both #2630 and #2631 are just an example of the bug reports we are receiving because dracut ships obsolete s390 modules (and documentation as well).

As a first feedback, it would be great if you could take a look at the dracut man pages and remove the obsolete documentation in your PR.

Could you please elaborate a bit more on the obsolete documentation?

a52b248 updates (and keeps) the documentation of rd.znet (and rd.znet_ifname).

I think we need a well-known central documentation place, which is why I intentionally kept rd.dasd and rd.zfcp in dracut's man/dracut.cmdline.7.asc. ff81fa5, 9633857, e229f02, 012b26a, and 48cff12, contain the following commit description part:

Even though this removes one implementation of parsing {rd.zfcp,rd.dasd} in dracut,
above s390-tools change introduces another implementation of parsing the
exact same {rd.zfcp,rd.dasd} syntax. Therefore, it would be good to keep the
documentation in man/dracut.cmdline.7 of dracut as one central place
describing all s390 device types that dracut handles.

s390-tools zdev (incl. its dracut modules part) is packaged such that it is part of any minimal distro installation on s390. Hence, dracut on s390 also has the zdev dracut modules available. If necessary and not done yet, dracut packaged for s390 could have an explicit Requires dependency on the (sub)package of s390-tools meant to be part of any minimal installation. Therefore, I think that we can still consider the functionality of rd.dasd and rd.zfcp (have always been s390-specific) being part of core dracut on s390.

I fear users would be confused if the long-standing documentation entries would vanish from dracut. An alternative would be to move the two dracut cmdline option descriptions to something like the deprecated section (does not really fit there). Or just let them where they are and replace the description text with a reference elsewhere, but we don't have this "elsewhere" currently in s390-tools, plus I would not like it as user / reader to all of a sudden have to read multiple independent docs at different places. @oberpar, @vneethv, we had briefly touched this documentation topic a while ago, what do you think?

rd.zfcp.conf will only become obsolete / deprecated (without functionality as it's no longer necessary) with f98d41e from #2632. I'm going to update man/dracut.cmdline.7.asc in that commit.

@aafeijoo-suse
Copy link
Member Author

I fear users would be confused if the long-standing documentation entries would vanish from dracut.

Yes, that's right. But now dracut won't be responsible for handling these command line options anymore. Maybe we could keep the entries but add "this is not supported by dracut anymore" message or something like that, and a reference to the s390-tools man page that describes them, which is the place where this is handled now.

@johannbg
Copy link
Collaborator

As you noticed, dracut upstream is just a "bulletin board" nowadays. The admins and and the distros have abandoned the project (sorry if this statement hurts any feelings, but it's what it's). The remaining non-admin people with write permissions are Thomas and me (both from SUSE). If upstream only consists of SUSE employees with limited rights, it's not a real upstream. So, we prefer to be focused on our downstream GitHub fork and only merge things there (if they add some value to us, of course).

@aafeijoo-suse not abandoned, been busy at dayjob for past 2 years not particular problem at the moment since I quit that job few days back due to me being pretty much overworked, I guess people these days would call "burned out".

Beside me and Harald holding ownership over the project(s), you and I have maintainership permission over dracut repo, tblume has write access along with 2 other RH employees which I trust but are not particular active in the project outside of their line of work inside RH. No one from RH/BRNO will be granted elevated access after the access/release shenanigans. Any employee from RH will start at the same level as any other contributor to the project that is not employed there as in on ground 0, having to work it's way up just like anyone else. The only way that will be changed is if I will have direct discussion with someone I'm familiar with from RH/Raleigh

Anyways back to the s390-tools PR's what's the current status of that and in which direction are the upstream 390 taking with regards to init systems, is it being generic or moving towards systemd only?

That said I'm not particularly fond of "partial" upstream, with that I mean we should not support some aspects of 390 here upstream while other parts of it is being maintained in 390 upstream repo which will double the load on everyone involved and quadruple it for anykind of support issues.

@dtardon
Copy link
Contributor

dtardon commented Mar 22, 2024

Beside me and Harald holding ownership over the project(s),

You mean you're holding ownership. Harald might still retain his, but he's not been involved for years.

you and I have maintainership permission over dracut repo, tblume has write access along with 2 other RH employees which I trust but are not particular active in the project outside of their line of work inside RH

Inactive project members might just as well not exist at all. Any outside observer can see that the project is dysfunctional. Issues are ignored, PRs aren't being reviewed (just 2 trivial ones have been merged this year), questions aren't being answered. That PRs are being closed as "stale" by a bot after months of waiting for review must feel like a ridicule to the submitters. A group of downstream maintainers just created a new release outside of the project (Fedora announcement). IOW, the way you--as the project owner--are managing the project is not working. The sooner you admit that the better for dracut.

No one from RH/BRNO will be granted elevated access after the access/release shenanigans. Any employee from RH will start at the same level as any other contributor to the project that is not employed there as in on ground 0, having to work it's way up just like anyone else.

I'm at loss at what you're expecting here. There isn't any queue of people eager to contribute to dracut inside Red Hat. The downstream maintainers are still the same people you kicked off.

Copy link

stale bot commented Apr 22, 2024

This issue is being marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. If this is still an issue in the latest release of Dracut and you would like to keep it open please comment on this issue within the next 7 days. Thank you for your contributions.

@stale stale bot added the stale communication is stuck label Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
modules Issue tracker for all modules stale communication is stuck zfcp_rules Issues related to the zfcp_rules module
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants