-
Notifications
You must be signed in to change notification settings - Fork 908
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cc_grub_dpkg: determine idevs in more robust manner with grub-mkdevicemap #358
Conversation
ae04dd1
to
8b20448
Compare
Thanks for the submission, Matthew! There's some discussion of this approach happening in the bug, so I'm marking this "wip" until we've reached a consensus. |
8b20448
to
2759d58
Compare
This new revision drops nearly all the complexity from the previous version. Changes:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is really good I like it ! I think this should be merged now.
Personally i would have parsed udevadm output, but walking symlinks in /dev/disk is valid too. Somebody might want to refactor it that way later. But that's just style/attitude refactor, rather than a change of behaviour.
cloudinit/config/cc_grub_dpkg.py
Outdated
devlist = util.subp(['awk', '{print $1}'], | ||
data=devlist, capture=True)[0].strip() | ||
for device in devlist.split(): | ||
if str(Path(device).resolve()) == disk: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all of the above can be done in one go:
$ udevadm info --query property $(grub-probe -t disk /boot) | grep -e DEVNAME= -e DEVLINKS
DEVNAME=/dev/nvme0n1
DEVLINKS=/dev/disk/by-path/pci-0000:04:00.0-nvme-1 /dev/disk/by-id/nvme-eui.00000000000000006479a71d90513837 /dev/disk/by-id/nvme-Sabrent_27FF079419DE00131387
Which queries grub-probe device, and gives you DEVNAME= and DEVLINKS=, if there is /by-id/ in devlinks you can use the first /by-id/ one there, otherwise fallback to /by-path/, or DEVNAME itself.
If that fails, it should produce no stdout output and only stderr messages i.e.:
echo $(udevadm info --query property $(grub-probe -t disk /non-existant) | grep -e DEVNAME= -e DEVLINKS) 2>/dev/null
grub-probe: error: failed to get canonical path of `/non-existant'.
A device name or path is required
But that's just style / refactor comment. What you do here is correct and valid too. And because I love using udevadm
so much. Walking /dev/disk/by-id
should be the same and probably easier to handle spaces and escaping.
Thanks for the review @xnox! Just for clarity, @matthewruffell, I asked Dimitri to review the PR from a grub POV. I'm going to follow-up with a code review from a cloud-init perspective in the next day or two. (We have no indication that this is a high priority, so I've just added this to my TODO list for now. If there's some time sensitivity here that I'm not aware of, please let me know!) |
Thank you for the review @xnox, and for your insight into udevadm as an alternative to walking symlinks. @OddBloke Thanks for being clear with next steps. There is some time sensitivity, but nothing major, with SLA expiry coming up mid next month on 2020-06-14. The bug this patch fixes is more of a nuisance than anything, as it ensures that user's don't see the interactive dpkg prompt for selecting an install location when they upgrade their grub2 packages. I will be available for quick iterations if you would like anything changed, just let me know. As for getting this SRU'd into Ubuntu releases, do I need to submit any patches, or do you take care of that? I read https://wiki.ubuntu.com/CloudinitUpdates and it seems you backport the current tree. I'm happy to help test regardless. |
I plan on reviewing this after lunch, so I'll let you know, thanks!
Yep, this would go back as part of a regular SRU. That said, we're about to start an SRU process and we wouldn't expect to do another one before the described deadline. I'm wondering, @blackboxsw, if we should hold on the SRU until this lands. @matthewruffell, what are the target Ubuntu releases that the SLA applies to? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the speedy turnaround, Matthew!
At a high-level, the main problem here is that we're shelling out to perform some string/list manipulation that would be much better performed in Python code. I've left some other comments where they made sense, but this'll need a re-review once it's all written in Python.
2759d58
to
4fa92fa
Compare
Again, thanks for the review and feedback @OddBloke, these iterations really do help improve code quality. Changelog:
Tested on AWS c5d.large (nitro) and t2.micro (xen) instances, LXD (with and without grub-common installed), KVM (BIOS and EFI). How to test:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, thanks for the review and feedback @OddBloke, these iterations really do help improve code quality.
Thank you for iterating!
Changelog:
* Removed subprocess calls to sort and awk, and implemented their string manipulation in pure Python
* Dropped status_cb from the subp() call for grub-probe
* We no longer have to resolve symlinks twice.
This is all good stuff, thanks! I have a few other inline comments/thoughts.
I will also want you to take a look at writing some unit tests for the module. I didn't mention this before, because I didn't want you to spend time writing tests for code that we were still iterating on, but now that I think we're beginning to settle on a form for the code that makes sense (assuming you're happy with my inline comment about refactoring, that is), I think it's worth mentioning.
Let me know if you'd like some guidance, I'd be happy to chat on IRC or in a video call. We have some documentation on what we expect of unit tests in our HACKING doc and test_resolv_conf.py should give you an idea of the preferred structure,
Oh, and as an aside: add your GH username to |
4fa92fa
to
f428cd8
Compare
Again, @OddBloke thanks for the review! These iterations have been great, and I can see my code getting better, smaller and more robust with each feedback cycle. To tell you the truth, I have never written a unit test in my life, but I followed the docs, looked at a lot of the existing examples and somehow managed to write one that works. It is very messy, and will likely need another revision. The tests are currently violating the 80 char line limits, and suggestions on how to fix that would be welcome. Changelog:
|
Replace the hardcoded list of devices with a more robust way of determining the device which grub is installed to. We use grub-probe to fetch the underlying disk the /boot directory is located on, and attempt to match the disk with its /dev/disk/by-id value. If no such /dev/disk/by-id/ value exists, we fallback to the plain disk name. The changes are robust to unstable kernel device names and ordering, and use /dev/disk/by-id values to populate grub-pc/install_devices where possible. LP: #1877491
Add a unit test for fetch_idevs() as a part of the refactoring effort for cc_grub_dpkg. There are 5 unit tests which correspond to common environments which cloud-init can be expected to run. LP: #1877491
f428cd8
to
9a4c244
Compare
Hi @OddBloke, please review my latest iteration. Changelog:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Matthew, I agree that this is getting better and better. Thank you for your continued engagement in iterations, it's much appreciated!
(Comments inline. 👍)
Replace the hardcoded list of devices with a more robust way of determining the device which grub is installed to. We use grub-probe to fetch the underlying disk the /boot directory is located on, and attempt to match the disk with its /dev/disk/by-id value. If no such /dev/disk/by-id/ value exists, we fallback to the plain disk name. The changes are robust to unstable kernel device names and ordering, and use /dev/disk/by-id values to populate grub-pc/install_devices where possible. LP: #1877491
Add unit tests for fetch_idevs() and handle() as a part of the refactoring effort for cc_grub_dpkg. There are 11 tests which correspond to common environments which cloud-init can be expected to run. LP: #1877491
9a4c244
to
0fdad32
Compare
Hi @OddBloke, again thanks for the feedback. I have implemented the changes you have requested, please review this revision. Changelog:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks good now; thank you! I've suggested some minor changes inline: I'm going to ask another cloud-init dev to review those, then will land them into your branch (and fix CI, which will potentially fail), and then land the branch. (There should be no need for input from you, enjoy your PTO. 👍)
cloudinit/config/cc_grub_dpkg.py
Outdated
# are not exposed to the container. | ||
elif "failed to get canonical path" in e.stderr: | ||
log.debug("Device mapping between /proc/self/mountinfo does not " | ||
"match /dev, e.g. inside a container") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does not attempt to compare /proc/self/mountinfo and host info; so I would not include this debug message; there may be other reasons for this error.
Replace the hardcoded list of devices with a more robust way of determining the device which grub is installed to. We use grub-probe to fetch the underlying disk the /boot directory is located on, and attempt to match the disk with its /dev/disk/by-id value. If no such /dev/disk/by-id/ value exists, we fallback to the plain disk name. The changes are robust to unstable kernel device names and ordering, and use /dev/disk/by-id values to populate grub-pc/install_devices where possible. LP: #1877491
Add unit tests for fetch_idevs() and handle() as a part of the refactoring effort for cc_grub_dpkg. There are 11 tests which correspond to common environments which cloud-init can be expected to run. LP: #1877491
cloudinit/config/cc_grub_dpkg.py
Outdated
|
||
try: | ||
# check if disk exists and use udevadm to fetch symlinks | ||
if disk and os.path.exists(disk): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If grub-probe -t returns OK, but does not provide a disk value (which this check seems to think is possible), should we exit earlier and just move this check up after the try/except for grub-probe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If grub-probe
fails, we'd have disk = ""
still from line 57. So it at least protects us from that. (I'm not 100% sure I've understood the request, so if this reply doesn't make sense, please do LMK.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm suggesting that we do this after the grub-probe call and before we try udev call.
if not disk:
return []
This handles either scenario I was thinking of: 1) grub-probe fails, we have disks = "" 2) grub-probe returns 0, but produces no standard-output, so disk = ""
If we don't have a disk value, we might as well just return []
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aha, right, I see what you mean. Good point, on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, this looks good.
Please read https://bugs.launchpad.net/cloud-init/+bug/1877491
Replace the hardcoded list of devices with a more robust way of determining
the device which grub is installed to.
We use grub-probe to fetch the underlying disk the /boot directory is located on, and attempt to match the disk with its /dev/disk/by-id value. If no such /dev/disk/by-id/ value exists, we fallback to the plain disk name.
The changes are robust to unstable kernel device names and ordering, and use
/dev/disk/by-id values to populate grub-pc/install_devices where possible.
LP: #1877491