filesystem space checks for /boot/ #1648

Open
dustymabe opened this issue Jun 26, 2018 · 7 comments
@dustymabe
Contributor

I hit a case today where I ran out of disk space on my /boot/ partition. This was mainly because I had pinned some deployments that I wanted to keep around, but I still ended up with a failure:

[dustymabe@dhcp137-98 logs]$ rpm-ostree upgrade
==== AUTHENTICATING FOR org.projectatomic.rpmostree1.upgrade ====
Authentication is required to update software
Authenticating as: Dusty Mabe (dustymabe)
Password:
==== AUTHENTICATION COMPLETE ====
6 delta parts, 4 loose fetched; 112481 KiB transferred in 31 seconds
Checking out tree fbed0e2... done
Updating metadata for 'fedora': [=============] 100%
rpm-md repo 'fedora'; generated: 2018-04-25 04:27:32
Updating metadata for 'updates': [=============] 100%
rpm-md repo 'updates'; generated: 2018-06-25 10:46:00
Importing metadata [=============] 100%
Resolving dependencies... done
Will download: 7 packages (15.2 MB)
  Downloading from updates: [=============] 100%
Importing (7/7) [=============] 100%
Checking out packages (287/287) [=============] 100%
Running pre scripts... 0 done
Running post scripts... 7 done
Writing rpmdb... done
Writing OSTree commit... done
Copying /etc changes: 20 modified, 1 removed, 48 added
error: Installing kernel: regfile copy: No space left on device
[dustymabe@dhcp137-98 logs]$ sudo df -kh /boot/
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       283M  267M     0 100% /boot

Here is the status after the failure:

[dustymabe@dhcp137-98 logs]$ rpm-ostree status
State: idle; auto updates enabled (check; last run 15h ago)
Deployments:
● ostree://unifiedrepo:fedora/28/x86_64/atomic-host
                   Version: 28.20180613.0 (2018-06-13 13:52:10)
                BaseCommit: c51100f14cf12b25c16562cede7455191e536c0534e3b2ef87e66be9e12899ae
              GPGSignature: Valid signature by 128CF232A9371991C8A65695E08E7E629DB62FB1
           LayeredPackages: aria2 git git-annex mosh pciutils tig vim

  ostree://unifiedrepo:fedora/28/x86_64/updates/atomic-host
                   Version: 28.20180527.0 (2018-05-27 19:05:29)
                BaseCommit: 291ea90da29bc5abe757b5a50813b3de1396b08412939a89b3b671aba9856093
              GPGSignature: Valid signature by 128CF232A9371991C8A65695E08E7E629DB62FB1
           LayeredPackages: aria2 git git-annex mosh pciutils tig vim

  ostree://unifiedrepo:fedora/28/x86_64/atomic-host
                   Version: 28.20180515.1 (2018-05-15 16:32:35)
                BaseCommit: a29367c58417c28e2bd8306c1f438b934df79eba13706e078fe8564d9e0eb32b
              GPGSignature: Valid signature by 128CF232A9371991C8A65695E08E7E629DB62FB1
           LayeredPackages: aria2 git git-annex mosh pciutils tig vim weechat
                    Pinned: yes

  ostree://fedora-atomic-27:fedora/27/x86_64/atomic-host
                   Version: 27.122 (2018-04-18 23:34:24)
                BaseCommit: 931ebb3941fc49af706ac5a90ad3b5a493be4ae35e85721dabbfd966b1ecbf99
              GPGSignature: Valid signature by 860E19B0AFA800A1751881A6F55E7430F5282EE4
           LayeredPackages: aria2 git git-annex mosh pciutils tig vim weechat
                    Pinned: yes

Available update:
           Diff: 7 upgraded
[dustymabe@dhcp137-98 logs]$ rpm -q ostree rpm-ostree
ostree-2018.5-1.fc28.x86_64
rpm-ostree-2018.5-1.fc28.x86_64

I unpinned a deployment and ran rpm-ostree cleanup -r, so I'm unblocked. This is still something to consider, though.

@rfairley
Member

rfairley commented Aug 9, 2018

Any update on this? I can look into this if it'd be handy to have. I imagine there would be some way to query the disk space available, then give an error if it is less than the size of the packages to download. @cgwalters what do you think of the complexity of this?
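
For illustration, here is a minimal sketch of that kind of check, written in Python rather than in the project's C. The helper names `free_bytes` and `check_space` are hypothetical, and the sketch assumes a Unix system where `os.statvfs` is available. It only shows the general shape: query the free space, then fail early with ENOSPC if the required size does not fit.

```python
import errno
import os


def free_bytes(path: str) -> int:
    """Return the bytes available to unprivileged users on path's filesystem."""
    st = os.statvfs(path)
    # f_bavail excludes blocks reserved for root; f_bfree would include them.
    return st.f_bavail * st.f_frsize


def check_space(path: str, required_bytes: int) -> None:
    """Raise OSError(ENOSPC) up front if required_bytes will not fit."""
    available = free_bytes(path)
    if required_bytes > available:
        raise OSError(
            errno.ENOSPC,
            f"need {required_bytes} bytes on {path}, only {available} available",
        )
```

Calling `check_space("/boot", size)` before writing would turn a mid-copy failure into an early, descriptive one. Which size to compare against is a separate question that the sketch leaves open.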

@dustymabe
Contributor Author

> I can look into this if it'd be handy to have.

Thanks, Robert. Funnily enough, I actually hit this again today. @cgwalters, I know we just discussed that labeling the difficulty of tasks is arbitrary, but I figure I'll ask: do you think this is something @rfairley could pick up with some guidance?

@rfairley
Member

Had a look into reproducing this. When pinning deployments with different BaseCommits (i.e. after upgrading), I can see /boot go above 200MB used. I just need to figure out how to shrink the virtual machine's boot partition so that I can fill /boot and reproduce the issue (right now, when I spin up an F28AH Vagrant box, the partition where /boot is mounted defaults to a size of 1GB).

@rfairley
Member

rfairley commented Nov 27, 2018

Reproduced this after:

  1. vagrant init fedora/27-atomic-host && vagrant up && vagrant ssh
  2. follow instructions to shrink the boot partition to 128M
  3. # ostree admin pin 0
  4. try to upgrade to F28AH: # ostree remote add --set=gpgkeypath=/etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-28-primary fedora-atomic-28 https://kojipkgs.fedoraproject.org/atomic/repo/ && rpm-ostree rebase fedora-atomic-28:fedora/28/x86_64/atomic-host

Output is:

# rpm-ostree rebase fedora-atomic-28:fedora/28/x86_64/atomic-host

2956 metadata, 15046 content objects fetched; 426106 KiB transferred in 295 seconds              
Copying /etc changes: 21 modified, 0 removed, 57 added
error: Installing kernel: regfile copy: No space left on device

Next I'll look into adding a check. I think both cases hit the error in the postprocessing code, so I'll start from there:

rpm-ostree-2018.9/src/libpriv/rpmostree-postprocess.c:1422:        return glnx_throw_errno_prefix (error, "regfile copy");
rpm-ostree-2018.9/src/libpriv/rpmostree-postprocess.c:1779:        return glnx_throw_errno_prefix (error, "regfile copy");
libglnx/glnx-fdio.c:942:    return glnx_throw_errno_prefix (error, "regfile copy");

@cgwalters
Member

The rpm-ostree code isn't involved here, except when rpm-ostree initramfs --enable is used. This is a libostree issue.

Probably the best approach to fixing this would be to add the size of the kernel/initramfs as metadata on the commit object (along with the bootcsum).
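
A rough sketch of that idea, in Python rather than libostree's C: record the combined kernel/initramfs size next to a checksum when the commit is made, so a later deploy step can compare it with the free space on /boot without opening the files again. The key names `example.bootcsum` and `example.boot-size` are hypothetical, not real ostree metadata keys. Hashing the kernel and then the initramfs with SHA-256 is also an assumption made for illustration.

```python
import hashlib
from typing import Optional


def boot_metadata(kernel_path: str, initramfs_path: Optional[str] = None) -> dict:
    """Compute a checksum and total size over the boot files in one pass."""
    digest = hashlib.sha256()
    total = 0
    for path in (kernel_path, initramfs_path):
        if path is None:
            continue  # an initramfs is optional in this sketch
        with open(path, "rb") as f:
            # Read in 1 MiB chunks so large images are not loaded at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
                total += len(chunk)
    return {
        "example.bootcsum": digest.hexdigest(),  # hypothetical key name
        "example.boot-size": total,              # hypothetical key name
    }


def fits_in_boot(metadata: dict, available_bytes: int) -> bool:
    """Decide from metadata alone whether the boot files would fit."""
    return metadata["example.boot-size"] <= available_bytes
```

The point of storing the size is that the check becomes a plain integer comparison at deploy time.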

@rfairley
Member

rfairley commented Nov 27, 2018

Ah, thanks! Makes sense, now looking at src/libostree/ostree-sysroot-deploy.c. Adding the size to the metadata of the ostree commit (and checking the free space on /boot against the kernel/initramfs size in the metadata) sounds good.

Later I can see whether a check for rpm-ostree initramfs --enable can or needs to be added.

rfairley added a commit to rfairley/ostree that referenced this issue Mar 14, 2019
Check free space on the filesystem before copying the kernel
executable, initramfs, and devicetree into the boot partition.

Both the get_kernel_from_tree_usrlib_modules() and
get_kernel_from_tree_legacy_layouts() code paths query the size
of the kernel, initramfs, and devicetree. If hardlinking fails,
then sum up what still needs to be copied, and fail if it is
greater than the disk space available.

Fixes: ostreedev#1648
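
The logic described in that commit message can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the actual libostree code: the function name `install_boot_files` and its parameters are hypothetical, and the caller supplies the free space as `available_bytes`.

```python
import errno
import os
import shutil


def install_boot_files(sources, dest_dir, available_bytes):
    """Hardlink each source into dest_dir; copy the rest only if it all fits."""
    to_copy = []
    for src in sources:
        dest = os.path.join(dest_dir, os.path.basename(src))
        try:
            os.link(src, dest)  # a hardlink needs no new data blocks
        except OSError:
            # Any link failure falls back to a copy, for example EXDEV
            # when dest_dir is on a different filesystem.
            to_copy.append((src, dest))

    # Sum what still has to be copied and refuse before writing anything.
    needed = sum(os.path.getsize(src) for src, _ in to_copy)
    if needed > available_bytes:
        raise OSError(errno.ENOSPC, f"need {needed} bytes, {available_bytes} free")

    for src, dest in to_copy:
        shutil.copyfile(src, dest)
    return needed
```

Checking the total before the first copy avoids leaving a partially written kernel or initramfs behind when the partition is too small.
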
@dustymabe
Contributor Author

Hmm. I wonder if #2847 means we can close this, or at least modify the scope.
