Container bricked after updating only glib2 and glibc* from F33 to F34, followed by 'dnf distrosync' #669

Closed
ondrejholy opened this issue Jan 14, 2021 · 8 comments
Labels
1. Bug Something isn't working

Comments

@ondrejholy

Describe the bug
I am not able to enter my fedora-toolbox-33 container and I see:

$ toolbox enter
Error: failed to initialize container fedora-toolbox-33
$ toolbox enter
Error: invalid entry point PID of container fedora-toolbox-33
$ toolbox enter --log-level debug
DEBU Running as real user ID 1000                 
DEBU Resolved absolute path to the executable as /usr/bin/toolbox 
DEBU Running on a cgroups v2 host                 
DEBU Checking if /etc/subgid and /etc/subuid have entries for user oholy 
DEBU TOOLBOX_PATH is /usr/bin/toolbox             
DEBU Toolbox config directory is /var/home/oholy/.config/toolbox 
DEBU Current Podman version is 2.2.1              
DEBU Creating runtime directory /run/user/1000/toolbox 
DEBU Old Podman version is 2.2.1                  
DEBU Migration not needed: Podman version 2.2.1 is unchanged 
DEBU Resolving container and image names          
DEBU Container: ''                                
DEBU Image: ''                                    
DEBU Release: ''                                  
DEBU Resolved container and image names           
DEBU Container: 'fedora-toolbox-33'               
DEBU Image: 'fedora-toolbox:33'                   
DEBU Release: '33'                                
DEBU Checking if container fedora-toolbox-33 exists 
DEBU Inspecting mounts of container fedora-toolbox-33 
DEBU Requires org.freedesktop.Flatpak.SessionHelper 
DEBU Calling org.freedesktop.Flatpak.SessionHelper.RequestSession 
DEBU Starting container fedora-toolbox-33         
DEBU Inspecting entry point of container fedora-toolbox-33 
DEBU Entry point PID is a float64                 
DEBU Entry point of container fedora-toolbox-33 is toolbox (PID=0) 
Error: invalid entry point PID of container fedora-toolbox-33
$ podman start --attach fedora-toolbox-33
{"msg":"exec container process (missing dynamic library?) `/usr/bin/toolbox`: No such file or directory","level":"error","time":"2021-01-14T11:56:28.000710987Z"}

Output of toolbox --version (v0.0.90+)
toolbox version 0.0.98.1

Toolbox package info (rpm -q toolbox)
toolbox-0.0.98.1-1.fc33.x86_64

Output of podman version

Version:      2.2.1
API Version:  2.1.0
Go Version:   go1.15.5
Built:        Tue Dec  8 15:37:50 2020
OS/Arch:      linux/amd64

Podman package info (rpm -q podman)
podman-2.2.1-1.fc33.x86_64

Info about your OS
e.g., Fedora Silverblue 33

Additional context
This started happening after I ran dnf distrosync inside the container. Unfortunately, I do not recall which packages were updated.

@ondrejholy ondrejholy added the 1. Bug Something isn't working label Jan 14, 2021
@ondrejholy
Author

Btw I am pretty pissed off as the container contains my whole development/packaging environment, including custom repositories, kerberos configurations, certificates etc. I will probably have to learn how to back up containers, and/or create some automation to set up the containers. Or move back to Workstation... Any recommendations?

@ondrejholy
Author

I can't reproduce with dnf distrosync in another container, so I am not sure that it is related.

@debarshiray
Member

This looks pretty catastrophic:

$ podman start --attach fedora-toolbox-33
{"msg":"exec container process (missing dynamic library?) `/usr/bin/toolbox`: No such file or directory","level":"error","time":"2021-01-14T11:56:28.000710987Z"}

/usr/bin/toolbox is the binary inside the container that gets run on podman start. It seems to be failing to start because of some ABI problem: as a Go binary, it shouldn't link to anything other than the C library.

You can look at the contents of the container like this:

$ podman unshare /bin/bash
# root=$(podman mount fedora-toolbox-33)
# cd "$root"

Note that /usr/bin/toolbox gets bind mounted from the host into the container, so it won't be there when you look inside the container as above.
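
As a rough sketch of what to look for once the container is mounted, the function below reports which of a handful of files are missing under a given root. The name `missing_under_root` and the example path list are assumptions made for illustration (the paths are the usual x86_64 Fedora locations of the shell, the dynamic loader and the C library); none of this is provided by Toolbox or Podman.

```shell
# Print every given path that does not exist under ROOT.
# Usage: missing_under_root ROOT RELATIVE_PATH...
missing_under_root() {
    mur_root=$1
    shift
    for mur_rel in "$@"; do
        # -e follows symlinks, so a dangling link also counts as missing.
        # Caveat: an absolute symlink inside ROOT is resolved against the
        # host's filesystem, not against ROOT.
        if [ ! -e "$mur_root/$mur_rel" ]; then
            if [ -L "$mur_root/$mur_rel" ]; then
                printf '%s (dangling symlink)\n' "$mur_rel"
            else
                printf '%s\n' "$mur_rel"
            fi
        fi
    done
}

# Inside `podman unshare`, after root=$(podman mount fedora-toolbox-33):
# missing_under_root "$root" usr/bin/sh usr/bin/bash \
#     usr/lib64/ld-linux-x86-64.so.2 usr/lib64/libc.so.6
```

If the loader or the C library shows up in the output, that would be consistent with the "No such file or directory" error above, since the kernel reports a missing ELF interpreter that way.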

I am wondering if the dnf distrosync blew away some library from the container by mistake.

You can also back up the container as a tarball and upload it somewhere. I can then try to poke at it myself.
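
One possible way to produce such a tarball is `podman export`, which writes a container's filesystem to a tar archive and does not need the container to be running. This is only a sketch: the helper names `backup_name` and `backup_container` and the dated file name are made up for illustration, and the archive holds the files only, not Podman's configuration for the container.

```shell
# Build a dated tarball name for a container.
# Usage: backup_name CONTAINER [YYYYMMDD]   (the date defaults to today)
backup_name() {
    printf '%s-%s.tar\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Export the container's filesystem to that tarball in the current directory.
backup_container() {
    podman export --output "$(backup_name "$1")" "$1"
}

# backup_container fedora-toolbox-33
```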

@ondrejholy
Author

Finally, I recalled what I was doing and found a reproducer:

$ toolbox create test
$ toolbox enter test
$ sudo dnf distrosync
...
$ sudo dnf install fedora-repos-rawhide
...
$ sudo dnf update glib2 --nogpg --enabler=rawhide
Fedora - Rawhide - Developmental packages for the next Fedora release                                       5.4 MB/s |  73 MB     00:13    
Last metadata expiration check: 0:00:19 ago on Fri Jan 15 08:50:45 2021.
Dependencies resolved.
============================================================================================================================================
 Package                                   Architecture              Version                               Repository                  Size
============================================================================================================================================
Upgrading:
 glib2                                     x86_64                    2.67.1-2.fc34                         rawhide                    2.7 M
 glibc                                     x86_64                    2.32.9000-24.fc34                     rawhide                    3.5 M
 glibc-common                              x86_64                    2.32.9000-24.fc34                     rawhide                    2.1 M
 glibc-minimal-langpack                    x86_64                    2.32.9000-24.fc34                     rawhide                    108 k

Transaction Summary
============================================================================================================================================
Upgrade  4 Packages
...
$ sudo dnf distrosync
Last metadata expiration check: 0:04:06 ago on Fri Jan 15 08:48:04 2021.
Dependencies resolved.
================================================================================
 Package                     Arch        Version             Repository    Size
================================================================================
Downgrading:
 glib2                       x86_64      2.66.4-1.fc33       updates      2.7 M
 glibc                       x86_64      2.32-2.fc33         updates      3.5 M
 glibc-common                x86_64      2.32-2.fc33         updates      1.8 M
 glibc-minimal-langpack      x86_64      2.32-2.fc33         updates       81 k

Transaction Summary
================================================================================
Downgrade  4 Packages

Total download size: 8.1 M
Is this ok [y/N]: y
Downloading Packages:
(1/4): glibc-common-2.32-2.fc33.x86_64.rpm      2.4 MB/s | 1.8 MB     00:00    
(2/4): glibc-minimal-langpack-2.32-2.fc33.x86_6 1.7 MB/s |  81 kB     00:00    
(3/4): glib2-2.66.4-1.fc33.x86_64.rpm           2.4 MB/s | 2.7 MB     00:01    
(4/4): glibc-2.32-2.fc33.x86_64.rpm             2.6 MB/s | 3.5 MB     00:01    
--------------------------------------------------------------------------------
Total                                           4.2 MB/s | 8.1 MB     00:01     
Running transaction check
Transaction check succeeded.
Running transaction test
Transaction test succeeded.
Running transaction
  Preparing        :                                                        1/1 
  Downgrading      : glibc-common-2.32-2.fc33.x86_64                        1/8 
  Downgrading      : glibc-minimal-langpack-2.32-2.fc33.x86_64              2/8 
  Running scriptlet: glibc-2.32-2.fc33.x86_64                               3/8 
  Downgrading      : glibc-2.32-2.fc33.x86_64                               3/8 
  Running scriptlet: glibc-2.32-2.fc33.x86_64                               3/8 
  Downgrading      : glib2-2.66.4-1.fc33.x86_64                             4/8 
  Cleanup          : glib2-2.67.1-2.fc34.x86_64                             5/8 
  Cleanup          : glibc-2.32.9000-24.fc34.x86_64                         6/8 
  Cleanup          : glibc-minimal-langpack-2.32.9000-24.fc34.x86_64        7/8 
  Cleanup          : glibc-common-2.32.9000-24.fc34.x86_64                  8/8 
  Running scriptlet: glibc-common-2.32.9000-24.fc34.x86_64                  8/8 
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerin(man-db-2.9.2-6.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerpostun(glibc-common-2.32-2.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerpostun(man-db-2.9.2-6.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerpostun(glib2-2.66.4-1.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerpostun(glib2-2.66.4-1.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerin(glibc-common-2.32-2.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
error: failed to exec scriptlet interpreter /bin/sh: No such file or directory
warning: %transfiletriggerin(glib2-2.66.4-1.fc33.x86_64) scriptlet failed, exit status 127

Error in <unknown> scriptlet in rpm package glibc-common
  Verifying        : glib2-2.66.4-1.fc33.x86_64                             1/8 
  Verifying        : glib2-2.67.1-2.fc34.x86_64                             2/8 
  Verifying        : glibc-2.32-2.fc33.x86_64                               3/8 
  Verifying        : glibc-2.32.9000-24.fc34.x86_64                         4/8 
  Verifying        : glibc-common-2.32-2.fc33.x86_64                        5/8 
  Verifying        : glibc-common-2.32.9000-24.fc34.x86_64                  6/8 
  Verifying        : glibc-minimal-langpack-2.32-2.fc33.x86_64              7/8 
  Verifying        : glibc-minimal-langpack-2.32.9000-24.fc34.x86_64        8/8 

Downgraded:
  glib2-2.66.4-1.fc33.x86_64        glibc-2.32-2.fc33.x86_64                   
  glibc-common-2.32-2.fc33.x86_64   glibc-minimal-langpack-2.32-2.fc33.x86_64  

Complete!
bash: /usr/bin/sed: No such file or directory
bash: /usr/libexec/vte-urlencode-cwd: No such file or directory
$ exit
logout
$ toolbox enter test
Error: command /bin/bash not found in container test
Using /bin/bash instead.
exec: No such file or directory
$ podman stop test
2e05cc1cce6c334299bc44c99c40cbc808de0218f03f283efd362804bfeba7b2
$ toolbox enter test
Error: invalid entry point PID of container test

@debarshiray
Member

It seems that you updated only the glib2 and glibc-* RPMs to F34 and then went back to F33, which basically bricked your container. For example, the RPM transaction complains that /bin/sh is absent.
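
A quick way to spot this kind of mixed state before it bites might be to list installed packages whose release lacks the expected dist tag. The filter below is a sketch: `foreign_packages` is a hypothetical helper, and it assumes that the packages carry a tag such as `fc33` in their release, which is not true of every package.

```shell
# Read package names (as printed by `rpm -qa`) on stdin and print those whose
# release does not carry the expected dist tag, for example "fc33".
# Like any `grep -v`, it exits with status 1 when nothing is printed.
foreign_packages() {
    # gpg-pubkey entries are pseudo-packages without a dist tag, so skip them.
    grep -E -v -e '^gpg-pubkey-' -e "\.$1(\.|\$)"
}

# Inside the container:
# rpm -qa | foreign_packages fc33
```

After the `dnf update glib2` step in the reproducer, this would be expected to list the four fc34 packages from the transaction.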

This seems like an RPM packaging issue (I don't know right now if it's expected behaviour or not).

All I can say right now is that it would be good to take snapshots (with podman container commit) of the container before doing something out of the ordinary. Maybe we should look at making snapshots more usable for Toolbox.
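
A snapshot along those lines could look roughly like this. The helper names and the `-snapshot:DATE` naming scheme are invented for the example, and whether a committed image works cleanly as the base for a new toolbox container is an assumption that has not been verified here.

```shell
# Build an image name for a snapshot of CONTAINER.
# Usage: snapshot_image CONTAINER [YYYYMMDD]   (the date defaults to today)
snapshot_image() {
    printf '%s-snapshot:%s\n' "$1" "${2:-$(date +%Y%m%d)}"
}

# Commit the container's current state to a local image before a risky change.
snapshot_container() {
    podman container commit "$1" "$(snapshot_image "$1")"
}

# snapshot_container fedora-toolbox-33
# If the container breaks later, a replacement might be created from the image:
# toolbox create --image localhost/fedora-toolbox-33-snapshot:20210115 restored
```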

@ralfkaa

ralfkaa commented Jan 29, 2021

any updates on this getting fixed?

@debarshiray debarshiray changed the title from "Can't enter my fedora-toolbox-33 container" to "Container bricked after updating only glib2 and glibc* from F33 to F34, followed by 'dnf distrosync'" Feb 3, 2021
@debarshiray
Member

> any updates on this getting fixed?

I doubt that we can actually fix this in Toolbox, other than by somehow making snapshots easier to use, or by tracking down the underlying RPM packaging issue and possibly making the packages more resilient.

@debarshiray
Member

Closing.
