
podman build fails DNS lookups under systemd-resolved; podman run works #3806

Closed
kousu opened this issue Mar 5, 2022 · 14 comments · Fixed by #3986
kousu commented Mar 5, 2022

Description

podman build has broken DNS on a system that only uses systemd-resolved. But podman run works.

This seems to be the same issue as #2780, but triggered in a novel way.

Steps to reproduce the issue:

  1. Use iwd + systemd-resolved:

    There's probably a simpler way to do this, but the way I have in front of me involves iwd. If I figure out a simpler reproduction you can be sure I'll include it below :)

    Disable /etc/resolv.conf:

    echo | sudo tee /etc/resolv.conf
    

    Enable DHCP

    # /etc/iwd/main.conf
    [General]
    EnableNetworkConfiguration=true
    

    Connect to a WiFi network:

    systemctl enable --now iwd
    iwctl station wlan0 scan
    iwctl station wlan0 connect $SSID  # will ask for the WiFi password
    
  2. Confirm that systemd-resolve --status reports DNS servers.

  3. Confirm DNS is working outside of any container:

    $ ping -c 1 dl-cdn.alpinelinux.org
    
  4. Try to build any container that needs to connect to the network:

    # Containerfile
    FROM alpine
    
    RUN apk update
    
    podman build .
    

Describe the results you received:

$ systemd-resolve --status
Global
           Protocols: +LLMNR +mDNS -DNSOverTLS DNSSEC=no/unsupported
    resolv.conf mode: foreign
Fallback DNS Servers: 9.9.9.9#dns.quad9.net 8.8.8.8#dns.google
                      2620:fe::9#dns.quad9.net 2001:4860:4860::8888#dns.google

Link 2 (enp0s25)
Current Scopes: none
     Protocols: -DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported

Link 7 (wlan0)
    Current Scopes: DNS LLMNR/IPv4
         Protocols: +DefaultRoute +LLMNR -mDNS -DNSOverTLS DNSSEC=no/unsupported
Current DNS Server: 192.168.9.1
       DNS Servers: 192.168.9.1
        DNS Domain: lan
$ ping -c 1 dl-cdn.alpinelinux.org
PING dualstack.d.sni.global.fastly.net (151.101.138.133) 56(84) bytes of data.
64 bytes from 151.101.138.133 (151.101.138.133): icmp_seq=1 ttl=55 time=12.7 ms

--- dualstack.d.sni.global.fastly.net ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 12.693/12.693/12.693/0.000 ms
$ podman build .
STEP 1/2: FROM alpine
STEP 2/2: RUN apk update
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/main/x86_64/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.15/main: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.15/main: No such file or directory
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/community/x86_64/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.15/community: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.15/community: No such file or directory
2 errors; 14 distinct packages available
Error: error building at STEP "RUN apk update": error while running runtime: exit status 2

Compare with the identical container made with podman run:

$ podman run --rm -it alpine apk update
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/community/x86_64/APKINDEX.tar.gz
v3.15.0-323-g25bd02cc14 [https://dl-cdn.alpinelinux.org/alpine/v3.15/main]
v3.15.0-324-geacd65e2fb [https://dl-cdn.alpinelinux.org/alpine/v3.15/community]
OK: 15859 distinct packages available

Describe the results you expected:

podman run and podman build should have identical environments, and the build should succeed.

Output of rpm -q buildah or apt list buildah:

I don't seem to have buildah installed?

$ pacman -Qi buildah
error: package 'buildah' was not found

I skimmed an strace of the build, and it looks like buildah has been compiled into podman. Is that possible?

Output of buildah version:

$ buildah version
bash: buildah: command not found

Output of podman version if reporting a podman build issue:

$ podman version
Version:      3.4.4
API Version:  3.4.4
Go Version:   go1.17.4
Git Commit:   f6526ada1025c2e3f88745ba83b8b461ca659933
Built:        Thu Dec  9 13:30:40 2021
OS/Arch:      linux/amd64

Output of cat /etc/*release:

$ cat /etc/*release
NAME="Arch Linux"
PRETTY_NAME="Arch Linux"
ID=arch
BUILD_ID=rolling
ANSI_COLOR="38;2;23;147;209"
HOME_URL="https://archlinux.org/"
DOCUMENTATION_URL="https://wiki.archlinux.org/"
SUPPORT_URL="https://bbs.archlinux.org/"
BUG_REPORT_URL="https://bugs.archlinux.org/"
LOGO=archlinux-logo

Output of uname -a:

$ uname -a
Linux laptop 5.16.11-arch1-1 #1 SMP PREEMPT Thu, 24 Feb 2022 02:18:20 +0000 x86_64 GNU/Linux

Output of cat /etc/containers/storage.conf:

# This file is is the configuration file for all tools
# that use the containers/storage library.
# See man 5 containers-storage.conf for more information
# The "container storage" table contains all of the server options.
[storage]

# Default Storage Driver, Must be set for proper operation.
driver = "overlay"

# Temporary storage location
runroot = "/run/containers/storage"

# Primary Read/Write location of container storage
graphroot = "/var/lib/containers/storage"

# Storage path for rootless users
#
# rootless_storage_path = "$HOME/.local/share/containers/storage"

[storage.options]
# Storage options to be passed to underlying storage drivers

# AdditionalImageStores is used to pass paths to additional Read/Only image stores
# Must be comma separated list.
additionalimagestores = [
]

# Remap-UIDs/GIDs is the mapping from UIDs/GIDs as they should appear inside of
# a container, to the UIDs/GIDs as they should appear outside of the container,
# and the length of the range of UIDs/GIDs.  Additional mapped sets can be
# listed and will be heeded by libraries, but there are limits to the number of
# mappings which the kernel will allow when you later attempt to run a
# container.
#
# remap-uids = 0:1668442479:65536
# remap-gids = 0:1668442479:65536

# Remap-User/Group is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid or /etc/subgid file.  Mappings are set up starting
# with an in-container ID of 0 and then a host-level ID taken from the lowest
# range that matches the specified name, and using the length of that range.
# Additional ranges are then assigned, using the ranges which specify the
# lowest host-level IDs first, to the lowest not-yet-mapped in-container ID,
# until all of the entries have been used for maps.
#
# remap-user = "containers"
# remap-group = "containers"

# Root-auto-userns-user is a user name which can be used to look up one or more UID/GID
# ranges in the /etc/subuid and /etc/subgid file.  These ranges will be partitioned
# to containers configured to create automatically a user namespace.  Containers
# configured to automatically create a user namespace can still overlap with containers
# having an explicit mapping set.
# This setting is ignored when running as rootless.
# root-auto-userns-user = "storage"
#
# Auto-userns-min-size is the minimum size for a user namespace created automatically.
# auto-userns-min-size=1024
#
# Auto-userns-max-size is the minimum size for a user namespace created automatically.
# auto-userns-max-size=65536

[storage.options.overlay]
# ignore_chown_errors can be set to allow a non privileged user running with
# a single UID within a user namespace to run containers. The user can pull
# and use any image even those with multiple uids.  Note multiple UIDs will be
# squashed down to the default uid in the container.  These images will have no
# separation between the users in the container. Only supported for the overlay
# and vfs drivers.
#ignore_chown_errors = "false"

# Inodes is used to set a maximum inodes of the container image.
# inodes = ""

# Path to an helper program to use for mounting the file system instead of mounting it
# directly.
#mount_program = "/usr/bin/fuse-overlayfs"

# mountopt specifies comma separated list of extra mount options
mountopt = "nodev"

# Set to skip a PRIVATE bind mount on the storage home directory.
# skip_mount_home = "false"

# Size is used to set a maximum size of the container image.
# size = ""

# ForceMask specifies the permissions mask that is used for new files and
# directories.
#
# The values "shared" and "private" are accepted.
# Octal permission masks are also accepted.
#
#  "": No value specified.
#     All files/directories, get set with the permissions identified within the
#     image.
#  "private": it is equivalent to 0700.
#     All files/directories get set with 0700 permissions.  The owner has rwx
#     access to the files. No other users on the system can access the files.
#     This setting could be used with networked based homedirs.
#  "shared": it is equivalent to 0755.
#     The owner has rwx access to the files and everyone else can read, access
#     and execute them. This setting is useful for sharing containers storage
#     with other users.  For instance have a storage owned by root but shared
#     to rootless users as an additional store.
#     NOTE:  All files within the image are made readable and executable by any
#     user on the system. Even /etc/shadow within your image is now readable by
#     any user.
#
#   OCTAL: Users can experiment with other OCTAL Permissions.
#
#  Note: The force_mask Flag is an experimental feature, it could change in the
#  future.  When "force_mask" is set the original permission mask is stored in
#  the "user.containers.override_stat" xattr and the "mount_program" option must
#  be specified. Mount programs like "/usr/bin/fuse-overlayfs" present the
#  extended attribute permissions to processes within containers rather then the
#  "force_mask"  permissions.
#
# force_mask = ""

[storage.options.thinpool]
# Storage Options for thinpool

# autoextend_percent determines the amount by which pool needs to be
# grown. This is specified in terms of % of pool size. So a value of 20 means
# that when threshold is hit, pool will be grown by 20% of existing
# pool size.
# autoextend_percent = "20"

# autoextend_threshold determines the pool extension threshold in terms
# of percentage of pool size. For example, if threshold is 60, that means when
# pool is 60% full, threshold has been hit.
# autoextend_threshold = "80"

# basesize specifies the size to use when creating the base device, which
# limits the size of images and containers.
# basesize = "10G"

# blocksize specifies a custom blocksize to use for the thin pool.
# blocksize="64k"

# directlvm_device specifies a custom block storage device to use for the
# thin pool. Required if you setup devicemapper.
# directlvm_device = ""

# directlvm_device_force wipes device even if device already has a filesystem.
# directlvm_device_force = "True"

# fs specifies the filesystem type to use for the base device.
# fs="xfs"

# log_level sets the log level of devicemapper.
# 0: LogLevelSuppress 0 (Default)
# 2: LogLevelFatal
# 3: LogLevelErr
# 4: LogLevelWarn
# 5: LogLevelNotice
# 6: LogLevelInfo
# 7: LogLevelDebug
# log_level = "7"

# min_free_space specifies the min free space percent in a thin pool require for
# new device creation to succeed. Valid values are from 0% - 99%.
# Value 0% disables
# min_free_space = "10%"

# mkfsarg specifies extra mkfs arguments to be used when creating the base
# device.
# mkfsarg = ""

# metadata_size is used to set the `pvcreate --metadatasize` options when
# creating thin devices. Default is 128k
# metadata_size = ""

# Size is used to set a maximum size of the container image.
# size = ""

# use_deferred_removal marks devicemapper block device for deferred removal.
# If the thinpool is in use when the driver attempts to remove it, the driver
# tells the kernel to remove it as soon as possible. Note this does not free
# up the disk space, use deferred deletion to fully remove the thinpool.
# use_deferred_removal = "True"

# use_deferred_deletion marks thinpool device for deferred deletion.
# If the device is busy when the driver attempts to delete it, the driver
# will attempt to delete device every 30 seconds until successful.
# If the program using the driver exits, the driver will continue attempting
# to cleanup the next time the driver is used. Deferred deletion permanently
# deletes the device and all data stored in device will be lost.
# use_deferred_deletion = "True"

# xfs_nospace_max_retries specifies the maximum number of retries XFS should
# attempt to complete IO when ENOSPC (no space) error is returned by
# underlying storage device.
# xfs_nospace_max_retries = "0"
@kousu kousu changed the title podman-build network fails DNS lookups when using systemd-resolve; podman-run works podman-build network fails DNS lookups when using systemd-resolved; podman-run works Mar 5, 2022
kousu (Author) commented Mar 5, 2022

The difference is pretty simple: podman run generates an /etc/resolv.conf while podman build does not:

$ podman run --rm alpine sh -c 'echo "/etc/resolv.conf:"; echo --------------; cat /etc/resolv.conf; echo --------------; apk update'
/etc/resolv.conf:
--------------
nameserver 10.0.2.3
nameserver 8.8.8.8
nameserver 8.8.4.4
--------------
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/community/x86_64/APKINDEX.tar.gz
v3.15.0-323-g25bd02cc14 [https://dl-cdn.alpinelinux.org/alpine/v3.15/main]
v3.15.0-324-geacd65e2fb [https://dl-cdn.alpinelinux.org/alpine/v3.15/community]
OK: 15859 distinct packages available
$ cat Containerfile 
FROM alpine

RUN echo "/etc/resolv.conf:"; echo --------------; cat /etc/resolv.conf; echo --------------; apk update
$ 
$ podman build .
STEP 1/2: FROM alpine
STEP 2/2: RUN echo "/etc/resolv.conf:"; echo --------------; cat /etc/resolv.conf; echo --------------; apk update
/etc/resolv.conf:
--------------
--------------
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/main/x86_64/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.15/main: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.15/main: No such file or directory
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/community/x86_64/APKINDEX.tar.gz
ERROR: https://dl-cdn.alpinelinux.org/alpine/v3.15/community: temporary error (try again later)
WARNING: Ignoring https://dl-cdn.alpinelinux.org/alpine/v3.15/community: No such file or directory
2 errors; 14 distinct packages available
Error: error building at STEP "RUN echo "/etc/resolv.conf:"; echo --------------; cat /etc/resolv.conf; echo --------------; apk update": error while running runtime: exit status 2

Notice that podman run is not respecting systemd-resolved either: it's actually just falling back to Google's public DNS:

https://github.com/containers/podman/blob/f4d6e8777213880204ccbce92201c47c74b33036/pkg/resolvconf/resolvconf.go#L23-L25

This means the easiest workaround is to pass --dns:

$ podman build --dns=8.8.8.8 .
STEP 1/2: FROM alpine
STEP 2/2: RUN echo "/etc/resolv.conf:"; echo --------------; cat /etc/resolv.conf; echo --------------; apk update
/etc/resolv.conf:
--------------
nameserver 8.8.8.8
--------------
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.15/community/x86_64/APKINDEX.tar.gz
v3.15.0-323-g25bd02cc14 [https://dl-cdn.alpinelinux.org/alpine/v3.15/main]
v3.15.0-325-gc46c0ad247 [https://dl-cdn.alpinelinux.org/alpine/v3.15/community]
OK: 15859 distinct packages available
COMMIT
--> 64c0273e0c1
64c0273e0c1ba635d6825c06a521f8aaa2220f0643003e747caa61755d44d37a
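Roughly what podman run's fallback amounts to, sketched in plain shell (an illustration only, not podman's actual implementation — the real logic is the Go code linked above; the `filter_resolv` name is made up here): copy the host's `nameserver` entries, drop loopback addresses (which are unreachable from the container's network namespace), and fall back to public DNS when nothing usable remains.

```shell
# Hypothetical sketch of podman run's resolv.conf fallback behaviour.
# filter_resolv is an illustrative name, not a real podman helper.
filter_resolv() {
    # Keep only nameserver lines that are not loopback addresses;
    # loopback resolvers (e.g. systemd-resolved's 127.0.0.53 stub)
    # are unreachable from inside the container's network namespace.
    usable=$(grep -E '^nameserver' "$1" | grep -vE '127\.|::1' || true)
    if [ -z "$usable" ]; then
        # Nothing usable survived: fall back to public DNS,
        # mirroring the hardcoded defaults in resolvconf.go.
        printf 'nameserver 8.8.8.8\nnameserver 8.8.4.4\n'
    else
        printf '%s\n' "$usable"
    fi
}
```

With an empty or loopback-only host resolv.conf this emits the Google fallback — which is exactly the `10.0.2.3`/`8.8.8.8`/`8.8.4.4` pattern seen in the podman run output above — while podman build simply copies nothing and leaves the container with no resolver at all.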

kousu (Author) commented Mar 5, 2022

After thinking about it, I think buildah is doing the better thing, and podman run shouldn't be covering up system config problems.

The problem here is my empty /etc/resolv.conf. Other parts of my system are broken by it too:

$ dig github.com
;; communications error to ::1#53: connection refused

I guess systemd-resolved really expects to be configured as a static nameserver in /etc/resolv.conf:

$ echo 'nameserver 127.0.0.53' | sudo tee /etc/resolv.conf

With that in place, DNS works the same in all containers.

So my issue here is really that podman build and podman run have different behaviour.

kousu (Author) commented Mar 5, 2022

This comment is very enlightening to me! https://github.com/containers/buildah/pull/3424/files#r684621757

@kousu kousu changed the title podman-build network fails DNS lookups when using systemd-resolved; podman-run works podman build fails DNS lookups under systemd-resolved; podman run works Mar 5, 2022
rhatdan (Member) commented Mar 7, 2022

@Luap99 PTAL

Luap99 (Member) commented Mar 7, 2022

I think buildah should match podman; ideally we would use the same code for both projects, so it should be moved into c/common.

kousu (Author) commented Mar 7, 2022

I think buildah should match podman; ideally we would use the same code for both projects, so it should be moved into c/common.

That would be great!

github-actions bot commented Apr 7, 2022

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented Apr 7, 2022

@Luap99 is this part of the same /etc/hosts effort?

Luap99 (Member) commented Apr 7, 2022

No that would be different work but I will try to get to it after my hosts work.

github-actions bot commented May 8, 2022

A friendly reminder that this issue had no activity for 30 days.

rhatdan (Member) commented May 9, 2022

@Luap99 did you get a chance to look at this?

Luap99 (Member) commented May 9, 2022

working on it right now

Luap99 (Member) commented May 9, 2022

/assign

kousu (Author) commented May 10, 2022

Thank you @Luap99 for following up :)

Luap99 added a commit to Luap99/buildah that referenced this issue May 12, 2022
Podman and Buildah should use the same code to generate the resolv.conf
file. This mostly moves the podman code into c/common and creates a
better API for it so that buildah can use it as well.

[NO NEW TESTS NEEDED] All existing tests should continue to pass.

Fixes containers#3806

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
Luap99 added a commit to Luap99/buildah that referenced this issue May 12, 2022
Luap99 added a commit to Luap99/buildah that referenced this issue May 25, 2022
Luap99 added a commit to Luap99/buildah that referenced this issue Jun 7, 2022
Luap99 added a commit to Luap99/buildah that referenced this issue Jun 8, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 31, 2023