Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v4 ARM64 image broken #147

Closed
kroese opened this issue Oct 27, 2023 · 21 comments
Closed

v4 ARM64 image broken #147

kroese opened this issue Oct 27, 2023 · 21 comments

Comments

@kroese
Copy link
Contributor

kroese commented Oct 27, 2023

Quits with:

fatal: chdir(/usr/libexec/postfix): No such file or directory

@bokysan
Copy link
Owner

bokysan commented Oct 27, 2023

That was fast. It's not even out of the oven yet. I'll check it out.

@bokysan
Copy link
Owner

bokysan commented Oct 27, 2023

I cannot reproduce this issue.

docker run --platform linux/arm64 --pull always -e ALLOW_EMPTY_SENDER_DOMAINS=yes --rm -it boky/postfix:v4.0.0
v4.0.0: Pulling from boky/postfix
Digest: sha256:2a6d595d17c60f62571b7d25bd3b1e82f609eaa0eb6b625a80c53520487122eb
Status: Downloaded newer image for boky/postfix:v4.0.0
★★★★★ POSTFIX STARTING UP ★★★★★
‣ INFO  Not setting any timezone for the container
‣ INFO  Using plain log format for rsyslog.
‣ NOTE  Emails in the logs will not be anonymized. Set ANONYMIZE_EMAILS to enable this feature.
‣ DEBUG Reowning root: /var/spool/postfix/
‣ DEBUG Reowning root: /var/spool/postfix/pid/
‣ DEBUG Reowning postfix:postdrop /var/spool/postfix/private/
‣ DEBUG Reowing postfix:postdrop /var/spool/postfix/public/
‣ INFO  Preparing files for Postfix chroot:
        '/var/spool/postfix/usr/lib/zoneinfo/localtime' -> '/etc/localtime'
        '/etc/localtime' -> '/var/spool/postfix/etc/localtime'
        '/etc/nsswitch.conf' -> '/var/spool/postfix/etc/nsswitch.conf'
        '/etc/resolv.conf' -> '/var/spool/postfix/etc/resolv.conf'
        '/etc/services' -> '/var/spool/postfix/etc/services'
        '/etc/host.conf' -> '/var/spool/postfix/etc/host.conf'
        '/etc/hosts' -> '/var/spool/postfix/etc/hosts'
        '/etc/passwd' -> '/var/spool/postfix/etc/passwd'
‣ WARN  Detected old hash: and btree: references in the config file, which are not supported anymore. Upgrading to lmdb:
‣ DEBUG Creating new postalias for lmdb:/etc/aliases.
‣ DEBUG Creating new postalias for lmdb:/etc/aliases.
‣ INFO  Using unlimited message size.
‣ INFO  Setting smtp_tls_security_level: may
‣ NOTE  Will try to deliver emails directly to the final server. Make sure your DNS is setup properly!
/scripts/common-run.sh: line 341: pluginviewer: command not found
‣ INFO  Using default private network list for trusted networks.
‣ INFO  Debugging is disabled.
‣ DEBUG DKIM_AUTOGENERATE not set -- you will need to provide your own keys.
‣ INFO  No DKIM keys found, will not use DKIM.
‣ INFO  Applying custom postfix setting: message_size_limit=0
‣ INFO  Applying custom postfix setting: myhostname=8a3a412be30b
‣ INFO  Applying custom postfix setting: mynetworks=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
‣ INFO  Applying custom postfix setting: smtp_tls_security_level=may

@kroese
Copy link
Contributor Author

kroese commented Oct 27, 2023

I can reproduce this on the latest tag but not on the v4.0.0 tag. So try switching to latest

NEVERMIND: I also get this on the v4.0.0 tag:

2023-10-27T18:16:06.576840+02:00 CRIT    postfix[95]: fatal: chdir(/usr/libexec/postfix): No such file or directory
2023-10-27 18:16:07,577 WARN exited: postfix (exit status 1; not expected)
2023-10-27 18:16:08,579 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2023-10-27 18:16:11,582 INFO success: opendkim entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2023-10-27T18:17:01.677718+02:00 ERR     CRON[121]: pam_env(cron:session): Unable to open env file: /etc/default/locale: No such file or directory
2023-10-27T18:17:01.677959+02:00 INFO    CRON[121]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
2023-10-27T18:17:01.679738+02:00 INFO    CRON[122]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
2023-10-27T18:17:01.683190+02:00 INFO    CRON[121]: pam_unix(cron:session): session closed for user root

@bokysan
Copy link
Owner

bokysan commented Oct 27, 2023

Works on my machine using the following command:

docker run --platform linux/arm64 --pull always -e ALLOW_EMPTY_SENDER_DOMAINS=yes --rm -it boky/postfix:latest
latest: Pulling from boky/postfix
Digest: sha256:2a6d595d17c60f62571b7d25bd3b1e82f609eaa0eb6b625a80c53520487122eb
Status: Downloaded newer image for boky/postfix:latest
....

How are you running it, what's your uname -a?

@kroese
Copy link
Contributor Author

kroese commented Oct 27, 2023

Uname is:

Linux 6.5.0-1011-oracle #11-Ubuntu SMP Fri Oct 20 20:17:12 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux

It's Ampere ARM64 in Oracle Cloud.

When I run:

docker run --platform linux/arm64 --pull always -e ALLOW_EMPTY_SENDER_DOMAINS=yes --rm -it boky/postfix:latest

on that machine, I dont see the error.

So it must have something to do with a setting in the compose-file?

@bokysan
Copy link
Owner

bokysan commented Oct 27, 2023

I'll try running this directly on another ARM machine. I'm running this natively on ARM mac and it works.

@kroese
Copy link
Contributor Author

kroese commented Oct 27, 2023

With your command it works.. but with this compose:

postfix:
    hostname: postfix  
    container_name: postfix
    image: "boky/postfix:v4.0.0"
    environment:
      - "HOSTNAME=domain.tld"   
      - "TZ=Europe/Amsterdam"
      - "MYNETWORKS=0.0.0.0/0,[::]/0"
      - "ALLOW_EMPTY_SENDER_DOMAINS=true"
      - "RELAYHOST=xxx:587"
      - "RELAYHOST_USERNAME=xxxx"
      - "RELAYHOST_PASSWORD=xxxx"
    volumes:
      - "/mnt/data/postfix/etc:/etc/postfix"
      - "/mnt/data/postfix/spool:/var/spool/postfix"
      - "/mnt/data/postfix/keys:/etc/opendkim/keys"
    ports:
      - 25:25
    networks:
      - web
    restart: always
    stop_grace_period: 2m

It does not.. So maybe setting the RELAYHOST variable is causing it?

@bokysan
Copy link
Owner

bokysan commented Oct 27, 2023

That shouldn't be an issue. It works for me, though, if I try to run it directly from the command line with your variables:

docker run --platform linux/arm64 --pull always -e HOSTNAME=domain.tld -e TZ=Europe/Amsterdam -e 'MYNETWORKS=0.0.0.0/0,[::]/0' -e ALLOW_EMPTY_SENDER_DOMAINS=yes -e RELAYHOST=xxx:587 -e RELAYHOST_USERNAME=xxxx -e RELAYHOST_PASSWORD=xxxx --rm -it boky/postfix:latest
latest: Pulling from boky/postfix
Digest: sha256:2a6d595d17c60f62571b7d25bd3b1e82f609eaa0eb6b625a80c53520487122eb
Status: Image is up to date for boky/postfix:latest
★★★★★ POSTFIX STARTING UP ★★★★★
‣ NOTE  Setting container timezone to: Europe/Amsterdam
‣ INFO  Using plain log format for rsyslog.
‣ NOTE  Emails in the logs will not be anonymized. Set ANONYMIZE_EMAILS to enable this feature.
‣ DEBUG Reowning root: /var/spool/postfix/
‣ DEBUG Reowning root: /var/spool/postfix/pid/
‣ DEBUG Reowning postfix:postdrop /var/spool/postfix/private/
‣ DEBUG Reowing postfix:postdrop /var/spool/postfix/public/
‣ INFO  Preparing files for Postfix chroot:
        '/var/spool/postfix/usr/lib/zoneinfo/localtime' -> '/etc/localtime'
        '/etc/localtime' -> '/var/spool/postfix/etc/localtime'
        '/etc/nsswitch.conf' -> '/var/spool/postfix/etc/nsswitch.conf'
        '/etc/resolv.conf' -> '/var/spool/postfix/etc/resolv.conf'
        '/etc/services' -> '/var/spool/postfix/etc/services'
        '/etc/host.conf' -> '/var/spool/postfix/etc/host.conf'
        '/etc/hosts' -> '/var/spool/postfix/etc/hosts'
        '/etc/passwd' -> '/var/spool/postfix/etc/passwd'
‣ WARN  Detected old hash: and btree: references in the config file, which are not supported anymore. Upgrading to lmdb:
‣ DEBUG Creating new postalias for lmdb:/etc/aliases.
‣ DEBUG Creating new postalias for lmdb:/etc/aliases.
‣ INFO  Using unlimited message size.
‣ INFO  Setting smtp_tls_security_level: may
‣ NOTE  Forwarding all emails to xxx:587 using username xxxx and password (redacted).
/scripts/common-run.sh: line 341: pluginviewer: command not found
‣ DEPRECATED! MYNETWORKS variable is deprecated. Please use POSTFIX_mynetworks instead.
‣ NOTE  Using custom allowed networks: 0.0.0.0/0,[::]/0
‣ INFO  Debugging is disabled.
‣ DEBUG DKIM_AUTOGENERATE not set -- you will need to provide your own keys.
‣ INFO  No DKIM keys found, will not use DKIM.
‣ INFO  Applying custom postfix setting: message_size_limit=0
‣ INFO  Applying custom postfix setting: myhostname=domain.tld
‣ INFO  Applying custom postfix setting: mynetworks=0.0.0.0/0,[::]/0
‣ INFO  Applying custom postfix setting: smtp_tls_security_level=may
‣ NOTE  Starting: rsyslog, postfix
2023-10-27 18:39:24,656 INFO Set uid to user 0 succeeded
2023-10-27 18:39:24,657 INFO supervisord started with pid 7
2023-10-27 18:39:25,661 INFO spawned: 'cron' with pid 475
2023-10-27 18:39:25,663 INFO spawned: 'opendkim' with pid 476
2023-10-27 18:39:25,664 INFO spawned: 'postfix' with pid 477
2023-10-27 18:39:25,665 INFO spawned: 'rsyslog' with pid 478
rsyslogd: could not load module 'omstdout.so', errors: trying to load module /usr/lib/aarch64-linux-gnu/rsyslog/omstdout.so: /usr/lib/aarch64-linux-gnu/rsyslog/omstdout.so: cannot open shared object file: No such file or directory [v8.2302.0 try https://www.rsyslog.com/e/2066 ]
2023-10-27 18:39:25,668 INFO success: cron entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-10-27 18:39:25,668 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-10-27T18:39:25.669193+02:00 ERR     rsyslogd: could not load module 'omstdout.so', errors: trying to load module /usr/lib/aarch64-linux-gnu/rsyslog/omstdout.so: /usr/lib/aarch64-linux-gnu/rsyslog/omstdout.so: cannot open shared object file: No such file or directory [v8.2302.0 try https://www.rsyslog.com/e/2066 ]
2023-10-27T18:39:25.669205+02:00 INFO    rsyslogd: [origin software="rsyslogd" swVersion="8.2302.0" x-pid="478" x-info="https://www.rsyslog.com"] start
2023-10-27T18:39:26.733362+02:00 INFO    postfix/postfix-script[566]: starting the Postfix mail system
2023-10-27T18:39:26.737175+02:00 INFO    postfix/master[567]: daemon started -- version 3.7.6, configuration /etc/postfix
2023-10-27 18:39:27,739 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2023-10-27 18:39:30,747 INFO success: opendkim entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)

@undefinedid
Copy link

Same problem with amd64 build and latest tag image: boky/postfix:latest. Rollback to image: boky/postfix:v3.6.1 makes the stack back to healthy state. I got the same trace than those above with latest version :

2023-10-27 23:44:05,997 INFO Set uid to user 0 succeeded
2023-10-27 23:44:05,999 INFO supervisord started with pid 7
2023-10-27 23:44:07,002 INFO spawned: 'cron' with pid 126
2023-10-27 23:44:07,003 INFO spawned: 'opendkim' with pid 127
2023-10-27 23:44:07,005 INFO spawned: 'postfix' with pid 128
2023-10-27 23:44:07,007 INFO spawned: 'rsyslog' with pid 129
2023-10-27T23:44:07.017594+02:00 INFO    rsyslogd: [origin software="rsyslogd" swVersion="8.2302.0" x-pid="129" x-info="https://www.rsyslog.com"] start
2023-10-27T23:44:07.017876+02:00 CRIT    postfix[128]: fatal: chdir(/usr/libexec/postfix): No such file or directory
2023-10-27 23:44:07,018 INFO success: cron entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-10-27 23:44:07,018 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-10-27 23:44:08,018 WARN exited: postfix (exit status 1; not expected)
2023-10-27 23:44:09,020 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2023-10-27 23:44:12,024 INFO success: opendkim entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)

@undefinedid
Copy link

Could this be because of the use of debian-slim image rather than debian mentioned in #149 ?

@kroese
Copy link
Contributor Author

kroese commented Oct 27, 2023

@undefinedid No... I experienced and reported this problem already before the move to debian-slim occured, so it is not related.

But a quick solution for you would be to use the latest-alpine tag instead of latest on arm64. That one works fine and is smaller too. The switch to Debian was made to support more exotic architectures, but has no advantage if you are running arm64.

@bokysan
Copy link
Owner

bokysan commented Oct 27, 2023

#149 was about changing debian to debian-slim, so no, it does not have anything to do with it.

But I think it does with switch from alpine to debian.

/usr/libexec/postfix does sound like something Alpine would have.

I don't think that it's an issue with the image, but with existing (Alpine-based) configuration.

I see above that @kroese has /etc/postfix bound to a folder on disk (although this is not recommended), I'm guessing you have the same.

Can you @undefinedid and @kroese check or maybe even share your configs, so we find in which setting folder appears?

@kroese
Copy link
Contributor Author

kroese commented Oct 27, 2023

@bokysan That must be it!

In main.cf it says:

daemon_directory = /usr/libexec/postfix

That is the same directory it complains about when starting.

Is it safe to just remove all the files inside the /etc/postfix directory? I never made any changes to them.

bokysan added a commit that referenced this issue Oct 29, 2023
This will change the `deamon_directory` if it detects a distribution
change. Some distributions will by default set the `deamon_directory` in
`main.cf` and they differentiate between distributions. Hence, updating
to another distribution causes Postfix to fail starting.
@undefinedid
Copy link

Can you @undefinedid and @kroese check or maybe even share your configs, so we find in which setting folder appears?

I can confirm that switching to image: boky/postfix:latest-alpine (build amd64) solved the problem! I don't have arm64 hardware easily accessible actually to test that build.

As @kroese said, /etc/postfix/main.cf include the path for the daemon: /usr/libexec/postfix.

It seems that this is the default Postfix config file, created at first launch of the container, in which custom settings are written at the very end of the file via the container environment variables.

Do you need a specific config or Compose file to help fix this issue?

@bokysan
Copy link
Owner

bokysan commented Oct 29, 2023

Can you check the latest edge, plese? It includes a fix for this issue, but I haven't been able to test it properly just yet.

If it does work, I'll release v4.0.1.

@undefinedid
Copy link

I'm note sure but it seems image: boky/postfix:edge and build amd64 is still bugged. Log is not the same but there is a trace:

2023-10-29T12:30:03.527122392Z ‣ ERROR Your daemon_directory is set to  but it does not exist. Postfix startup will most likely fail.

My /etc/postfix/main.cf looks like the other times:

daemon_directory = /usr/libexec/postfix

Sending mail still works and stack is seen "healthy". Here is the full log:

2023-10-29T12:30:00.554380166Z ★★★★★ POSTFIX STARTING UP ★★★★★
2023-10-29T12:30:00.554462960Z NOTE  Setting container timezone to: Europe/Paris
2023-10-29T12:30:00.555512890Z INFO  Using plain log format for rsyslog.
2023-10-29T12:30:00.557526367Z NOTE  Emails in the logs will not be anonymized. Set ANONYMIZE_EMAILS to enable this feature.
2023-10-29T12:30:00.572650490Z DEBUG Reowning root: /var/spool/postfix/
2023-10-29T12:30:00.573611571Z DEBUG Reowning root: /var/spool/postfix/pid/
2023-10-29T12:30:00.573671422Z DEBUG Reowning postfix:postdrop /var/spool/postfix/private/
2023-10-29T12:30:00.574477278Z DEBUG Reowing postfix:postdrop /var/spool/postfix/public/
2023-10-29T12:30:03.520628003Z INFO  Preparing files for Postfix chroot:
2023-10-29T12:30:03.521321082Z ln: /var/spool/postfix/usr/lib/zoneinfo: No such file or directory
2023-10-29T12:30:03.521892183Z         '/var/spool/postfix/usr/lib/zoneinfo' -> '/etc/localtime'
2023-10-29T12:30:03.524511705Z         '/etc/localtime' -> '/var/spool/postfix/etc'
2023-10-29T12:30:03.524519889Z         '/etc/nsswitch.conf' -> '/var/spool/postfix/etc'
2023-10-29T12:30:03.524533302Z         '/etc/resolv.conf' -> '/var/spool/postfix/etc'
2023-10-29T12:30:03.524535204Z         '/etc/services' -> '/var/spool/postfix/etc'
2023-10-29T12:30:03.524537043Z         '/etc/hosts' -> '/var/spool/postfix/etc'
2023-10-29T12:30:03.524538832Z         '/etc/passwd' -> '/var/spool/postfix/etc'
2023-10-29T12:30:03.525883465Z DEBUG No upgrade of hashes needed needed.
2023-10-29T12:30:03.527122392Z ERROR Your daemon_directory is set to  but it does not exist. Postfix startup will most likely fail.
2023-10-29T12:30:05.354983723Z INFO  Using unlimited message size.
2023-10-29T12:30:05.425591145Z NOTE  Forwarding all emails to [ssl0.ovh.net]:587 without any authentication. Make sure your server is configured to accept emails coming from this IP.
2023-10-29T12:30:05.435650554Z NOTE  Using custom allowed networks: 172.26.0.0/24 172.24.0.0/24
2023-10-29T12:30:05.435692149Z INFO  Debugging is disabled.
2023-10-29T12:30:05.438378425Z NOTE  Setting up smtp_header_checks to regexp:/etc/postfix/smtp_header_checks
2023-10-29T12:30:05.443837614Z DEBUG DKIM_AUTOGENERATE not set -- you will need to provide your own keys.
2023-10-29T12:30:05.444441314Z INFO  No DKIM keys found, will not use DKIM.
2023-10-29T12:30:05.449002632Z INFO  Applying custom postfix setting: append_dot_mydomain=no
2023-10-29T12:30:05.454095524Z INFO  Applying custom postfix setting: biff=no
2023-10-29T12:30:05.476524146Z INFO  Applying custom postfix setting: inet_interfaces=all
2023-10-29T12:30:05.481577071Z INFO  Applying custom postfix setting: inet_protocols=ipv4
2023-10-29T12:30:05.486718278Z INFO  Applying custom postfix setting: message_size_limit=0
2023-10-29T12:30:05.492063528Z INFO  Applying custom postfix setting: myhostname=relay-ovh
2023-10-29T12:30:05.497061990Z INFO  Applying custom postfix setting: mynetworks=172.26.0.0/24 172.24.0.0/24
2023-10-29T12:30:05.502192346Z INFO  Applying custom postfix setting: readme_directory=no
2023-10-29T12:30:05.507433874Z INFO  Applying custom postfix setting: relayhost=[ssl0.ovh.net]:587
2023-10-29T12:30:05.516787073Z INFO  Applying custom postfix setting: smtp_sasl_auth_enable=yes
2023-10-29T12:30:05.522378551Z INFO  Applying custom postfix setting: smtp_sasl_password_maps=lmdb:/etc/postfix/sasl_passwd
2023-10-29T12:30:05.528085233Z INFO  Applying custom postfix setting: smtp_sasl_security_options=noanonymous
2023-10-29T12:30:05.533400397Z INFO  Applying custom postfix setting: smtp_tls_security_level=encrypt
2023-10-29T12:30:05.538695305Z INFO  Applying custom postfix setting: smtp_use_tls=yes
2023-10-29T12:30:05.545921015Z INFO  Applying custom postfix setting: strict_rfc821_envelopes=no
2023-10-29T12:30:05.552015371Z NOTE  Starting: rsyslog, postfix
2023-10-29T12:30:05.687559019Z 2023-10-29 13:30:05,687 INFO Set uid to user 0 succeeded
2023-10-29T12:30:05.689295337Z 2023-10-29 13:30:05,689 INFO supervisord started with pid 1
2023-10-29T12:30:06.691606686Z 2023-10-29 13:30:06,691 INFO spawned: 'cron' with pid 349
2023-10-29T12:30:06.692978683Z 2023-10-29 13:30:06,692 INFO spawned: 'opendkim' with pid 350
2023-10-29T12:30:06.694055043Z 2023-10-29 13:30:06,693 INFO spawned: 'postfix' with pid 351
2023-10-29T12:30:06.695247365Z 2023-10-29 13:30:06,695 INFO spawned: 'rsyslog' with pid 352
2023-10-29T12:30:06.698664374Z 2023-10-29T13:30:06.698404+01:00 INFO    : [origin software="rsyslogd" swVersion="8.2306.0" x-pid="352" x-info="https://www.rsyslog.com"] start
2023-10-29T12:30:06.698713127Z 2023-10-29 13:30:06,698 INFO success: cron entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-10-29T12:30:06.698779103Z 2023-10-29 13:30:06,698 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-10-29T12:30:07.720918391Z 2023-10-29T13:30:07.720665+01:00 WARNING postfix/postfix-script[411]: warning: group or other writable: /etc/postfix/./makedefs.out
2023-10-29T12:30:07.726051524Z 2023-10-29T13:30:07.725877+01:00 INFO    postfix/postfix-script[422]: starting the Postfix mail system
2023-10-29T12:30:07.728977270Z 2023-10-29T13:30:07.728813+01:00 INFO    postfix/master[423]: daemon started -- version 3.8.2, configuration /etc/postfix
2023-10-29T12:30:08.730147238Z 2023-10-29 13:30:08,730 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2023-10-29T12:30:11.733558794Z 2023-10-29 13:30:11,733 INFO success: opendkim entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2023-10-29T12:39:41.482155557Z 2023-10-29T13:39:41.481881+01:00 INFO    postfix/smtpd[599]: connect from unknown[172.26.0.254]
2023-10-29T12:39:41.488156142Z 2023-10-29T13:39:41.487865+01:00 INFO    postfix/smtpd[599]: 77160360E1B: client=unknown[172.26.0.254]
2023-10-29T12:39:41.532662512Z 2023-10-29T13:39:41.532378+01:00 INFO    postfix/cleanup[602]: 77160360E1B: message-id=<xxxxxxxxxxxxxxxxxxxxxxxxxxxx@xxxxxxx>
2023-10-29T12:39:41.534912801Z 2023-10-29T13:39:41.534743+01:00 INFO    postfix/qmgr[425]: 77160360E1B: from=<xxxxxxx@xxxxxxx.xxx>, size=500, nrcpt=1 (queue active)
2023-10-29T12:39:41.535327717Z 2023-10-29T13:39:41.535107+01:00 INFO    postfix/smtpd[599]: disconnect from unknown[172.26.0.254] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
2023-10-29T12:39:42.318993973Z 2023-10-29T13:39:42.318718+01:00 INFO    postfix/smtp[603]: 77160360E1B: replace: header From: LibreNMS <xxxxxxx@xxxxxxx.xxx>: From: <xxxxxxx@xxxxxxx.xxx>
2023-10-29T12:39:42.355338666Z 2023-10-29T13:39:42.355057+01:00 INFO    postfix/smtp[603]: 77160360E1B: to=<xxxxxxx@xxxxxxx.xxx>, relay=ssl0.ovh.net[x.x.x.x]:587, delay=0.87, delays=0.05/0.06/0.53/0.23, dsn=2.0.0, status=sent (250 2.0.0 Ok: queued as 3D65F1FDBE)
2023-10-29T12:39:42.355490869Z 2023-10-29T13:39:42.355347+01:00 INFO    postfix/qmgr[425]: 77160360E1B: removed
2023-10-29T12:40:30.408494009Z 2023-10-29T13:40:30.408227+01:00 INFO    postfix/anvil[437]: statistics: max connection rate 3/60s for (submission:127.0.0.1) at Oct 29 13:31:30
2023-10-29T12:40:30.408516658Z 2023-10-29T13:40:30.408238+01:00 INFO    postfix/anvil[437]: statistics: max connection count 1 for (submission:127.0.0.1) at Oct 29 13:30:30
2023-10-29T12:40:30.408519424Z 2023-10-29T13:40:30.408243+01:00 INFO    postfix/anvil[437]: statistics: max cache size 1 at Oct 29 13:30:30

bokysan added a commit that referenced this issue Oct 29, 2023
This image will now automatically reconfigure `daemon_directory` in case
of distrubution switching.

Alpine and Debian/Ubuntu will use different directly layout and if the
user persists `/etc/postfix`, this image will not start if the
distribution changes.
@bokysan
Copy link
Owner

bokysan commented Oct 29, 2023

This should work now with v4.0.1. Closing the issue. If it reappears, please reopen.

@joekrill
Copy link
Contributor

joekrill commented Jan 8, 2024

I think there may have been a regression here. I'm seeing this happening again starting with v4.0.4. But if I downgrade to 4.0.3 the problem disappears.

@bokysan
Copy link
Owner

bokysan commented Jan 8, 2024

That sounds really odd. These were literary the only changes between v4.0.3 and v4.0.4:

diff --git a/Dockerfile b/Dockerfile
index bf92c5a..ae4c04f 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -1,4 +1,4 @@
-# syntax=docker/dockerfile:1.2
+# syntax=docker/dockerfile:1.6

 ARG BASE_IMAGE=debian:bookworm-slim
 # ARG BASE_IMAGE=ubuntu:jammy
@@ -61,7 +61,7 @@ VOLUME     [ "/var/spool/postfix", "/etc/postfix", "/etc/opendkim/keys" ]
 USER       root
 WORKDIR    /tmp

-HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 CMD printf "EHLO healthcheck\nquit\n" | { while read l ; do sleep 1; echo $l; done } | nc 127.0.0.1 587 -w 2 | grep -qE "^220.*ESMTP Postfix"
+HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --start-interval=2s --retries=3 CMD printf "EHLO healthcheck\nquit\n" | { while read l ; do sleep 1; echo $l; done } | nc -w 2 127.0.0.1 587 | grep -qE "^220.*ESMTP Postfix"

 EXPOSE     587
 CMD        [ "/bin/sh", "-c", "/scripts/run.sh" ]
diff --git a/build.sh b/build.sh
index 280104e..17814fc 100755
--- a/build.sh
+++ b/build.sh
@@ -41,5 +41,6 @@ fi
 #    arg_list="$arg_list --platform linux/amd64,linux/arm64,linux/arm/v7"
 #fi

-docker buildx build ${arg_list} $* .
+set -x
+exec docker buildx build ${arg_list} $* .

diff --git a/helm/mail/values.yaml b/helm/mail/values.yaml
index 2efa6a9..32b9ba9 100644
--- a/helm/mail/values.yaml
+++ b/helm/mail/values.yaml
@@ -211,7 +211,7 @@ readinessProbe:
       - -c
       - >-
         [ ! -f /tmp/container_is_terminating ] &&
-        printf "EHLO healthcheck\n" | nc 127.0.0.1 587 | grep -qE "^220.*ESMTP Postfix"
+        printf "EHLO healthcheck\nquit\n" | { while read l ; do sleep 1; echo $l; done } | nc -w 2 127.0.0.1 587 | grep -qE "^220.*ESMTP Postfix"
 livenessProbe:
   initialDelaySeconds: 5
   periodSeconds: 5

@bokysan
Copy link
Owner

bokysan commented Jan 8, 2024

@joekrill How exactly are you starting the image?

@joekrill
Copy link
Contributor

joekrill commented Jan 8, 2024

Sorry, I should clarify - the error I'm seeing is slightly different, so this may be a different issue. I'm seeing the same message as the one in this comment above - which is why I initially thought it was the same issue:

CRON[2779]: pam_env(cron:session): Unable to open env file: /etc/default/locale: No such file or directory

But I'm not seeing the fatal: chdir(/usr/libexec/postfix): No such file or directory message. But it does seem to be preventing the service from becoming healthy in my cluster. Let me know if I should create a separate issue for this.

The full output:

★★★★★ POSTFIX STARTING UP (debian) ★★★★★
‣ INFO  Not setting any timezone for the container
‣ INFO  Using plain log format for rsyslog.
‣ NOTE  Emails in the logs will not be anonymized. Set ANONYMIZE_EMAILS to enable this feature.
‣ DEBUG Reowning root: /var/spool/postfix/
‣ DEBUG Reowning root: /var/spool/postfix/pid/
‣ DEBUG Reowning postfix:postdrop /var/spool/postfix/private/
‣ DEBUG Reowning postfix:postdrop /var/spool/postfix/public/
‣ INFO  Preparing files for Postfix chroot:
ln: failed to create symbolic link '/var/spool/postfix/usr/lib/zoneinfo/': No such file or directory
        '/etc/localtime' -> '/var/spool/postfix/etc'
        '/etc/nsswitch.conf' -> '/var/spool/postfix/etc'
        '/etc/resolv.conf' -> '/var/spool/postfix/etc'
        '/etc/services' -> '/var/spool/postfix/etc'
        '/etc/host.conf' -> '/var/spool/postfix/etc'
        '/etc/hosts' -> '/var/spool/postfix/etc'
        '/etc/passwd' -> '/var/spool/postfix/etc'
‣ NOTE  Switching default_database_type to lmdb to ensure cross-distro compatibility.
‣ WARN  Detected old hash: and btree: references in the config file, which are not supported anymore. Upgrading to lmdb:
‣ DEBUG Creating new postalias for lmdb:/etc/aliases.
‣ DEBUG Creating new postalias for lmdb:/etc/aliases.
‣ INFO  Using unlimited message size.
‣ INFO  Setting smtp_tls_security_level: may
‣ NOTE  Forwarding all emails to email-smtp.us-east-1.amazonaws.com:587 using username (removed) and password (redacted).
‣ INFO  Using default private network list for trusted networks.
‣ INFO  Debugging is disabled.
‣ INFO  Setting up allowed SENDER domains: (removed)
‣ DEBUG DKIM_AUTOGENERATE not set -- you will need to provide your own keys.
‣ INFO  No DKIM keys found, will not use DKIM.
‣ INFO  Applying custom postfix setting: message_size_limit=0
‣ INFO  Applying custom postfix setting: myhostname=smtp-0
‣ INFO  Applying custom postfix setting: mynetworks=127.0.0.0/8,10.0.0.0/8,172.16.0.0/12,192.168.0.0/16
‣ INFO  Applying custom postfix setting: smtp_tls_security_level=may
‣ NOTE  Starting: rsyslog, crond, postfix
2024-01-08 22:03:02,092 INFO Set uid to user 0 succeeded
2024-01-08 22:03:02,099 INFO supervisord started with pid 7
2024-01-08 22:03:03,104 INFO spawned: 'cron' with pid 354
2024-01-08 22:03:03,108 INFO spawned: 'opendkim' with pid 355
2024-01-08 22:03:03,112 INFO spawned: 'postfix' with pid 356
2024-01-08 22:03:03,117 INFO spawned: 'rsyslog' with pid 357
2024-01-08T22:03:03.143135+00:00 INFO    rsyslogd: [origin software="rsyslogd" swVersion="8.2302.0" x-pid="357" x-info="https://www.rsyslog.com"] start
2024-01-08 22:03:03,144 INFO success: cron entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2024-01-08 22:03:03,144 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2024-01-08T22:03:03.596738+00:00 INFO    postfix/postfix-script[429]: starting the Postfix mail system
2024-01-08T22:03:03.608740+00:00 INFO    postfix/master[430]: daemon started -- version 3.7.9, configuration /etc/postfix
2024-01-08 22:03:05,611 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 2 seconds (startsecs)
2024-01-08 22:03:08,615 INFO success: opendkim entered RUNNING state, process has stayed up for > than 5 seconds (startsecs)
2024-01-08T22:17:01.130027+00:00 ERR     CRON[2779]: pam_env(cron:session): Unable to open env file: /etc/default/locale: No such file or directory
2024-01-08T22:17:01.131518+00:00 INFO    CRON[2779]: pam_unix(cron:session): session opened for user root(uid=0) by (uid=0)
2024-01-08T22:17:01.134201+00:00 INFO    CRON[2780]: (root) CMD (cd / && run-parts --report /etc/cron.hourly)
2024-01-08T22:17:01.138975+00:00 INFO    CRON[2779]: pam_unix(cron:session): session closed for user root
2024-01-08T22:23:03.242681+00:00 INFO    rsyslogd: -- MARK --

I'm using the Helm chart to start this in a Kubernetes cluster. These are the values I'm using:

fullnameOverride: smtp
metrics:
  enabled: true
  serviceMonitor:
    enabled: false
    labels:
      release: prometheus-stack
existingSecret: smtp-settings
config:
  general:
    ALLOWED_SENDER_DOMAINS: mydomain.com
service:
  port: 25

And the smtp-settings secrets contains values for RELAYHOST (specifically: email-smtp.us-east-1.amazonaws.com:587), RELAYHOST_USERNAME, and RELAYHOST_PASSWORD

If I downgrade to 4.0.3 the problem goes away. So perhaps it instead has something to do with the loop in the healthcheck causing the startup to hang?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants