Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

obtaining certificate: context canceled, but does not restart #3202

Closed
shinenelson opened this issue Mar 29, 2020 · 23 comments
Closed

obtaining certificate: context canceled, but does not restart #3202

shinenelson opened this issue Mar 29, 2020 · 23 comments
Labels
bug 🐞 Something isn't working
Milestone

Comments

@shinenelson
Copy link

shinenelson commented Mar 29, 2020

I'm trying to automate a caddy deployment and since I'm using multiple sub-domains, I'd like to get a wildcard TLS certificate.

In order to provision wildcard certificates, I need to use DNS-01 challenge. And I did not want to put the API keys to my domain registrar on a static file that would be on the file system on the server.
Hence, I put in a dummy key and then loaded the actual key manually via the caddy API.

However, the problem is that as soon as the caddy server starts, it starts the certificate maintenance routine which would fail with acme: error presenting token: digitalocean: HTTP 401: unauthorized.

Now, after I put in the correct API key for the registrar in via the caddy API, the caddy server reloads again. And this time when the certificate maintenance routine starts, it notices that there is already another obtaining certificate context and kills it.

log
-- Logs begin at Sat 2020-03-28 22:04:26 UTC, end at Sun 2020-03-29 18:50:54 UTC. --
Mar 28 22:07:28 systemd[1]: Started Caddy Web Server.
Mar 28 22:07:28 caddy[21276]: caddy.HomeDir=/var/lib/caddy
Mar 28 22:07:28 caddy[21276]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Mar 28 22:07:28 caddy[21276]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Mar 28 22:07:28 caddy[21276]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Mar 28 22:07:28 caddy[21276]: runtime.GOOS=linux
Mar 28 22:07:28 caddy[21276]: runtime.GOARCH=amd64
Mar 28 22:07:28 caddy[21276]: runtime.Compiler=gc
Mar 28 22:07:28 caddy[21276]: runtime.NumCPU=1
Mar 28 22:07:28 caddy[21276]: runtime.GOMAXPROCS=1
Mar 28 22:07:28 caddy[21276]: runtime.Version=go1.14
Mar 28 22:07:28 caddy[21276]: os.Getwd=/
Mar 28 22:07:28 caddy[21276]: LANG=C.UTF-8
Mar 28 22:07:28 caddy[21276]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Mar 28 22:07:28 caddy[21276]: HOME=/var/lib/caddy
Mar 28 22:07:28 caddy[21276]: LOGNAME=caddy
Mar 28 22:07:28 caddy[21276]: USER=caddy
Mar 28 22:07:28 caddy[21276]: INVOCATION_ID=cd51f6d2b270456882c64abf5851abfd
Mar 28 22:07:28 caddy[21276]: JOURNAL_STREAM=9:51261
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.8972936,"msg":"no autosave file exists","autosave_file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.8984275,"msg":"using provided configuration","config_file":"/var/lib/caddy/.config/caddy/caddy.json","config_adapter":""}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9018965,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9043777,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9058797,"logger":"tls","msg":"cleaned up storage units"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9075916,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.909218,"msg":"serving initial configuration"}
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][cache:0xc000660be0] Started certificate maintenance routine
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231393
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [WARN] [*.lola.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.lola.caddy.test.shinenelson.xyz.'
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231548
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231393
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231624
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:33 caddy[21276]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:33 caddy[21276]:  (challenge=dns-01 remaining=[])
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [WARN] [*.mark.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.mark.caddy.test.shinenelson.xyz.'
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [WARN] [*.shine.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.shine.caddy.test.shinenelson.xyz.'
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231624
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231548
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:34 caddy[21276]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:34 caddy[21276]:  (challenge=dns-01 remaining=[])
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:34 caddy[21276]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:34 caddy[21276]:  (challenge=dns-01 remaining=[])
Mar 28 22:07:35 caddy[21276]: 2020/03/28 22:07:35 [ERROR] attempt 1: [*.lola.caddy.test.shinenelson.xyz] Obtain: [*.lola.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:35 caddy[21276]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:35 caddy[21276]:  - retrying in 1m0s (6.672599298s/720h0m0s elapsed)...
Mar 28 22:07:36 caddy[21276]: 2020/03/28 22:07:36 [ERROR] attempt 1: [*.mark.caddy.test.shinenelson.xyz] Obtain: [*.mark.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:36 caddy[21276]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:36 caddy[21276]:  - retrying in 1m0s (7.660701675s/720h0m0s elapsed)...
Mar 28 22:07:36 caddy[21276]: 2020/03/28 22:07:36 [ERROR] attempt 1: [*.shine.caddy.test.shinenelson.xyz] Obtain: [*.shine.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:36 caddy[21276]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:36 caddy[21276]:  - retrying in 1m0s (7.729894166s/720h0m0s elapsed)...
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1349359,"logger":"admin.api","msg":"received request","method":"PATCH","uri":"/id/dns-challenge","remote_addr":"127.0.0.1:45044","headers":{"Accept":["*/*"],"Content-Length":["139"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1362393,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1371884,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc000660be0] Stopped certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1380074,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1384685,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc0007fca50] Started certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.shine.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.mark.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.lola.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:08:48 caddy[21276]: {"level":"error","ts":1585433328.1396031,"logger":"http.log.error","msg":"making dial info: upstream localhost: invalid dial address localhost: address localhost: missing port in address","request":{"method":"GET","uri":"/","proto":"HTTP/1.1","remote_addr":"183.82.162.3:55114","host":"a.shine.caddy.test.shinenelson.xyz","headers":{"Accept":["*/*"],"User-Agent":["curl/7.69.0-DEV"]}}}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8840406,"logger":"admin.api","msg":"received request","method":"PATCH","uri":"/id/proxy_shine","remote_addr":"127.0.0.1:45048","headers":{"Accept":["*/*"],"Content-Length":["103"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8849828,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8881805,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][cache:0xc0007fca50] Stopped certificate maintenance routine
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][cache:0xc00072b360] Started certificate maintenance routine
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.9491303,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:09:09 caddy[21276]: {"level":"info","ts":1585433349.3899143,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253163
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253201
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253348
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:16 caddy[21276]: 2020/03/28 22:09:16 [INFO] [*.shine.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:16 caddy[21276]: 2020/03/28 22:09:16 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.lola.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO] [*.shine.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO][*.shine.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:09:20 caddy[21276]: 2020/03/28 22:09:20 [INFO] [*.mark.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:20 caddy[21276]: 2020/03/28 22:09:20 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO] [*.lola.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO][*.lola.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO] [*.mark.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO][*.mark.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock

The problem is that the certificate maintenance routine does not retry obtaining the certificate again. I'll have to reload the caddy server again before it'll try again and generate the TLS certificates ( if the API keys are correct ) using the DNS challenge.

@mholt
Copy link
Member

mholt commented Mar 29, 2020

In order to provision wildcard certificates, I need to use DNS-01 challenge. And I did not want to put the API keys to my domain registrar on a static file that would be on the file system on the server.
Hence, I put in a dummy key and then loaded the actual key manually via the caddy API.

Just so you know, configs are saved to the file system regardless:

Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9075916,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}

So if your API key is in the config, it will be in that file. You can turn this off, but be aware of the implications: if Caddy can't save the last active config, it can't resume it, so any config changes made via the API won't be persisted if the server is restarted.

Is your machine really so loosely secured that other users/untrusted code can read files with 0600 permissions, i.e. they share the same user account? That's probably what you need to fix instead.

Anyway, I'd like to help you, but I'll need to reproduce the behavior you're seeing. Please provide a full and minimal config file needed to reproduce the problem, along with the exact steps to take.

Ideally, we need to be able to reproduce the bug in the most minimal way possible. This allows us to write regression tests to verify the fix is working. If we can't reproduce it, then you'll have to test our changes for us until it's fixed -- and then we can't add test cases, either.

I've attached a template below that will help make this easier and faster! It will ask for some information you've already provided; that's OK, just fill it out the best you can. 👍

I've also included some helpful tips below the template. Feel free to let me know if you have any questions!

Thank you again for your report, we look forward to resolving it!

Template

## 1. Environment

### 1a. Operating system and version

```
paste here
```


### 1b. Caddy version (run `caddy version` or paste commit SHA)

```
paste here
```


### 1c. Go version (if building Caddy from source; run `go version`)

```
paste here
```


## 2. Description

### 2a. What happens (briefly explain what is wrong)




### 2b. Why it's a bug (if it's not obvious)




### 2c. Log output

```
paste terminal output or logs here
```



### 2d. Workaround(s)




### 2e. Relevant links




## 3. Tutorial (minimal steps to reproduce the bug)




Helpful tips

  1. Environment: Please fill out your OS and Caddy versions, even if you don't think they are relevant. (They are always relevant.) If you built Caddy from source, provide the commit SHA and specify your exact Go version.

  2. Description: Describe at a high level what the bug is. What happens? Why is it a bug? Not all bugs are obvious, so convince readers that it's actually a bug.

    • 2c) Log output: Paste terminal output and/or complete logs in a code block. DO NOT REDACT INFORMATION except for credentials.
    • 2d) Workaround: What are you doing to work around the problem in the meantime? This can help others who encounter the same problem, until we implement a fix.
    • 2e) Relevant links: Please link to any related issues, pull requests, docs, and/or discussion. This can add crucial context to your report.
  3. Tutorial: What are the minimum required specific steps someone needs to take in order to experience the same bug? Your goal here is to make sure that anyone else can have the same experience with the bug as you do. You are writing a tutorial, so make sure to carry it out yourself before posting it. Please:

    • Start with an empty config. Add only the lines/parameters that are absolutely required to reproduce the bug.
    • Do not run Caddy inside containers.
    • Run Caddy manually in your terminal; do not use systemd or other init systems.
    • If making HTTP requests, avoid web browsers. Use a simpler HTTP client instead, like curl.
    • Do not redact any information from your config (except credentials). Domain names are public knowledge and often necessary for quick resolution of an issue!
    • Note that ignoring this advice may result in delays, or even in your issue being closed. 😞 Only actionable issues are kept open, and if there is not enough information or clarity to reproduce the bug, then the report is not actionable.

Example of a tutorial:

Create a config file:
{ ... }

Open terminal and run Caddy:

$ caddy ...

Make an HTTP request:

$ curl ...

Notice that the result is ___ but it should be ___.

@mholt mholt added the needs info 📭 Requires more information label Mar 29, 2020
@shinenelson
Copy link
Author

oops. I'm sorry I didn't put the configuration in. That was my bad. Here you go.

{
  "apps": {
    "http": {
      "servers": {
        "ssh-proxy": {
          "automatic_https": {
            "@id": "automatic_https",
            "disable_redirects": true,
            "skip": [
            ]
          },
          "listen": [
            ":80",
            ":443"
          ],
          "routes": [
            {
              "@id": "shine",
              "group": "shine",
              "handle": [
                {
                  "@id": "proxy_shine",
                  "handler": "reverse_proxy",
                  "upstreams": [
                    {
                      "dial": "localhost"
                    }
                  ]
                }
              ],
              "match": [
                {
                  "host": [
                    "*.shine.caddy.test.shinenelson.xyz"
                  ]
                }
              ],
              "terminal": true
            },
            {
              "@id": "mark",
              "group": "mark",
              "handle": [
                {
                  "@id": "proxy_mark",
                  "handler": "reverse_proxy",
                  "upstreams": [
                    {
                      "dial": "localhost"
                    }
                  ]
                }
              ],
              "match": [
                {
                  "host": [
                    "*.mark.caddy.test.shinenelson.xyz"
                  ]
                }
              ],
              "terminal": true
            },
            {
              "@id": "lola",
              "group": "lola",
              "handle": [
                {
                  "@id": "proxy_lola",
                  "handler": "reverse_proxy",
                  "upstreams": [
                    {
                      "dial": "localhost"
                    }
                  ]
                }
              ],
              "match": [
                {
                  "host": [
                    "*.lola.caddy.test.shinenelson.xyz"
                  ]
                }
              ],
              "terminal": true
            }
          ],
          "tls_connection_policies": [
            {
              "match": {
                "sni": [
                  "*.shine.caddy.test.shinenelson.xyz",
                  "*.mark.caddy.test.shinenelson.xyz",
                  "*.lola.caddy.test.shinenelson.xyz"
                ]
              }
            }
          ]
        }
      }
    },
    "tls": {
      "automation": {
        "policies": [
          {
            "issuer": {
              "ca" : "https://acme-staging-v02.api.letsencrypt.org/directory",
              "challenges": {
                "dns": {
                  "@id": "dns-challenge",
                  "auth_token": "dummmydigitaloceantoken",
                  "provider": "digitalocean"
                }
              },
              "module": "acme"
            },
            "subjects": [
              "*.shine.caddy.test.shinenelson.xyz",
              "*.mark.caddy.test.shinenelson.xyz",
              "*.lola.caddy.test.shinenelson.xyz"
            ]
          }
        ]
      }
    }
  }
}

Just so you know, configs are saved to the file system regardless
So if your API key is in the config, it will be in that file.

Is your machine really so loosely secured that other users/untrusted code can read files with 0600 permissions, i.e. they share the same user account? That's probably what you need to fix instead.

I knew this, but I didn't bother too much since autosave.json is also within caddy's home directory which is accessible only to the caddy user ( perms 640 ), but the 'running config' file was in /etc and is accessible to all users.

I could possibly move the running config also to within caddy's home directory, but then I would require to switch as the caddy user to be able to do so ( extra effort; I really don't want to be running as root for provisioning )

You can turn this off, but be aware of the implications: if Caddy can't save the last active config, it can't resume it, so any config changes made via the API won't be persisted if the server is restarted.

I'm not 100% sure, but for me, using the systemd init script, the configuration was never persisted. It always started afresh from the running config. I'm guessing that it is the --config flag in the systemd script. Maybe I should try removing it (?)
Then I'll have to load that configuration only once. I'm anyway loading the configuration via the API at least once for the API keys; I might as well push up the entire configuration from scratch.

@mholt
Copy link
Member

mholt commented Mar 29, 2020

Thanks, but what about the rest of the template? What curl and caddy commands do I run, what should I see, what do you expect, etc?

@shinenelson
Copy link
Author

Steps to Reproduce

  1. Run caddy via systemd : systemctl start caddy
    this runs /usr/local/bin/caddy run --config /etc/caddy.json --resum --environ
    where /etc/caddy.json is the JSON provided in my previous comment.
log after this command
-- Logs begin at Sat 2020-03-28 22:04:26 UTC, end at Sun 2020-03-29 18:50:54 UTC. --
Mar 28 22:07:28 systemd[1]: Started Caddy Web Server.
Mar 28 22:07:28 caddy[21276]: caddy.HomeDir=/var/lib/caddy
Mar 28 22:07:28 caddy[21276]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Mar 28 22:07:28 caddy[21276]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Mar 28 22:07:28 caddy[21276]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Mar 28 22:07:28 caddy[21276]: runtime.GOOS=linux
Mar 28 22:07:28 caddy[21276]: runtime.GOARCH=amd64
Mar 28 22:07:28 caddy[21276]: runtime.Compiler=gc
Mar 28 22:07:28 caddy[21276]: runtime.NumCPU=1
Mar 28 22:07:28 caddy[21276]: runtime.GOMAXPROCS=1
Mar 28 22:07:28 caddy[21276]: runtime.Version=go1.14
Mar 28 22:07:28 caddy[21276]: os.Getwd=/
Mar 28 22:07:28 caddy[21276]: LANG=C.UTF-8
Mar 28 22:07:28 caddy[21276]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Mar 28 22:07:28 caddy[21276]: HOME=/var/lib/caddy
Mar 28 22:07:28 caddy[21276]: LOGNAME=caddy
Mar 28 22:07:28 caddy[21276]: USER=caddy
Mar 28 22:07:28 caddy[21276]: INVOCATION_ID=cd51f6d2b270456882c64abf5851abfd
Mar 28 22:07:28 caddy[21276]: JOURNAL_STREAM=9:51261
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.8972936,"msg":"no autosave file exists","autosave_file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.8984275,"msg":"using provided configuration","config_file":"/var/lib/caddy/.config/caddy/caddy.json","config_adapter":""}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9018965,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9043777,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9058797,"logger":"tls","msg":"cleaned up storage units"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.9075916,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:28 caddy[21276]: {"level":"info","ts":1585433248.909218,"msg":"serving initial configuration"}
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][cache:0xc000660be0] Started certificate maintenance routine
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:07:28 caddy[21276]: 2020/03/28 22:07:28 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:07:31 caddy[21276]: 2020/03/28 22:07:31 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231393
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [WARN] [*.lola.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.lola.caddy.test.shinenelson.xyz.'
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231548
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:32 caddy[21276]: 2020/03/28 22:07:32 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231393
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231624
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:33 caddy[21276]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:33 caddy[21276]:  (challenge=dns-01 remaining=[])
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [WARN] [*.mark.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.mark.caddy.test.shinenelson.xyz.'
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:07:33 caddy[21276]: 2020/03/28 22:07:33 [WARN] [*.shine.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.shine.caddy.test.shinenelson.xyz.'
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231624
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621231548
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:34 caddy[21276]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:34 caddy[21276]:  (challenge=dns-01 remaining=[])
Mar 28 22:07:34 caddy[21276]: 2020/03/28 22:07:34 [ERROR] acme: Error -> One or more domains had a problem:
Mar 28 22:07:34 caddy[21276]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:34 caddy[21276]:  (challenge=dns-01 remaining=[])
Mar 28 22:07:35 caddy[21276]: 2020/03/28 22:07:35 [ERROR] attempt 1: [*.lola.caddy.test.shinenelson.xyz] Obtain: [*.lola.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:35 caddy[21276]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:35 caddy[21276]:  - retrying in 1m0s (6.672599298s/720h0m0s elapsed)...
Mar 28 22:07:36 caddy[21276]: 2020/03/28 22:07:36 [ERROR] attempt 1: [*.mark.caddy.test.shinenelson.xyz] Obtain: [*.mark.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:36 caddy[21276]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:36 caddy[21276]:  - retrying in 1m0s (7.660701675s/720h0m0s elapsed)...
Mar 28 22:07:36 caddy[21276]: 2020/03/28 22:07:36 [ERROR] attempt 1: [*.shine.caddy.test.shinenelson.xyz] Obtain: [*.shine.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Mar 28 22:07:36 caddy[21276]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Mar 28 22:07:36 caddy[21276]:  - retrying in 1m0s (7.729894166s/720h0m0s elapsed)...
  1. Upload the API key to the configuration

curl -sSL -H 'Content-Type: application/json' -d '{ "@id" : "dns-challenge", "auth_token" : "[REDACTED]", "provider" : "digitalocean" }' -X PATCH localhost:2019/id/dns-challenge"

log after this command
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1349359,"logger":"admin.api","msg":"received request","method":"PATCH","uri":"/id/dns-challenge","remote_addr":"127.0.0.1:45044","headers":{"Accept":["*/*"],"Content-Length":["139"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1362393,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1371884,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc000660be0] Stopped certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1380074,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1384685,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc0007fca50] Started certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.shine.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.mark.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [ERROR] *.lola.caddy.test.shinenelson.xyz: obtaining certificate: context canceled

A new certificate maintenance routine is not created for at least a minute.

  1. Patch some other un-relevant configuration so that the server reloads

curl -sSL -X PATCH -H 'Content-Type: application/json' -d '{ "@id": "proxy_shine", "handler": "reverse_proxy", "upstreams": [ { "dial": "localhost" } ] }' localhost:2019/id/proxy_shine"

log after this command
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8840406,"logger":"admin.api","msg":"received request","method":"PATCH","uri":"/id/proxy_shine","remote_addr":"127.0.0.1:45048","headers":{"Accept":["*/*"],"Content-Length":["103"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8849828,"logger":"admin","msg":"admin endpoint started","address":"localhost:2019","enforce_origin":false,"origins":["localhost:2019"]}
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.8881805,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.shine.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][cache:0xc0007fca50] Stopped certificate maintenance routine
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][cache:0xc00072b360] Started certificate maintenance routine
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Mar 28 22:09:08 caddy[21276]: 2020/03/28 22:09:08 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Mar 28 22:09:08 caddy[21276]: {"level":"info","ts":1585433348.9491303,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:09:09 caddy[21276]: {"level":"info","ts":1585433349.3899143,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253163
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:09 caddy[21276]: 2020/03/28 22:09:09 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253201
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/3621253348
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:10 caddy[21276]: 2020/03/28 22:09:10 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Mar 28 22:09:11 caddy[21276]: 2020/03/28 22:09:11 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Mar 28 22:09:16 caddy[21276]: 2020/03/28 22:09:16 [INFO] [*.shine.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:16 caddy[21276]: 2020/03/28 22:09:16 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.lola.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:18 caddy[21276]: 2020/03/28 22:09:18 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO] [*.shine.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO][*.shine.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:19 caddy[21276]: 2020/03/28 22:09:19 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:09:20 caddy[21276]: 2020/03/28 22:09:20 [INFO] [*.mark.caddy.test.shinenelson.xyz] The server validated our request
Mar 28 22:09:20 caddy[21276]: 2020/03/28 22:09:20 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO] [*.lola.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO][*.lola.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Mar 28 22:09:21 caddy[21276]: 2020/03/28 22:09:21 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO] [*.mark.caddy.test.shinenelson.xyz] Server responded with a certificate.
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO][*.mark.caddy.test.shinenelson.xyz] Certificate obtained successfully
Mar 28 22:09:22 caddy[21276]: 2020/03/28 22:09:22 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock

Then the new certificates are provisioned and everything goes about as expected.

PS : systemctl reload caddy would still not reload from the autosave.json in the default caddy.service unless the --config flag is removed.

Expected Behaviour

After the previous context is killed, a new context for the current configuration should start.

What Happens Instead

The new context for the server maintenance routine is not started ( at least for 1 minute )

@mholt
Copy link
Member

mholt commented Mar 30, 2020

Thanks for the explanation. I still don't quite follow what is going on. Can you reduce the problem space down more minimally? Simplify the configs, the commands to run, remove systemd from the equation, etc -- I know it's work but please put some effort into it and follow the issue template's suggestions, it will help speed things up incredibly.

A new certificate maintenance routine is not created for at least a minute.

Are you sure?

Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc000660be0] Stopped certificate maintenance routine
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1380074,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Mar 28 22:07:50 caddy[21276]: {"level":"info","ts":1585433270.1384685,"logger":"admin","msg":"stopped previous server"}
Mar 28 22:07:50 caddy[21276]: 2020/03/28 22:07:50 [INFO][cache:0xc0007fca50] Started certificate maintenance routine

^ same second.

PS : systemctl reload caddy would still not reload from the autosave.json in the default caddy.service unless the --config flag is removed.

Hmm, that doesn't sound right... how can I reproduce this? Having troubles seeing that.

@shinenelson
Copy link
Author

shinenelson commented Apr 1, 2020 via email

@mholt
Copy link
Member

mholt commented Apr 2, 2020

Okay. Will close in the meantime, until the issue becomes actionable then.

@mholt mholt closed this as completed Apr 2, 2020
@shinenelson
Copy link
Author

I know I messed up the issue template the last time ( unfortunately, it somehow doesn't exist in the repository or come up while creating a new issue. I had to copy-paste it from an earlier comment ) and went all over the place, so this time, I'm going to be a little bit more diligent with the issue template at least.

I'm going to try and collate everything I said previously into this one comment so that you don't have to scroll back and forth for context.

1. Environment

1a. Operating system and version

$ lsb_release --all

Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.4 LTS
Release:	18.04
Codename:	bionic

1b. Caddy version (run caddy version or paste commit SHA)

latest master 68cebb2 + tls.dns.digitalocean

1c. Go version (if building Caddy from source; run go version)

go version go1.14 linux/amd64

2. Description

2a. What happens (briefly explain what is wrong)

If a certificate maintenance routine is in a wait context and the server is reloaded, the new instance of the server kills the existing context, but does not start a new one.

2b. Why it's a bug (if it's not obvious)

While the killing of the previous obtain certificate context is desirable, I don't think the new server not starting a new context is the expected behavior.

2c. Log output

I'll merge the log along with the Steps to Reproduce so that I have the flexibility of breaking the logs by steps making it further comprehensible otherwise the log output might be confusing to understand what I'm talking about.

2d. Initial configuration

PS : This is the configuration that the server is started with. The configuration is updated later via the API.

{
  "apps": {
    "http": {
      "servers": {
        "ssh-proxy": {
          "automatic_https": {
            "@id": "automatic_https",
            "disable_redirects": true,
            "skip": [
            ]
          },
          "listen": [
            ":80",
            ":443"
          ],
          "routes": [
            {
              "@id": "shine",
              "group": "shine",
              "handle": [
                {
                  "@id": "proxy_shine",
                  "handler": "reverse_proxy",
                  "upstreams": [
                    {
                      "dial": "localhost"
                    }
                  ]
                }
              ],
              "match": [
                {
                  "host": [
                    "*.shine.caddy.test.shinenelson.xyz"
                  ]
                }
              ],
              "terminal": true
            },
            {
              "@id": "mark",
              "group": "mark",
              "handle": [
                {
                  "@id": "proxy_mark",
                  "handler": "reverse_proxy",
                  "upstreams": [
                    {
                      "dial": "localhost"
                    }
                  ]
                }
              ],
              "match": [
                {
                  "host": [
                    "*.mark.caddy.test.shinenelson.xyz"
                  ]
                }
              ],
              "terminal": true
            },
            {
              "@id": "lola",
              "group": "lola",
              "handle": [
                {
                  "@id": "proxy_lola",
                  "handler": "reverse_proxy",
                  "upstreams": [
                    {
                      "dial": "localhost"
                    }
                  ]
                }
              ],
              "match": [
                {
                  "host": [
                    "*.lola.caddy.test.shinenelson.xyz"
                  ]
                }
              ],
              "terminal": true
            }
          ],
          "tls_connection_policies": [
            {
              "match": {
                "sni": [
                  "*.shine.caddy.test.shinenelson.xyz",
                  "*.mark.caddy.test.shinenelson.xyz",
                  "*.lola.caddy.test.shinenelson.xyz"
                ]
              }
            }
          ]
        }
      }
    },
    "tls": {
      "automation": {
        "policies": [
          {
            "issuer": {
              "ca" : "https://acme-staging-v02.api.letsencrypt.org/directory",
              "challenges": {
                "dns": {
                  "@id": "dns-challenge",
                  "auth_token": "dummmydigitaloceantoken",
                  "provider": "digitalocean"
                }
              },
              "module": "acme"
            },
            "subjects": [
              "*.shine.caddy.test.shinenelson.xyz",
              "*.mark.caddy.test.shinenelson.xyz",
              "*.lola.caddy.test.shinenelson.xyz"
            ]
          }
        ]
      }
    }
  }
}

3. Steps to Reproduce

  1. Start caddy server with above configuration file
    $ caddy run --environ --config /var/lib/caddy/.config/caddy/caddy.json
Log
-- Logs begin at Mon 2020-04-13 11:21:44 UTC, end at Mon 2020-04-13 12:54:16 UTC. --
Apr 13 11:54:38 systemd[1]: Started Caddy.
Apr 13 11:54:38 caddy[21625]: caddy.HomeDir=/var/lib/caddy
Apr 13 11:54:38 caddy[21625]: caddy.AppDataDir=/var/lib/caddy/.local/share/caddy
Apr 13 11:54:38 caddy[21625]: caddy.AppConfigDir=/var/lib/caddy/.config/caddy
Apr 13 11:54:38 caddy[21625]: caddy.ConfigAutosavePath=/var/lib/caddy/.config/caddy/autosave.json
Apr 13 11:54:38 caddy[21625]: runtime.GOOS=linux
Apr 13 11:54:38 caddy[21625]: runtime.GOARCH=amd64
Apr 13 11:54:38 caddy[21625]: runtime.Compiler=gc
Apr 13 11:54:38 caddy[21625]: runtime.NumCPU=1
Apr 13 11:54:38 caddy[21625]: runtime.GOMAXPROCS=1
Apr 13 11:54:38 caddy[21625]: runtime.Version=go1.14
Apr 13 11:54:38 caddy[21625]: os.Getwd=/
Apr 13 11:54:38 caddy[21625]: LANG=C.UTF-8
Apr 13 11:54:38 caddy[21625]: PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
Apr 13 11:54:38 caddy[21625]: HOME=/var/lib/caddy
Apr 13 11:54:38 caddy[21625]: LOGNAME=caddy
Apr 13 11:54:38 caddy[21625]: USER=caddy
Apr 13 11:54:38 caddy[21625]: INVOCATION_ID=4f0cb22fc5794aa69f3f0ae44789b757
Apr 13 11:54:38 caddy[21625]: JOURNAL_STREAM=9:52612
Apr 13 11:54:38 caddy[21771]: {"level":"info","ts":1586778878.319228,"msg":"using provided configuration","config_file":"/var/lib/caddy/.config/caddy/caddy.json","config_adapter":""}
Apr 13 11:54:38 caddy[21625]: {"level":"info","ts":1586778878.3214922,"logger":"admin.api","msg":"received request","method":"POST","host":"localhost:2019","uri":"/load","remote_addr":"127.0.0.1:50830","headers":{"Accept-Encoding":["gzip"],"Content-Length":["7025"],"Content-Type":["application/json"],"Origin":["localhost:2019"],"User-Agent":["Go-http-client/1.1"]}}
Apr 13 11:54:38 caddy[21625]: {"level":"info","ts":1586778878.322522,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
Apr 13 11:54:38 caddy[21625]: {"level":"info","ts":1586778878.3232386,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.mark.caddy.test.shinenelson.xyz","*.shine.caddy.test.shinenelson.xyz","*.lola.caddy.test.shinenelson.xyz"]}
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][cache:0xc000479770] Stopped certificate maintenance routine
Apr 13 11:54:38 caddy[21625]: {"level":"info","ts":1586778878.3240125,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Apr 13 11:54:38 caddy[21625]: {"level":"info","ts":1586778878.3241887,"logger":"admin.api","msg":"load complete"}
Apr 13 11:54:38 caddy[21625]: {"level":"info","ts":1586778878.32462,"logger":"admin","msg":"stopped previous server"}
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][cache:0xc000674820] Started certificate maintenance routine
Apr 13 11:54:38 systemd[1]: Reloaded Caddy.
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Apr 13 11:54:38 caddy[21625]: 2020/04/13 11:54:38 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 11:54:39 caddy[21625]: 2020/04/13 11:54:39 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043559
Apr 13 11:54:39 caddy[21625]: 2020/04/13 11:54:39 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 11:54:39 caddy[21625]: 2020/04/13 11:54:39 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043562
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043563
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [WARN] [*.shine.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.shine.caddy.test.shinenelson.xyz.'
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [WARN] [*.lola.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.lola.caddy.test.shinenelson.xyz.'
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [WARN] [*.mark.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.mark.caddy.test.shinenelson.xyz.'
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043559
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043562
Apr 13 11:54:40 caddy[21625]: 2020/04/13 11:54:40 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043563
Apr 13 11:54:41 caddy[21625]: 2020/04/13 11:54:41 [ERROR] acme: Error -> One or more domains had a problem:
Apr 13 11:54:41 caddy[21625]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:41 caddy[21625]:  (challenge=dns-01 remaining=[])
Apr 13 11:54:41 caddy[21625]: 2020/04/13 11:54:41 [ERROR] acme: Error -> One or more domains had a problem:
Apr 13 11:54:41 caddy[21625]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:41 caddy[21625]:  (challenge=dns-01 remaining=[])
Apr 13 11:54:41 caddy[21625]: 2020/04/13 11:54:41 [ERROR] acme: Error -> One or more domains had a problem:
Apr 13 11:54:41 caddy[21625]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:41 caddy[21625]:  (challenge=dns-01 remaining=[])
Apr 13 11:54:41 caddy[21625]: 2020/04/13 11:54:41 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043560
Apr 13 11:54:41 caddy[21625]: 2020/04/13 11:54:41 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043561
Apr 13 11:54:43 caddy[21625]: 2020/04/13 11:54:43 [ERROR] attempt 1: [*.shine.caddy.test.shinenelson.xyz] Obtain: [*.shine.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:54:43 caddy[21625]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:43 caddy[21625]:  - retrying in 1m0s (4.705162976s/720h0m0s elapsed)...
Apr 13 11:54:43 caddy[21625]: 2020/04/13 11:54:43 [ERROR] attempt 1: [*.lola.caddy.test.shinenelson.xyz] Obtain: [*.lola.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:54:43 caddy[21625]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:43 caddy[21625]:  - retrying in 1m0s (4.716637901s/720h0m0s elapsed)...
Apr 13 11:54:43 caddy[21625]: 2020/04/13 11:54:43 [ERROR] attempt 1: [*.mark.caddy.test.shinenelson.xyz] Obtain: [*.mark.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:54:43 caddy[21625]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:43 caddy[21625]:  - retrying in 1m0s (4.916072206s/720h0m0s elapsed)...
Apr 13 11:54:43 caddy[21625]: 2020/04/13 11:54:43 [ERROR] attempt 1: [*.varghese.caddy.test.shinenelson.xyz] Obtain: [*.varghese.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:54:43 caddy[21625]: [*.varghese.caddy.test.shinenelson.xyz] [*.varghese.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:43 caddy[21625]:  - retrying in 1m0s (5.341919847s/720h0m0s elapsed)...
Apr 13 11:54:43 caddy[21625]: 2020/04/13 11:54:43 [ERROR] attempt 1: [*.sirajul.caddy.test.shinenelson.xyz] Obtain: [*.sirajul.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:54:43 caddy[21625]: [*.sirajul.caddy.test.shinenelson.xyz] [*.sirajul.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:54:43 caddy[21625]:  - retrying in 1m0s (5.413257214s/720h0m0s elapsed)...
Apr 13 11:55:43 caddy[21625]: 2020/04/13 11:55:43 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 11:55:43 caddy[21625]: 2020/04/13 11:55:43 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 11:55:43 caddy[21625]: 2020/04/13 11:55:43 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043703
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043706
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [WARN] [*.shine.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.shine.caddy.test.shinenelson.xyz.'
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043707
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [WARN] [*.lola.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.lola.caddy.test.shinenelson.xyz.'
Apr 13 11:55:44 caddy[21625]: 2020/04/13 11:55:44 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043703
Apr 13 11:55:45 caddy[21625]: 2020/04/13 11:55:45 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043706
Apr 13 11:55:45 caddy[21625]: 2020/04/13 11:55:45 [ERROR] acme: Error -> One or more domains had a problem:
Apr 13 11:55:45 caddy[21625]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:55:45 caddy[21625]:  (challenge=dns-01 remaining=[])
Apr 13 11:55:45 caddy[21625]: 2020/04/13 11:55:45 [ERROR] acme: Error -> One or more domains had a problem:
Apr 13 11:55:45 caddy[21625]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:55:45 caddy[21625]:  (challenge=dns-01 remaining=[])
Apr 13 11:55:45 caddy[21625]: 2020/04/13 11:55:45 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 11:55:45 caddy[21625]: 2020/04/13 11:55:45 [WARN] [*.mark.caddy.test.shinenelson.xyz] acme: cleaning up failed: digitalocean: unknown record ID for '_acme-challenge.mark.caddy.test.shinenelson.xyz.'
Apr 13 11:55:45 caddy[21625]: 2020/04/13 11:55:45 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043707
Apr 13 11:55:46 caddy[21625]: 2020/04/13 11:55:46 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043709
Apr 13 11:55:46 caddy[21625]: 2020/04/13 11:55:46 [ERROR] acme: Error -> One or more domains had a problem:
Apr 13 11:55:46 caddy[21625]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:55:46 caddy[21625]:  (challenge=dns-01 remaining=[])
Apr 13 11:55:46 caddy[21625]: 2020/04/13 11:55:46 [INFO] Deactivating auth: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49043710
Apr 13 11:55:47 caddy[21625]: 2020/04/13 11:55:47 [ERROR] attempt 2: [*.shine.caddy.test.shinenelson.xyz] Obtain: [*.shine.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:55:47 caddy[21625]: [*.shine.caddy.test.shinenelson.xyz] [*.shine.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:55:47 caddy[21625]:  - retrying in 2m0s (1m8.828179171s/720h0m0s elapsed)...
Apr 13 11:55:47 caddy[21625]: 2020/04/13 11:55:47 [ERROR] attempt 2: [*.lola.caddy.test.shinenelson.xyz] Obtain: [*.lola.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:55:47 caddy[21625]: [*.lola.caddy.test.shinenelson.xyz] [*.lola.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:55:47 caddy[21625]:  - retrying in 2m0s (1m9.021657718s/720h0m0s elapsed)...
Apr 13 11:55:48 caddy[21625]: 2020/04/13 11:55:48 [ERROR] attempt 2: [*.mark.caddy.test.shinenelson.xyz] Obtain: [*.mark.caddy.test.shinenelson.xyz] acme: Error -> One or more domains had a problem:
Apr 13 11:55:48 caddy[21625]: [*.mark.caddy.test.shinenelson.xyz] [*.mark.caddy.test.shinenelson.xyz] acme: error presenting token: digitalocean: HTTP 401: unauthorized: Unable to authenticate you.
Apr 13 11:55:48 caddy[21625]:  - retrying in 2m0s (1m9.868857611s/720h0m0s elapsed)...

This is expected since the digitalocean_auth_token is not an actual auth_token.

  1. PATCH existing configuration with proper actual auth_token
    $ curl -X PATCH -H 'Content-Type: application/json' -d '{ "@id": "dns-challenge", "provider": "digitalocean", "auth_token": "actualdigitaloceanauthtoken" }' localhost:2019/id/dns-challenge
Log
Apr 13 11:56:02 caddy[21625]: {"level":"info","ts":1586778962.5778215,"logger":"admin.api","msg":"received request","method":"PATCH","host":"localhost:2019","uri":"/id/dns-challenge","remote_addr":"127.0.0.1:50872","headers":{"Accept":["*/*"],"Content-Length":["139"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Apr 13 11:56:02 caddy[21625]: {"level":"info","ts":1586778962.57843,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
Apr 13 11:56:02 caddy[21625]: {"level":"info","ts":1586778962.578948,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.lola.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.shine.caddy.test.shinenelson.xyz"]}
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [INFO][cache:0xc000674820] Stopped certificate maintenance routine
Apr 13 11:56:02 caddy[21625]: {"level":"info","ts":1586778962.57927,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Apr 13 11:56:02 caddy[21625]: {"level":"info","ts":1586778962.5793817,"logger":"admin","msg":"stopped previous server"}
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [INFO][cache:0xc0006ec960] Started certificate maintenance routine
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [ERROR] *.mark.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [ERROR] *.lola.caddy.test.shinenelson.xyz: obtaining certificate: context canceled
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Apr 13 11:56:02 caddy[21625]: 2020/04/13 11:56:02 [ERROR] *.shine.caddy.test.shinenelson.xyz: obtaining certificate: context canceled

This is where the issue is. Ideally, the new certificate maintenance routine should start a new obtaining certificate context. Instead, it just stays there.
The last time, I waited only for 1 minute. This time, I waited longer - around 5 minutes.

  1. Reload the server by PATCHing some other @id
    $ curl -X PATCH -H 'Content-Type: application/json' -d '{ "@id": "proxy_mark", "handler": "reverse_proxy", "upstreams": [ { "dial": "localhost" } ] }' localhost:2019/id/proxy_mark
Log
Apr 13 12:01:49 caddy[21625]: {"level":"info","ts":1586779309.4472044,"logger":"admin.api","msg":"received request","method":"POST","host":"localhost:2019","uri":"/id/proxy_mark","remote_addr":"127.0.0.1:50874","headers":{"Accept":["*/*"],"Content-Length":["94"],"Content-Type":["application/json"],"User-Agent":["curl/7.58.0"]}}
Apr 13 12:01:49 caddy[21625]: {"level":"info","ts":1586779309.4477863,"logger":"admin","msg":"admin endpoint started","address":"tcp/localhost:2019","enforce_origin":false,"origins":["localhost:2019","[::1]:2019","127.0.0.1:2019"]}
Apr 13 12:01:49 caddy[21625]: {"level":"info","ts":1586779309.4483356,"logger":"http","msg":"enabling automatic TLS certificate management","domains":["*.lola.caddy.test.shinenelson.xyz","*.mark.caddy.test.shinenelson.xyz","*.shine.caddy.test.shinenelson.xyz"]}
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][cache:0xc0006ec960] Stopped certificate maintenance routine
Apr 13 12:01:49 caddy[21625]: {"level":"info","ts":1586779309.448677,"msg":"autosaved config","file":"/var/lib/caddy/.config/caddy/autosave.json"}
Apr 13 12:01:49 caddy[21625]: {"level":"info","ts":1586779309.4487915,"logger":"admin","msg":"stopped previous server"}
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][cache:0xc000674730] Started certificate maintenance routine
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.lola.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.lola.caddy.test.shinenelson.xyz] Done waiting
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.mark.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain certificate; acquiring lock...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Lock acquired; proceeding...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.shine.caddy.test.shinenelson.xyz] Waiting on rate limiter...
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.shine.caddy.test.shinenelson.xyz] Done waiting
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO][*.mark.caddy.test.shinenelson.xyz] Done waiting
Apr 13 12:01:49 caddy[21625]: 2020/04/13 12:01:49 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Obtaining bundled SAN certificate given a CSR
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.mark.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49044935
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.shine.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49044936
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.lola.caddy.test.shinenelson.xyz] AuthURL: https://acme-staging-v02.api.letsencrypt.org/acme/authz-v3/49044937
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: use dns-01 solver
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Preparing to solve DNS-01
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Trying to solve DNS-01
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Checking DNS record propagation using [127.0.0.53:53]
Apr 13 12:01:51 caddy[21625]: 2020/04/13 12:01:51 [INFO] Wait for propagation [timeout: 1m0s, interval: 5s]
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.shine.caddy.test.shinenelson.xyz] The server validated our request
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.mark.caddy.test.shinenelson.xyz] The server validated our request
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.lola.caddy.test.shinenelson.xyz] The server validated our request
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Cleaning DNS-01 challenge
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.shine.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.mark.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Apr 13 12:01:53 caddy[21625]: 2020/04/13 12:01:53 [INFO] [*.lola.caddy.test.shinenelson.xyz] acme: Validations succeeded; requesting certificates
Apr 13 12:01:54 caddy[21625]: 2020/04/13 12:01:54 [INFO] [*.lola.caddy.test.shinenelson.xyz] Server responded with a certificate.
Apr 13 12:01:54 caddy[21625]: 2020/04/13 12:01:54 [INFO][*.lola.caddy.test.shinenelson.xyz] Certificate obtained successfully
Apr 13 12:01:54 caddy[21625]: 2020/04/13 12:01:54 [INFO][*.lola.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Apr 13 12:01:54 caddy[21625]: 2020/04/13 12:01:54 [INFO] [*.shine.caddy.test.shinenelson.xyz] Server responded with a certificate.
Apr 13 12:01:54 caddy[21625]: 2020/04/13 12:01:54 [INFO][*.shine.caddy.test.shinenelson.xyz] Certificate obtained successfully
Apr 13 12:01:54 caddy[21625]: 2020/04/13 12:01:54 [INFO][*.shine.caddy.test.shinenelson.xyz] Obtain: Releasing lock
Apr 13 12:01:55 caddy[21625]: 2020/04/13 12:01:55 [INFO] [*.mark.caddy.test.shinenelson.xyz] Server responded with a certificate.
Apr 13 12:01:55 caddy[21625]: 2020/04/13 12:01:55 [INFO][*.mark.caddy.test.shinenelson.xyz] Certificate obtained successfully
Apr 13 12:01:55 caddy[21625]: 2020/04/13 12:01:55 [INFO][*.mark.caddy.test.shinenelson.xyz] Obtain: Releasing lock

I'm not sure if this is intended behaviour, but I'd expect the above to happen right after the previous obtain certificate context is canceled in Step 2.


Sidenotes ( from previous comment )

PS : systemctl reload caddy would still not reload from the autosave.json in the default caddy.service unless the --config flag is removed.

Hmm, that doesn't sound right... how can I reproduce this? Having troubles seeing that.

This was because the --resume flag was missing in the service file and I had missed to look at the documentation for all parameters of the command line. I took the service file for granted.
The new README.md#L52 explains it clearly, but that file came only 5 days ago 🙈 I haven't tested with the flag yet but I'm guessing the flag was the culprit after all.

@francislavoie
Copy link
Member

francislavoie commented Apr 13, 2020

I know I messed up the issue template the last time ( unfortunately, it somehow doesn't exist in the repository or come up while creating a new issue.

Ah, interesting. I think the template is on the v1 branch, we'll need to add it to the new v2/master branch now that we migrated the default branch. Thanks for pointing that out

@mholt
Copy link
Member

mholt commented Apr 13, 2020

@shinenelson Thanks! That's much easier for me to follow -- I assume no log lines have been omitted.

By this:

This is where the issue is. Ideally, the new certificate maintenance routine should start a new obtaining certificate context. Instead, it just stays there.

What do you mean by "it just stays there" - and, separate question: what exactly is not working?

Obtain certificate; acquiring lock

That appears to be happening.

Edit: @francislavoie I think, maybe he is referring to the revised unit file that we worked on last week?

@francislavoie
Copy link
Member

Edit: @francislavoie I think, maybe he is referring to the revised unit file that we worked on last week?

Yes, he was using caddy.service instead of caddy-api.service I think?

@mholt
Copy link
Member

mholt commented Apr 13, 2020

Yes, I think so; caddy-api.service didn't exist at the time.

Edit: To clarify, @shinenelson, I'm still not quite sure that anything is wrong; I am trying to figure out what the disconnect is between what you expect and what reality is.

Did the wrong command / missing CLI flag resolve the problem then?

@shinenelson
Copy link
Author

I mentioned the service file again only because I mentioned it earlier ( to reply to a previous question for clarification ). This time around, I took that out of the equation while testing as requested. The service file does not have anything to do with the issue.


I assume no log lines have been omitted.

Other than a few GET requests to the /config/ endpoint to see what the running configuration was, no.


What do you mean by "it just stays there"

Obtain certificate; acquiring lock

That appears to be happening.

That acquiring lock happens if I trigger another reload of the server. If I didn't reload, the new context would probably never start. Let me try and put it in text for you :

  1. Start caddy server with initial running configuration ( I'm calling it rc1 ) > caddy server starts initial certificate maintenance routine ( cmr1 ) > obtain certificate context created ( occ1 ) > occ1 fails due to wrong auth_token > goes into retry mode.
  2. Update caddy server running configuration via API with actual auth_token > New configuration ( rc2 ) overwrites existing configuration ( rc1 ); i.e., rc2 overwrites rc1 > server reloads with rc2 ( occ1 is still in retry mode ) > cmr1 is stopped; new certificate maintenance routine starts ( cmr2 ) > sees occ1 and kills it ( obtain certificate: context canceled ) > nothing happens after this; until a new server reload is triggered.

separate question: what exactly is not working?

At this point, shouldn't there be another new obtain certificate context ( occ2 ) so that it retries obtaining the certificates again?

That doesn't seem to be happening which is what I'm pointing as the problem. That's what I meant by "it just stays there"

I do the next step only to test whether occ2 gets triggered ( and my certificates are generated ) but my workflow should ideally end here.

  1. Reload caddy server by updating running configuration via API with some other @id > new configuration rc3 overwrites rc2 > server reloads with rc3 ( there is no obtain certificate context any more because it was killed by cmr2 ) > cmr2 is stopped; new certificate maintenance routine starts ( cmr3 ) > finds that there are hosts without generated certificates and starts a new obtain certificate context ( occ2 ) > generates new certificates for domains ( since the correct auth_token has been available in both rc2 and rc3 ) > PASS!

My expectation was that occ2 should've triggered in Step 2. Instead, I had to reload the server ( Step 3 ) for it to get triggered.

@shinenelson
Copy link
Author

To summarize even further :

Step 1 : obtain certificate context fails due to lack of correct auth_token > goes into retry mode.
Step 2 : new certificate maintenance routine comes along and kills old obtain certificate context ( from Step 1 ), but does not start a new one, though it should, ideally.
Step 3 : reload server > new certificate maintenance routine triggers new obtain certificate context ( since there aren't any existing contexts ) > generates new certificates > PASS!

Ideally, my workflow should have ended in Step 2, but I didn't have any certificates for my hosts at that point; so, I had to go to step 3.

So, basically, what happens is that if a new certificate maintenance routine sees an old / existing obtain certificate context, it'll kill the old one, but does not start a new one, though it should, ideally. However, if there are no existing obtain certificate contexts and there are new hosts to obtain certificates; it'll happily go ahead and create new obtain certificate contexts

@mholt mholt reopened this Apr 14, 2020
@mholt
Copy link
Member

mholt commented Apr 14, 2020

Interesting... I think I see what you're talking about now. I have a hunch, I'll need more time to mull it over though and try some things.

If I'm right, the good news is that we're not leaking resources -- the bad news is that we're cleaning up too many resources. Either way, as you've noted, a practical workaround in the meantime is to trigger another config reload. You can do this by POSTing your config to the /load endpoint with Cache-Control: must-revalidate in the request header (which I just realized I forgot to document).

@mholt
Copy link
Member

mholt commented Apr 14, 2020

I was able to reproduce this. My hunch was correct.

I'll explain more in a follow-up. @shinenelson if I open a PR in the next few minutes, would you be able to test the CI artifact today? If so, it's a very minor change so it should be able to go out with 2.0.

@shinenelson
Copy link
Author

sure; it'll take me a bit to get my test instance ready. Let me go and get that instance up by the time you merge that change in.

mholt added a commit to caddyserver/certmagic that referenced this issue Apr 14, 2020
@mholt mholt closed this as completed in 2609a72 Apr 14, 2020
@mholt
Copy link
Member

mholt commented Apr 14, 2020

@shinenelson Okay, I just pushed it to master since it's just an update of go.mod. Fixed in 2609a72 -- I verified it repeatedly on my machine, please confirm when you can!

@mholt mholt added bug 🐞 Something isn't working and removed needs info 📭 Requires more information labels Apr 14, 2020
@mholt mholt added this to the v2.0.0 milestone Apr 14, 2020
@shinenelson
Copy link
Author

On it.

Side note : I don't know why I come up with issues that should be reported at caddyserver/certmagic and report them here. I'll try to remember that the certmagic repository also exists next time I have an issue with TLS certificates.

@mholt
Copy link
Member

mholt commented Apr 14, 2020

It's fine, this was a weird case that is caused by the interplay between the two. CertMagic would have been a more confusing place to report this bug, but the fix was much much easier in CertMagic.

@shinenelson
Copy link
Author

All good! I too tested it multiple times. I broke the DNS auth a couple of times to make sure it would still generate the new certificates. This is good to go 🚀

@mholt
Copy link
Member

mholt commented Apr 14, 2020

Great to hear it!

Here's what was happening:

When a Caddy config is loaded and started, it initializes all the apps, like the tls app which manages certificates. If a new config fails to load completely successfully, it is stopped and the old one keeps running. If it succeeds, it stops the old one instead. For a brief period, both configs are active. (Explained here)

Under the hood, CertMagic works on "obtain" and "renew" jobs asynchronously (in the background). When Caddy asks to obtain certificates, CertMagic creates a series of jobs, one for each name. Each job had a unique name/ID so that jobs wouldn't be duplicated. In other words, you don't need 3 jobs that all attempt to renew the same certificate. If a job with the same name was already queued, CertMagic would ignore the new job so as to not duplicate it and clog up the queue.

The problem is that Caddy calls ManageAsync() before shutting down the previous config. That's by design, but it was problematic because CertMagic would see the old config's "obtain" jobs of the same name and discard the new ones, not knowing that the old ones were about to be canceled! Hence, the second config load only canceled the obtain job from the first config, and did not actually start the new ones.

The change I made removes job de-duplication for "obtain" jobs -- i.e. gives them an empty name. This is because we now assume that whoever is calling ManageAsync() will not do so excessively, and/or they will cancel the context of any prior calls.

In other words, the fix is to allow multiple/overlapping "obtain" jobs, so for a brief time, yes there are two jobs that try to obtain the same certificate, but they are synced by a lock mechanism, and one of them is canceled quickly anyway.

So, I think this is a nice and simple solution to a tricky and obscure problem.

Thanks for your patience and diligence in getting it sorted out!

@shinenelson
Copy link
Author

That certainly was a weird and obscure problem. The quirks of having inter-connecting applications doing asynchronous calls - multiple points of failures.

I love the way you handled the fix. Like you said - it's a simple fix to a kind of complicated problem.

Thanks again.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug 🐞 Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants