Add water fix #29

Merged
merged 11 commits into from
Nov 11, 2019

Conversation

OnGle
Member

@OnGle OnGle commented Nov 6, 2019

A rather large fix to turnkeylinux/tracker#1360

Essentially the issue was as follows

  • dehydrated starts multiple instances of add-water (via the dehydrated hook script)
  • each add-water instance writes its PID file, overwriting the one written by any other running add-water instance
  • the last instance rarely has a correct PID file, so dehydrated-wrapper / the hook script cannot kill it
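
To illustrate the overwrite, here is a minimal Python sketch. It is not the add-water code: the file name `add-water.pid`, the helper names and the PIDs are all made up for the example. Several "instances" record their PID in one shared file without locking, and only one value survives on disk.

```python
import os
import tempfile


def write_pid_file(path, pid):
    """Write pid to path, replacing whatever was there (no locking)."""
    with open(path, "w") as fob:
        fob.write(str(pid))


def read_pid_file(path):
    """Return the single PID currently stored in path."""
    with open(path) as fob:
        return int(fob.read().strip())


def simulate_instances(pids):
    """Let each 'instance' record its PID in the same file.

    Returns the one PID left on disk afterwards.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pid_file = os.path.join(tmp, "add-water.pid")  # hypothetical path
        for pid in pids:
            write_pid_file(pid_file, pid)
        return read_pid_file(pid_file)


if __name__ == "__main__":
    started = [4101, 4102, 4103]  # made-up PIDs, for illustration only
    recorded = simulate_instances(started)
    untracked = [pid for pid in started if pid != recorded]
    print("recorded:", recorded, "untracked:", untracked)
```

Whichever instance writes last wins, so anything that relies on the PID file can signal at most one of the running processes.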

Fix:

  • separated add-water into server & client components
  • server component only runs once and does so in background
  • client can run many times and can pass tokens to be served to server component
  • client component is used by dehydrated hook script to serve multiple tokens with a single "http server"
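
A rough sketch of that split, for readers who have not looked at the diff. This is an assumption-laden illustration, not the code in this PR: the names `TokenStore`, `ControlHandler`, `start_server` and `send_token`, the socket path, and the one-line JSON message format are all invented here. The HTTP side is left out; it would just look tokens up in the same store.

```python
import json
import os
import socket
import socketserver
import tempfile
import threading


class TokenStore:
    """Thread-safe map of challenge token -> response body."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tokens = {}

    def add(self, token, body):
        with self._lock:
            self._tokens[token] = body

    def get(self, token):
        with self._lock:
            return self._tokens.get(token)


class ControlHandler(socketserver.StreamRequestHandler):
    """Reads one JSON line per connection and registers the token."""

    def handle(self):
        msg = json.loads(self.rfile.readline().decode())
        self.server.store.add(msg["token"], msg["body"])
        self.wfile.write(b"ok\n")


def start_server(sock_path, store):
    """Start the single long-lived server in a background thread."""
    server = socketserver.UnixStreamServer(sock_path, ControlHandler)
    server.store = store  # shared with the (omitted) HTTP handler
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def send_token(sock_path, token, body):
    """Client side: hand one token to the running server, return its reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(sock_path)
        payload = json.dumps({"token": token, "body": body})
        sock.sendall(payload.encode() + b"\n")
        with sock.makefile() as reply:
            return reply.readline().strip()


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    path = os.path.join(tmp, "add-water.sock")  # hypothetical socket path
    store = TokenStore()
    server = start_server(path, store)
    # The hook script would run the client once per domain.
    for token in ("token-a", "token-b"):
        print(token, send_token(path, token, token + ".thumbprint"))
    server.shutdown()
    server.server_close()
```

Because only the server ever binds the HTTP port, there is a single process to stop at the end, no matter how many times the client ran.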

OnGle added 10 commits November 4, 2019 22:46
- ensure only a single instance of add-water is serving http at any
  time but allow new tokens to be passed to it via a socket.

  - add-water-srv hosts any challenge handed to it via its local
    socket (see add-water.socket)

  - add-water-client essentially just passes tokens to add-water-srv

- should fix #1360 (where multiple instances of add-water interfered
  with each other)
@JedMeister
Member

Hey @OnGle - This looks great. However, when testing (with 5 domains) it's failing. Here's the log:

[2019-11-10 21:02:57] dehydrated-wrapper: WARNING: /etc/dehydrated/confconsole.config not found; copying default from /usr/share/confconsole/letsencrypt/dehydrated-confconsole.config
[2019-11-10 21:02:58] dehydrated-wrapper: WARNING: /etc/dehydrated/confconsole.hook.sh not found; copying default from /usr/share/confconsole/letsencrypt/dehydrated-confconsole.hook.sh
[2019-11-10 21:02:58] dehydrated-wrapper: WARNING: /etc/cron.daily/confconsole-dehydrated not found; copying default from /usr/share/confconsole/letsencrypt/dehydrated-confconsole.cron
ERROR: Challenge is invalid! (returned: invalid) (result: {
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:ietf:params:acme:error:unauthorized",
    "detail": "Invalid response from http://le-test01.jeremydavis.org/.well-known/acme-challenge/BfOaXsKWCTT3NUv6FeTCfuoj86J77qniqC8hU7CnIJo [13.211.163.42]: \"\\n    \u003c!DOCTYPE HTML PUBLIC \\\"-//IETF//DTD HTML 2.0//EN\\\"\u003e\\n    \u003chtml\u003e\\n        \u003chead\u003e\\n            \u003ctitle\u003eError: 500 Internal Server \"",
    "status": 403
  },
  "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/1178586201/btlBuA",
  "token": "BfOaXsKWCTT3NUv6FeTCfuoj86J77qniqC8hU7CnIJo",
  "validationRecord": [
    {
      "url": "http://le-test01.jeremydavis.org/.well-known/acme-challenge/BfOaXsKWCTT3NUv6FeTCfuoj86J77qniqC8hU7CnIJo",
      "hostname": "le-test01.jeremydavis.org",
      "port": "80",
      "addressesResolved": [
        "13.211.163.42"
      ],
      "addressUsed": "13.211.163.42"
    }
  ]
})
[2019-11-10 21:03:20] dehydrated-wrapper: FATAL: dehydrated exited with a non-zero exit code.
[2019-11-10 21:03:20] dehydrated-wrapper: WARNING: Python is still listening on port 80
[2019-11-10 21:03:20] dehydrated-wrapper: WARNING: Something went wrong, restoring original cert & key.
[2019-11-10 21:03:21] dehydrated-wrapper: WARNING: Check today's previous log entries for details of error.

Even though it fails, add-water is killed successfully, which is progress, but Apache wasn't restarted afterwards (well it was, but it failed). Here's the Apache status post running the confconsole dehydrated wrapper:

* apache2.service - The Apache HTTP Server
   Loaded: loaded (/lib/systemd/system/apache2.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Sun 2019-11-10 21:03:20 UTC; 1min 10s ago
  Process: 3202 ExecStop=/usr/sbin/apachectl stop (code=exited, status=0/SUCCESS)
  Process: 4126 ExecStart=/usr/sbin/apachectl start (code=exited, status=1/FAILURE)
 Main PID: 810 (code=exited, status=0/SUCCESS)

Nov 10 21:03:20 lamp apachectl[4126]: (98)Address already in use: AH00072: make_sock: could not bind to address [::]:80
Nov 10 21:03:20 lamp apachectl[4126]: (98)Address already in use: AH00072: make_sock: could not bind to address 0.0.0.0:80
Nov 10 21:03:20 lamp apachectl[4126]: no listening sockets available, shutting down
Nov 10 21:03:20 lamp apachectl[4126]: AH00015: Unable to open logs
Nov 10 21:03:20 lamp apachectl[4126]: Action 'start' failed.
Nov 10 21:03:20 lamp apachectl[4126]: The Apache error log may have more information.
Nov 10 21:03:20 lamp systemd[1]: apache2.service: Control process exited, code=exited status=1
Nov 10 21:03:20 lamp systemd[1]: Failed to start The Apache HTTP Server.
Nov 10 21:03:20 lamp systemd[1]: apache2.service: Unit entered failed state.
Nov 10 21:03:20 lamp systemd[1]: apache2.service: Failed with result 'exit-code'.

It restarts ok when I manually start it though (because at least add-water is killed successfully).

According to the log, it looks like add-water might be giving a 500 error. Although the weird thing is that when I retried it, it appears to succeed with the first domain, but then fails in the same way on the second:

[2019-11-10 21:17:05] dehydrated-wrapper: INFO: started
[2019-11-10 21:17:07] dehydrated-wrapper: INFO: found apache2 listening on port 80
[2019-11-10 21:17:07] dehydrated-wrapper: INFO: stopping apache2
[2019-11-10 21:17:07] dehydrated-wrapper: INFO: running dehydrated
ERROR: Challenge is invalid! (returned: invalid) (result: {
  "type": "http-01",
  "status": "invalid",
  "error": {
    "type": "urn:ietf:params:acme:error:unauthorized",
    "detail": "Invalid response from http://le-test02.jeremydavis.org/.well-known/acme-challenge/TodqOOhGkWRBDU8tAw1cc-oSuO5lfMLtoUgnLc_6m2U [13.211.163.42]: \"\\n    \u003c!DOCTYPE HTML PUBLIC \\\"-//IETF//DTD HTML 2.0//EN\\\"\u003e\\n    \u003chtml\u003e\\n        \u003chead\u003e\\n            \u003ctitle\u003eError: 500 Internal Server \"",
    "status": 403
  },
  "url": "https://acme-v02.api.letsencrypt.org/acme/chall-v3/1178586203/lBihbw",
  "token": "TodqOOhGkWRBDU8tAw1cc-oSuO5lfMLtoUgnLc_6m2U",
  "validationRecord": [
    {
      "url": "http://le-test02.jeremydavis.org/.well-known/acme-challenge/TodqOOhGkWRBDU8tAw1cc-oSuO5lfMLtoUgnLc_6m2U",
      "hostname": "le-test02.jeremydavis.org",
      "port": "80",
      "addressesResolved": [
        "13.211.163.42"
      ],
      "addressUsed": "13.211.163.42"
    }
  ]
})
[2019-11-10 21:17:31] dehydrated-wrapper: FATAL: dehydrated exited with a non-zero exit code.
[2019-11-10 21:17:31] dehydrated-wrapper: WARNING: Python is still listening on port 80
[2019-11-10 21:17:31] dehydrated-wrapper: INFO: attempting to kill add-water server
[2019-11-10 21:17:31] dehydrated-wrapper: WARNING: Something went wrong, restoring original cert & key.
[2019-11-10 21:17:31] dehydrated-wrapper: INFO: starting apache2
Job for apache2.service failed because the control process exited with error code.
See "systemctl status apache2.service" and "journalctl -xe" for details.
[2019-11-10 21:17:31] dehydrated-wrapper: INFO: starting stunnel4
[2019-11-10 21:17:31] dehydrated-wrapper: WARNING: Check today's previous log entries for details of error.

Again Apache fails to start (I assume because add-water is still using port 80).

So it seems to me that there are still two issues:

  1. When add-water serves the challenges, it's either not serving them quite right, or perhaps not fast enough (or too fast?).
  2. When it tries to restart the webserver, add-water still hasn't released port 80 yet.

Also FWIW, I've added the CA and CA_TERMS to the config file. TBH, I'm not sure that they are required, but added them for good measure. Note in the above output, I hadn't added them in the first run and yet it used the v2 API endpoint (suggesting that at least CA isn't required), but then I did add them prior to the 2nd (in an effort to be sure everything was in place).

@OnGle
Member Author

OnGle commented Nov 11, 2019

Hmk, cheers for all the tests. I'll look into this today.

@OnGle
Member Author

OnGle commented Nov 11, 2019

Issue regarding not serving challenges correctly was just a missing import. That portion should now work.

@JedMeister
Member

JedMeister commented Nov 11, 2019

Awesome thanks. I'm pretty sure that the issue with Apache not being restarted was because the dehydrated-wrapper was still trying to stop add-water via the pid file (in the clean_finish() function). I've fixed that (I think - it's untested) and have rebuilt the package (including your fix) and uploaded to the test server.

@OnGle
Member Author

OnGle commented Nov 11, 2019

Alright I tested that package and I'm pretty sure we're good to go. No errors of any kind.

Clean test (without apache) - success
Test with already registered domains (without apache) - success
Test without registering at all (without apache) - success
Clean test (with apache) - success (apache restarted)
Test with already registered domains (with apache) - success (apache restarted)
Test without registering at all (with apache) - success (apache restarted)

@JedMeister
Member

Sweet! I'll merge that all in now and we should be good to go for the add-water/Let's Encrypt fixed Confconsole package - yay! 🎉

Then I'll branch the code out to a 15.x branch and merge #28 into master. Getting close... 😄

@JedMeister JedMeister merged commit e1b249d into turnkeylinux:master Nov 11, 2019
@JedMeister JedMeister mentioned this pull request Nov 11, 2019