This repository has been archived by the owner on Dec 31, 2022. It is now read-only.

SSL Error With Nginx Key Download #268

Closed
l33z3r opened this issue Oct 1, 2021 · 14 comments

Comments

@l33z3r

l33z3r commented Oct 1, 2021

Hi guys, I'm seeing a new error when starting a new server with opsworks_ruby.

Seems the recipe is having trouble connecting to the nginx.org site in order to download the key:

[2021-10-01T11:53:51+00:00] ERROR: SSL Validation failure connecting to host: nginx.org - SSL_connect returned=1 errno=0 state=error: certificate verify failed

Is anybody seeing this behaviour?

@willkoehler

willkoehler commented Oct 1, 2021

TLDR; @jamesbjackson has the right approach #268 (comment)

Put this recipe at the top of your Setup (for Ubuntu 18.04; not sure if it will work on all distros)

remote_file "Copy more recent root certificate into Chef" do
  path "/opt/chef/embedded/ssl/certs/cacert.pem"
  source "file:///etc/ssl/certs/ca-certificates.crt"
  owner 'root'
  group 'root'
  mode 0644
end

Previously I wrote:

I'm seeing this too and was about to open my own issue.

The problem is the SSL cert used by nginx.org was updated in a way that's not compatible with OpenSSL in Ruby 2.3.6, which is embedded in Chef 12.

You can reproduce this error and find the root cause like this:

ubuntu@rails1:~$ cd /opt/chef/embedded/bin
ubuntu@rails1:/opt/chef/embedded/bin$ ./irb
irb(main):001:0> require 'net/http'
=> true
irb(main):002:0> Net::HTTP.get(URI.parse("https://nginx.org/keys/nginx_signing.key"))
OpenSSL::SSL::SSLError: SSL_connect returned=1 errno=0 state=error: certificate verify failed
	from /opt/chef/embedded/lib/ruby/2.3.0/net/protocol.rb:44:in `connect_nonblock'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/protocol.rb:44:in `ssl_socket_connect'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/http.rb:928:in `connect'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/http.rb:863:in `do_start'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/http.rb:852:in `start'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/http.rb:584:in `start'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/http.rb:479:in `get_response'
	from /opt/chef/embedded/lib/ruby/2.3.0/net/http.rb:456:in `get'
	from (irb):2
	from ./irb:11:in `<main>'
irb(main):003:0>

Note this works fine in Ruby 2.7.4. It also works if you run curl -v https://nginx.org/keys/nginx_signing.key on the command line. It's only a problem in the unsupported version of Ruby embedded in Chef 12 🤦‍♂️

The work-around I found is to set the SSL_CERT_FILE environment variable to a good root certificate file (for example /etc/ssl/certs/ca-certificates.crt on Ubuntu 18.04).

Example:

ubuntu@rails1:/opt/chef/embedded/bin$ ./irb
irb(main):001:0> ENV['SSL_CERT_FILE']="/etc/ssl/certs/ca-certificates.crt"
=> "/etc/ssl/certs/ca-certificates.crt"
irb(main):002:0> require 'net/http'
=> true
irb(main):003:0> Net::HTTP.get(URI.parse("https://nginx.org/keys/nginx_signing.key"))
=> "-----BEGIN PGP PUBLIC KEY BLOCK-----\nVersion: GnuPG v2.0.22 (GNU/Linux)\n\nmQENBE5OMmIBCAD+FPYKGriGGf7NqwKfWC83cBV01gabgVWQmZbMcFzeW+hMsgxH\nW6iimD0RsfZ9oEbfJCPG0CRSZ7ppq5pKamYs2+EJ8Q2ysOFHHwpGrA2C8zyNAs4I\nQxnZZIbETgcSwFtDun0XiqPwPZgyuXVm9PAbLZRbfBzm8wR/3SWygqZBBLdQk5TE\nfDR+Eny/M1RVR4xClECONF9UBB2ejFdI1LD45APbP2hsN/piFByU1t7yK2gpFyRt\n97WzGHn9MV5/TL7AmRPM4pcr3JacmtCnxXeCZ8nLqedoSuHFuhwyDnlAbu8I16O5\nXRrfzhrHRJFM1JnIiGmzZi6zBvH0ItfyX6ttABEBAAG0KW5naW54IHNpZ25pbmcg\na2V5IDxzaWduaW5nLWtleUBuZ2lueC5jb20+iQE+BBMBAgAoAhsDBgsJCAcDAgYV\nCAIJCgsEFgIDAQIeAQIXgAUCV2K1+AUJGB4fQQAKCRCr9b2Ce9m/YloaB/9XGrol\nkocm7l/tsVjaBQCteXKuwsm4XhCuAQ6YAwA1L1UheGOG/aa2xJvrXE8X32tgcTjr\nKoYoXWcdxaFjlXGTt6jV85qRguUzvMOxxSEM2Dn115etN9piPl0Zz+4rkx8+2vJG\nF+eMlruPXg/zd88NvyLq5gGHEsFRBMVufYmHtNfcp4okC1klWiRIRSdp4QY1wdrN\n1O+/oCTl8Bzy6hcHjLIq3aoumcLxMjtBoclc/5OTioLDwSDfVx7rWyfRhcBzVbwD\noe/PD08AoAA6fxXvWjSxy+dGhEaXoTHjkCbz/l6NxrK3JFyauDgU4K4MytsZ1HDi\nMgMW8hZXxszoICTTiQEcBBABAgAGBQJOTkelAAoJEKZP1bF62zmo79oH/1XDb29S\nYtWp+MTJTPFEwlWRiyRuDXy3wBd/BpwBRIWfWzMs1gnCjNjk0EVBVGa2grvy9Jtx\nJKMd6l/PWXVucSt+U/+GO8rBkw14SdhqxaS2l14v6gyMeUrSbY3XfToGfwHC4sa/\nThn8X4jFaQ2XN5dAIzJGU1s5JA0tjEzUwCnmrKmyMlXZaoQVrmORGjCuH0I0aAFk\nRS0UtnB9HPpxhGVbs24xXZQnZDNbUQeulFxS4uP3OLDBAeCHl+v4t/uotIad8v6J\nSO93vc1evIje6lguE81HHmJn9noxPItvOvSMb2yPsE8mH4cJHRTFNSEhPW6ghmlf\nWa9ZwiVX5igxcvaIRgQQEQIABgUCTk5b0gAKCRDs8OkLLBcgg1G+AKCnacLb/+W6\ncflirUIExgZdUJqoogCeNPVwXiHEIVqithAM1pdY/gcaQZmIRgQQEQIABgUCTk5f\nYQAKCRCpN2E5pSTFPnNWAJ9gUozyiS+9jf2rJvqmJSeWuCgVRwCcCUFhXRCpQO2Y\nVa3l3WuB+rgKjsQ=\n=EWWI\n-----END PGP PUBLIC KEY BLOCK-----\n"
irb(main):004:0>

🥳🥳🥳 WORKS!
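If changing the environment of the whole process is undesirable, the same effect can probably be had per connection. A minimal sketch, assuming the Ubuntu bundle path from the example above; `build_verified_http` is a hypothetical helper name, not part of Chef or opsworks_ruby:

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: builds a Net::HTTP client that verifies the server
# against an explicit CA bundle instead of the bundle embedded in Chef.
# The default path is the Ubuntu system bundle mentioned above (an
# assumption on other distros).
def build_verified_http(url, ca_file = '/etc/ssl/certs/ca-certificates.crt')
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  http.ca_file = ca_file
  http
end

# Usage (performs a real network request, so it is left commented out):
# http = build_verified_http('https://nginx.org/keys/nginx_signing.key')
# puts http.get('/keys/nginx_signing.key').code
```

Unlike setting ENV['SSL_CERT_FILE'], this only affects the one client object, so it does not help code inside Chef that builds its own connections.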

My "very monkey-patched midnight crisis mode" fix to get production running last night was to edit /opt/chef/embedded/lib/ruby/gems/2.3.0/gems/chef-12.22.5/lib/chef/http.rb and add ENV['SSL_CERT_FILE']="/etc/ssl/certs/ca-certificates.crt" after the requires

require "tempfile"
require "net/https"
require "uri"
require "chef/http/basic_client"
require "chef/monkey_patches/net_http"
require "chef/config"
require "chef/platform/query_helpers"
require "chef/exceptions"

ENV['SSL_CERT_FILE']="/etc/ssl/certs/ca-certificates.crt" # <=== ADD THIS

class Chef

  # == Chef::HTTP
  # Basic HTTP client, with support for adding features via middleware
  class HTTP
. 
.
.

I'm working on a better fix and I'll update shortly. In the meantime, I wanted to get this info out there.

Editorial comment: TBH this feels like the final straw for me on OpsWorks. My (admittedly raw) feelings last night https://twitter.com/wckoehler/status/1443713784670003205

EDIT: Additional context. OpsWorks has been a great tool and I'm super grateful for all the effort people like @ajgon have put into keeping it going. I have some use cases that will be hard to deploy any other way (like a system that processes large videos with ffmpeg). But the current situation with OpsWorks feels untenable to me. There are too many things out of my control.

@olbrich
Contributor

olbrich commented Oct 1, 2021

This is related to chef/chef#12126. TL;DR: the Let's Encrypt root cert expired yesterday, and it looks like many (possibly all) versions of Chef aren't handling it well.
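To see whether a given bundle is affected, the expiry dates inside it can be listed. A minimal sketch using Ruby's standard openssl library; `expired_certificates` and `PEM_CERTIFICATE` are hypothetical names introduced here, and the bundle path in the usage comment is the one discussed in this thread:

```ruby
require 'openssl'

# Matches each PEM-encoded certificate inside a concatenated CA bundle.
PEM_CERTIFICATE = /-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----/m

# Hypothetical helper: returns the certificates in +bundle_text+ whose
# validity period ended before +now+.
def expired_certificates(bundle_text, now = Time.now)
  bundle_text.scan(PEM_CERTIFICATE)
             .map { |pem| OpenSSL::X509::Certificate.new(pem) }
             .select { |cert| cert.not_after < now }
end

# Usage on a Chef 12 node:
# bundle = File.read('/opt/chef/embedded/ssl/certs/cacert.pem')
# expired_certificates(bundle).each { |c| puts "#{c.not_after}  #{c.subject}" }
```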

@cschulte22

You can fix this by setting the environment variable in the nginx repo.rb recipe. In our case, all the dependent cookbooks live in the custom Chef GitHub repo that we use in the OpsWorks stack settings. In nginx/recipes/repo.rb, just add this line at the top:

ENV['SSL_CERT_FILE']="/etc/ssl/certs/ca-certificates.crt"

@jamesbjackson

This has worked for us as a clean approach during server bring-up (tested on Ubuntu 20.04 LTS minimal), in case it helps anyone.

apt-get install -y ca-certificates
update-ca-certificates --verbose --fresh
(Install Chef Client)
cp /etc/ssl/certs/ca-certificates.crt /opt/chef/embedded/ssl/certs/cacert.pem

@olbrich
Contributor

olbrich commented Oct 1, 2021

I ended up adding

link '/opt/chef/embedded/ssl/certs/cacert.pem' do
  to '/etc/ssl/certs/ca-certificates.crt'
end

As early in my cookbook as I could, and this seems to have fixed stuff for us. YMMV.
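For anyone who needs the same effect outside a Chef run (for example from a bootstrap script), the link step can be sketched in plain Ruby. `link_ca_bundle` is a hypothetical helper, and the `.orig` backup is an addition not present in the recipe above:

```ruby
require 'fileutils'

# Hypothetical helper mirroring the `link` resource above in plain Ruby:
# keeps a one-time backup of Chef's embedded bundle, then points the
# embedded path at the system bundle. Safe to run repeatedly.
def link_ca_bundle(embedded, system_bundle)
  backup = "#{embedded}.orig"
  if File.file?(embedded) && !File.symlink?(embedded) && !File.exist?(backup)
    FileUtils.cp(embedded, backup)
  end
  FileUtils.ln_sf(system_bundle, embedded)
  File.readlink(embedded)
end

# Usage (requires root on a real node):
# link_ca_bundle('/opt/chef/embedded/ssl/certs/cacert.pem',
#                '/etc/ssl/certs/ca-certificates.crt')
```

A symlink, unlike a copy, keeps following the system bundle when update-ca-certificates regenerates it later.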

@clemblanco

clemblanco commented Oct 4, 2021

This can be fixed manually on an already provisioned server by running

sudo apt-get install -y ca-certificates \
    && sudo update-ca-certificates --verbose --fresh \
    && sudo cp /etc/ssl/certs/ca-certificates.crt /opt/chef/embedded/ssl/certs/cacert.pem

And this is what I had to do in my cookbook

# Prevent Chef from using outdated/distrusted CA certificates
# https://github.com/chef/chef/issues/12126
package 'ca-certificates' do
  action :purge
end
package 'ca-certificates' do
  action :install
end
execute 'Updating CA certificates...' do
  command 'update-ca-certificates --verbose --fresh'
end
link '/opt/chef/embedded/ssl/certs/cacert.pem' do
  to '/etc/ssl/certs/ca-certificates.crt'
end

Source: chef/chef#12126

@jhirn

jhirn commented Oct 4, 2021

I ended up adding

link '/opt/chef/embedded/ssl/certs/cacert.pem' do
  to '/etc/ssl/certs/ca-certificates.crt'
end

As early in my cookbook as I could, and this seems to have fixed stuff for us. YMMV.

Hello @olbrich! Forgive the silly question. Trying to put out a major inherited fire here and new to Opsworks, very outdated on my Chef skills.

When you say as early in your cookbook as you could, where exactly are you adding that link? Should it be in the ./recipes/[configure|default|deploy|etc..] files? Any or all of them? Thanks so much for the help if you can!

@willkoehler

willkoehler commented Oct 4, 2021

Hello @olbrich! Forgive the silly question. Trying to put out a major inherited fire here and new to Opsworks, very outdated on my Chef skills.

When you say as early in your cookbook as you could, where exactly are you adding that link? Should it be in the ./recipes/[configure|default|deploy|etc..] files? Any or all of them? Thanks so much for the help if you can!

@jhirn I created a recipe called chef12_ssl_fix containing

link '/opt/chef/embedded/ssl/certs/cacert.pem' do
  to '/etc/ssl/certs/ca-certificates.crt'
end

and added it as the first step of my Setup (see screenshot). Good luck with the 🔥🤞

Screen Shot 2021-10-04 at 1 47 49 PM

@jhirn

jhirn commented Oct 4, 2021

@olbrich 🙇‍♂️

Testing this out now. Will report back but thank you so much this is very helpful!!!

Edit: whoops forgot this @willkoehler 🙇‍♂️ 😉

Update: It worked by simply adding the link to the top of the existing recipes. Not the cleanest, but we're in the process of upgrading our process so I'm not too worried about it.

Thx everyone!

@DevOpsInspirisys

What needs to be added for Windows instances?

@ajgon
Owner

ajgon commented Oct 8, 2021

Thank you very much for investigating this. I'll add proper info to the Troubleshoot section of the README.

Edit: Or even better, I'll add it to opsworks_ruby as a configurable option.

@ajgon ajgon closed this as completed Oct 8, 2021
@ajgon ajgon reopened this Oct 8, 2021
ajgon added a commit that referenced this issue Oct 8, 2021
AWS OpsWorks uses an old and long-deprecated version of Chef
(Chef 12). This version ships root CA lists that expired long
ago, causing most of the scripts to fail. To mitigate this, a new
configuration param `node['patches']['chef12_ssl_fix']` is introduced
to include more recent lists.

Fixes #268

BREAKING CHANGE: By default new list of SSL certificates is used.

It should not affect any of your current deployments, but if you start
seeing SSL errors, the first thing you should check, is disabling
`node['patches']['chef12_ssl_fix']` option.

See #268 for more
information.
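
For illustration, overriding the option from the stack's custom JSON might look like the output of the sketch below. Only the attribute path `node['patches']['chef12_ssl_fix']` comes from the commit message; that it is a plain boolean, and that custom JSON is the place to set it, are assumptions:

```ruby
require 'json'

# Assumption: the option is a boolean node attribute that can be overridden
# through the stack's custom JSON. Only the attribute path is taken from the
# commit message; the value type is a guess.
custom_json = { 'patches' => { 'chef12_ssl_fix' => false } }

puts JSON.pretty_generate(custom_json)
```
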
@ajgon ajgon closed this as completed in 4887cf5 Oct 8, 2021
@KIVagant

KIVagant commented Oct 15, 2021

It is very sad that they have closed the original issue, leaving people on their own with the problem. I spent hours and hours trying different combinations of commands, debugging the issue in Ruby, diving into the beautiful world of Ruby and OpenSSL setup in Chef, and, after all:

Here's the working solution for Ubuntu 16.04.7 LTS Xenial and OpsWorks

Disclaimer: none of this is a real fix; it can only be used as a workaround. The only stable solution is to upgrade to one of the latest supported versions of Ubuntu, where the bug has been fixed. Trusty and Xenial aren't on that list, so ca-certificates is old there and will cause the OpenSSL issue again and again.

shared-cookbook/recipes/chef_cert_fix.rb

Run the recipe before other cookbooks or include it in a cookbook that may need it. I personally call it during AMI builds as a separate provisioner before running the main cookbook for the AMI.

case node['platform']
when 'ubuntu'
  %w(ca-certificates libgnutls-openssl27).each do |pkg|
    package pkg do
      action :nothing
    end.run_action(:install) # run at compile time
  end

  bash 'update-ca-certificates' do
    chef_ca_cert_path = '/opt/chef/embedded/ssl/certs/cacert.pem'
    code <<-EOC
      # Use '|' as the sed delimiter: an unquoted Ruby heredoc drops the backslash of an escaped slash
      sed -i 's|^mozilla/DST_Root_CA_X3.crt|!mozilla/DST_Root_CA_X3.crt|' /etc/ca-certificates.conf
      update-ca-certificates --fresh    # `ls -al /etc/ssl/certs/ | grep DST_Root_CA_X3` must return nothing!
      cp -r /etc/ssl/certs/ca-certificates.crt #{chef_ca_cert_path} && chmod 0644 #{chef_ca_cert_path}

      # Now let's allow Chef client to continue communicating with OpsWorks after the certs substitution. 
      # In this example I assume that you somehow preconfigured `~/.chef/knife.rb`  
      # and Knife knows how to contact your OpsWorks server

      knife ssl fetch && knife ssl check
      cat /root/.chef/ca_certs/AWS_OpsWorks_Root_CA.crt >> #{chef_ca_cert_path}

      # You can replace the `knife` command output with a static file(s) provided by your CI/CD system:
      # cat /etc/chef/trusted_certs/*.pem >> #{chef_ca_cert_path}

      EOC
    returns 0
    group 'root'
    user 'root'
    action :nothing
  end.run_action(:run) # run at compile time

end
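
The edit to /etc/ca-certificates.conf in the recipe above can also be expressed in plain Ruby, which makes it easy to check that running it twice does not change the file again. A minimal sketch; `deselect_certificate` is a hypothetical helper, and reading and writing the file is left to the caller:

```ruby
# Hypothetical helper: returns +conf_text+ with the entry for +cert+
# deselected. update-ca-certificates skips lines that start with "!".
# Lines that are already deselected are left alone, so repeated runs
# do not stack up extra "!" characters.
def deselect_certificate(conf_text, cert = 'mozilla/DST_Root_CA_X3.crt')
  conf_text.each_line.map do |line|
    line.chomp == cert ? "!#{line}" : line
  end.join
end

# Usage (requires root on a real node):
# path = '/etc/ca-certificates.conf'
# File.write(path, deselect_certificate(File.read(path)))
```
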

shared-cookbook/recipes/chef_cert_fix_trigger.rb

This recipe is an additional trick that extends the main chef_cert_fix.rb. It can be included several times, between other steps of a complicated cookbook, when some intermediate step updates ca-certificates and restores the broken CA cert despite our attempt to delete it.

#
# The file contains an additional step for chef_cert_fix that checks if anything else
# has restored the mozilla/DST_Root_CA_X3.crt certificate. If the file exists, the recipe
# deletes it and triggers auth-refreshment for Chef certificates located in chef_cert_fix.rb:bash[update-ca-certificates]
#
case node['platform']
when 'ubuntu'
  file '/usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt' do
    action :delete
    only_if { File.exist? '/usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt' }
    notifies :run, 'bash[update-ca-certificates]'
  end.run_action(:delete) # run at compile time
end

To use it, include them like this:

somecookbook/recipes/some-action.rb

include_recipe 'shared-cookbook::chef_cert_fix' # include the main recipe before everything
include_recipe 'somecookbook::some-other-action' # assuming that something inside restores the broken CA cert
include_recipe 'shared-cookbook::chef_cert_fix_trigger' # include the additional trick to make sure the broken cert is deleted again 
include_recipe 'somecookbook::yet-another-step' # a step that used to be complaining about the outdated cert

Unfortunately, even this solution is limited, and there's a risk that the broken cert may be reinstalled at a moment Chef cannot control. The only advice here is to upgrade Ubuntu itself to the latest version.

Other tips

When you connect to the machine (an AMI builder or an instance controlled by Chef where the cookbook is running), you can try debugging SSL issues using the commands:

/opt/chef/embedded/bin/irb

And inside the Ruby shell:

require 'net/http'
Net::HTTP.get(URI.parse("https://YOUR-OPSWORKS-URL.AWS-REGION.opsworks-cm.io:443"))
Net::HTTP.get(URI.parse("https://pkg.jenkins.io/debian-stable/jenkins.io.key"))

Both commands must succeed. If they do not, try a tool such as doctor.rb to check what's going on with the certificates:

SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt \
  /opt/chef/embedded/bin/ruby \
  doctor.rb \
  pkg.jenkins.io:443
SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt \
  /opt/chef/embedded/bin/ruby \
  doctor.rb \
  YOUR-OPSWORKS-URL.AWS-REGION.opsworks-cm.io:443

# COMPARE WITH:

SSL_CERT_FILE=/opt/chef/embedded/ssl/certs/cacert.pem \
  /opt/chef/embedded/bin/ruby \
  doctor.rb \
  YOUR-OPSWORKS-URL.AWS-REGION.opsworks-cm.io:443
SSL_CERT_FILE=/opt/chef/embedded/ssl/certs/cacert.pem \
  /opt/chef/embedded/bin/ruby \
  doctor.rb \
  pkg.jenkins.io:443
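
If doctor.rb is not at hand, a similar comparison between the two bundles can be sketched with Ruby's standard openssl library. `verify_against_bundle` is a hypothetical helper, and `server.pem` in the usage comment is a placeholder for a certificate you saved yourself:

```ruby
require 'openssl'

# Hypothetical helper: checks whether +cert+ (plus optional intermediate
# +chain+) is trusted by the CA bundle at +bundle_path+ at time +at+.
# Returns [true, "ok"] on success or [false, reason] on failure.
def verify_against_bundle(cert, bundle_path, chain = [], at = Time.now)
  store = OpenSSL::X509::Store.new
  store.add_file(bundle_path)
  store.time = at
  ok = store.verify(cert, chain)
  [ok, store.error_string]
end

# Usage: compare the system bundle with the one embedded in Chef.
# cert = OpenSSL::X509::Certificate.new(File.read('server.pem'))
# %w[/etc/ssl/certs/ca-certificates.crt
#    /opt/chef/embedded/ssl/certs/cacert.pem].each do |bundle|
#   p [bundle, verify_against_bundle(cert, bundle)]
# end
```
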

P.S.: All credit goes to the people who published their recommendations, comments, tools etc. in different places, including:

@clemblanco

clemblanco commented Oct 16, 2021 via email

@KIVagant

KIVagant commented Oct 16, 2021

@clemblanco, we manage cookbooks differently for historical reasons. When we start an instance, a cookbook run-list is already mentioned in UserData in the launch command. Also, in my specific case, I faced the issue during an AMI build, where Packer starts Chef Solo and the recipe is added to run_list as a separate provisioning step before the main cookbook launch. Doing this I can guarantee that the recipe will be applied and Chef Solo will start again after, picking up all of the settings (this is not necessary, though, just in case).

You can probably try doing something like @willkoehler mentioned in this comment above. Or, simply include the recipe into a cookbook:

current_cookbook/metadata.rb

depends 'shared_cookbook'

current_cookbook/recipes/main.rb

include_recipe 'shared_cookbook::chef_cert_fix'
include_recipe 'current_cookbook::some_action'

Also, I cannot yet say that the issue is completely gone. I still see something restoring ca-certificates with the outdated cert inside, so I'm working on yet another improvement for this recipe.

Update: I also added one more recipe, chef_cert_fix_trigger.rb, in the comment above. It always checks whether /usr/share/ca-certificates/mozilla/DST_Root_CA_X3.crt exists on the file system and, if it does, refreshes the certs, in case anything re-installed the outdated and unsupported Xenial/Trusty ca-certificates with that config and cert again.

dotnofoolin pushed a commit to dotnofoolin/opsworks_ruby that referenced this issue Nov 23, 2021