[Possible BUG] Docker: Non Functional Certificate OCSP Certificate Revocation Check #2667

Closed
lwillek opened this issue Feb 22, 2025 · 13 comments · Fixed by #2698
Labels
3.0 old branch 3.2 upcoming release bug:minor

Comments

@lwillek

lwillek commented Feb 22, 2025

Hello @drwetter

I think I found a possible bug in the certificate revocation status check when using Docker. The OCSP check did not succeed for me as expected, resulting in incorrect output and too high a rating when a certificate has actually been revoked.

To exclude any issues with my own certificates or environment, I ended up using the revoked example from badssl.com and tested with the Docker images you provide on Docker Hub. I could reproduce the issues I had seen in my local environment.

It is worth pointing out that if I git clone testssl.sh and run it locally, everything is fine and the correct results are shown.

Short Issue Description

  • I expect that when a certificate is actually revoked, the OCSP check fails with a critical status, indicates the "revoked" state, and caps the overall rating (grade) at T.
  • However, when using testssl.sh version 3.0 or 3.2, the OCSP check produces only a warning, stating "empty OCSP response."

Expected Behavior: Version 3.1dev

The expected behavior is seen when using the older drwetter/testssl.sh:3.1dev Docker container:

$ docker run --rm -t -v $(pwd):/d drwetter/testssl.sh:3.1dev --quiet --wide --color 3 --full --phone-out --jsonfile-pretty /d/o.json --severity CRITICAL revoked.badssl.com

[output shortened]
 OCSP URI                     http://e6.o.lencr.org, revoked
[output shortened]
 Overall Grade                T
 Grade cap reasons            Grade capped to T. Certificate revoked

$ jq '.scanResult[] | with_entries(select(.value | length > 0))' o.json
{
  "targetHost": "revoked.badssl.com",
  "ip": "104.154.89.105",
  "port": "443",
  "rDNS": "105.89.154.104.bc.googleusercontent.com.",
  "service": "HTTP",
  "serverDefaults": [
    {
      "id": "cert_ocspRevoked",
      "severity": "CRITICAL",
      "finding": "revoked"
    }
  ],
  "rating": [
    {
      "id": "overall_grade",
      "severity": "CRITICAL",
      "finding": "T"
    }
  ]
}

Actual Behavior: Version 3.2

In addition to the incorrect result, this version throws a "Segmentation fault".

$ docker run --rm -t -v $(pwd):/d drwetter/testssl.sh:3.2 --quiet --wide --color 3 --full --phone-out --jsonfile-pretty /d/o.json --severity CRITICAL revoked.badssl.com

[output shortened]
 OCSP URI                     http://e6.o.lencr.org/usr/local/bin/testssl.sh: line 2044: 20998 Segmentation fault      $OPENSSL ocsp -no_nonce ${host_header} -url "$uri" -issuer $TEMPDIR/hostcert_issuer.pem -verify_other $TEMPDIR/intermediatecerts.pem -CAfile <(cat $ADDTL_CA_FILES "$GOOD_CA_BUNDLE") -cert $HOSTCERT -text &> "$tmpfile"
, error querying OCSP responder (empty ocsp response)

[output shortened]

 Cipher Strength  (weighted)  90 (36)
 Final Score                  94
 Overall Grade                B


$ jq '.scanResult[] | with_entries(select(.value | length > 0))' o.json
{
  "targetHost": "revoked.badssl.com",
  "ip": "104.154.89.105",
  "port": "443",
  "rDNS": "105.89.154.104.bc.googleusercontent.com.",
  "service": "HTTP",
  "serverDefaults": [
    {
      "id": "cert_ocspRevoked",
      "severity": "WARN",
      "finding": "empty ocsp response"
    }
  ]
}

Actual Behavior: Version 3.0

I could reproduce the issue there as well:

$ docker run --rm -t -v $(pwd):/d drwetter/testssl.sh:3.0 --quiet --wide --color 3 --full --phone-out --jsonfile-pretty /d/o.json --severity CRITICAL revoked.badssl.com

[output shortened]
 OCSP URI                     http://e6.o.lencr.org, error querying OCSP responder

$ jq '.scanResult[] | with_entries(select(.value | length > 0))' o.json
{
  "targetHost": "revoked.badssl.com",
  "ip": "104.154.89.105",
  "port": "443",
  "rDNS": "105.89.154.104.bc.googleusercontent.com.",
  "service": "HTTP",
  "serverDefaults": [
    {
      "id": "cert_ocspRevoked",
      "severity": "WARN",
      "finding": ""
    }
  ]
}

Additional Information

$ docker images -f "reference=drwetter/testssl.sh"
REPOSITORY            TAG       IMAGE ID       CREATED         SIZE
drwetter/testssl.sh   3.2       b61c7e88d2e9   2 days ago      60.3MB
drwetter/testssl.sh   3.0       9ecedd099c64   2 days ago      42.7MB
drwetter/testssl.sh   3.1dev    08f3f6548eef   19 months ago   55.5MB

This is how I tested manually. The only reason to use the Docker image here is to exclude any local configuration issues on my end.

$ docker run -it --entrypoint /bin/bash drwetter/testssl.sh:3.2

bash-4.4$ # Get the certificates and store them as separate files in /tmp
bash-4.4$ echo | openssl s_client -connect revoked.badssl.com:443 -showcerts 2>/dev/null | sed -n '/-----BEGIN/,/-----END/p' | awk '/-----BEGIN/{f="cert"++i".pem"} {print > "/tmp/"f}'
bash-4.4$ ls /tmp/cert*
/tmp/cert1.pem	/tmp/cert2.pem

bash-4.4$ # Get the OCSP URL
bash-4.4$ openssl x509 -noout -ocsp_uri < /tmp/cert1.pem
http://e6.o.lencr.org

bash-4.4$ # Perform the OCSP Request
bash-4.4$ openssl ocsp -issuer /tmp/cert2.pem -cert /tmp/cert1.pem -url http://e6.o.lencr.org -noverify
/tmp/cert1.pem: revoked
	This Update: Feb 22 05:42:00 2025 GMT
	Next Update: Mar  1 05:41:58 2025 GMT
	Reason: keyCompromise
	Revocation Time: Dec 19 19:19:49 2024 GMT
@lwillek lwillek changed the title [Possible BUG] Certificate OCSP Certificate Revocation Check non functional [Possible BUG] Non Functional Certificate OCSP Certificate Revocation Check Feb 22, 2025
@lwillek lwillek changed the title [Possible BUG] Non Functional Certificate OCSP Certificate Revocation Check [Possible BUG] Docker: Non Functional Certificate OCSP Certificate Revocation Check Feb 22, 2025
@lwillek
Author

lwillek commented Feb 22, 2025

I was concerned that my local Docker environment (macOS on an M1, running Rancher Desktop) might have influenced the results, as the image's platform (linux/amd64) does not match the host platform (linux/arm64/v8). So I also tested on another Docker host (Debian x86_64) and reproduced the behavior described above. I therefore expect the examples to be reproducible as they are.
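As a side note, the platform mismatch itself is easy to check with stock Docker commands (a sketch; the arm64 output line is what I would expect on the M1 host):

$ docker image inspect --format '{{.Os}}/{{.Architecture}}' drwetter/testssl.sh:3.2
linux/amd64
$ docker version --format '{{.Server.Os}}/{{.Server.Arch}}'
linux/arm64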

@lwillek
Author

lwillek commented Feb 22, 2025

Sigh... apparently, this is one of those cases where it's easy to kill hours.

I have tried, unsuccessfully, to recreate the issue without using the official Docker container.

I started by compiling openssl-1.0.2.bad myself from source; when testing, everything was alright. Next I rebuilt the Docker container according to the instructions (whether based on openSUSE or Alpine); everything was OK. Even when using the native OpenSSL version shipped within the container, everything works as it should:

docker run --rm drwetter/testssl.sh:3.2 --openssl /usr/bin/openssl --phone-out -S revoked.badssl.com

What I could not figure out is how exactly to replicate the build process for the Docker images you provide on Docker Hub. It seems there is some "magic" involved which copies the openssl-1.0.2.bad binaries to "/home/testssl/bin". Edit: I found this part out, it's there already... 😏.
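For anyone retracing this, the bundled binaries can be listed without rebuilding anything, by overriding the entrypoint (a sketch; the exact listing will vary by image version):

$ docker run --rm --entrypoint ls drwetter/testssl.sh:3.2 -l /home/testssl/bin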

I don't want to look too far into it, or I might not see the daylight again 😅. Hence the following is just a gut feeling: could it be that the binary was not built with nsswitch in mind? The reason I ask is that it behaves exactly as I would have expected it to 15 or maybe even 20 years ago.

To sum up: I think the true nature of the issue lies neither in testssl.sh itself nor in openssl-1.0.2.bad in general, but has something to do with the way /home/testssl/bin/openssl.Linux.x86_64 is built and then shipped.
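One way to inspect how the shipped binary was built is to check its linkage from inside the container (a sketch; assumes ldd is available in the image, and the exact wording differs between glibc and musl):

$ docker run --rm --entrypoint /bin/bash drwetter/testssl.sh:3.2 -c 'ldd /home/testssl/bin/openssl.Linux.x86_64'

For a statically linked binary, ldd typically reports something like "not a dynamic executable".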

It seems the issue is only traceable when using the precompiled openssl-1.0.2.bad binaries. After going deep down this rabbit hole, I think this issue is a possible duplicate of #2516.

At least it can now be replicated easily by using the official Docker containers.

@drwetter drwetter added bug:minor 3.2 upcoming release 3.0 old branch and removed bug:to be reproduced ... from maintainers bug:needs triage/confirmation labels Mar 3, 2025
@drwetter
Collaborator

drwetter commented Mar 3, 2025

Thanks for reporting.

In fact the segfault smells a bit like #2516. However, the Docker container for 3.2 doesn't contain the dubious name service switch entry we blamed #2516 on (see comment). The host here runs Debian 12.

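For the record, that check can be reproduced by dumping the container's name service switch configuration (a sketch; assumes /etc/nsswitch.conf exists in the image at all, which is not true for every minimal base image):

$ docker run --rm --entrypoint cat drwetter/testssl.sh:3.2 /etc/nsswitch.conf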

To my dismay ;-) the (or a?) segfault happened there as well (dmesg: openssl.Linux.x[86160]: segfault at 63 ip 00007f49c21c5791 sp 00007ffdf8e64f10 error 4 in libc.so.6[7f49c2069000+1fb000] likely on CPU 1 (core 0, socket 1)).

Can't tell yet whether it is the same problem or something different.

Even when using the native OpenSSL version shipped within the container, everything works as it should:

Yes, that one wasn't statically compiled and is in line with the base image. If I understood you correctly: when you compiled openssl-bad yourself and copied it into the container, it didn't segfault, right? If so, did you compile it with the script, so that you had a static binary?

@drwetter
Collaborator

drwetter commented Mar 3, 2025

PS, just to keep it in mind: the Alpine image for 3.0, which doesn't throw a segfault but leads to a wrong result, has the same nss entry for hosts.


@drwetter
Collaborator

drwetter commented Mar 3, 2025

The segfault also happens on a Debian 12 host with the new binary *) inside the container.

*) for testing see https://testssl.sh/contributed_binaries/openssl.Linux.x86_64
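To test such a binary in place without rebuilding the image, it can be bind-mounted over the bundled one (a sketch; the in-container path is the one mentioned earlier in this thread):

$ curl -sLO https://testssl.sh/contributed_binaries/openssl.Linux.x86_64
$ chmod +x openssl.Linux.x86_64
$ docker run --rm -v $(pwd)/openssl.Linux.x86_64:/home/testssl/bin/openssl.Linux.x86_64 drwetter/testssl.sh:3.2 --phone-out -S revoked.badssl.com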

drwetter added a commit that referenced this issue Mar 14, 2025
…ne-out

As `--phone-out` sometimes doesn't work with our binary we switch transparently/automagically
to the vendor-supplied openssl binary -- if available.

This fixes at least #2516 where the issue has been explained/debugged in detail.
See also #2667 and #1275.
drwetter added a commit that referenced this issue Mar 14, 2025
…ne-out (3.0)

As `--phone-out` sometimes doesn't work with our binary we switch transparently/automagically
to the vendor-supplied openssl binary -- if available. This is the PR for 3.0, for 3.2 see #2695.

This fixes at least #2516 where the issue has been explained/debugged in detail.
See also #2667 and #1275.
@drwetter
Collaborator

FYI @lwillek: the segfault is gone, but it still says error querying OCSP responder.

@drwetter
Collaborator

Strange: if I exec into the container and start testssl from there, it works.


@drwetter
Collaborator

Before you spend additional time on this, @lwillek: I spotted the problem and will provide a fix ~tomorrow.

@lwillek
Author

lwillek commented Mar 14, 2025

Thanks for reporting.

It is a pleasure. Thank you for picking this up, digging into it, and implementing a workaround until the static binary has been recompiled (or whatever the final fix will be).

Before you spend additional time on this...

Hehe, too late. Time already spent, as my (way too long) comment was almost completed 😁 ... deleted. Good to know that you found something!

I can't wait to see what the reason was. I am happy to test tomorrow!

@drwetter
Collaborator

I can't wait to see what the reason was. I am happy to test tomorrow

The last problem was me 😄, as you'll see in the PR later. The original problem was that a static binary built with glibc has problems with gethostbyname(3), which only seems to happen when doing OCSP queries with a URL. That's not fixable. The workaround is #2695.
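For background: glibc loads its NSS resolver modules via dlopen(3) at runtime, so even a fully static glibc binary still depends on matching shared libraries for gethostbyname(3). The toolchain even warns about this at link time; a minimal sketch (the file name and the elided linker prefix are illustrative):

$ cat > gh.c <<'EOF'
#include <netdb.h>
#include <stdio.h>

int main(void) {
    /* goes through glibc's NSS machinery, which dlopen()s shared modules at runtime */
    struct hostent *he = gethostbyname("revoked.badssl.com");
    printf("%s\n", he ? "resolved" : "lookup failed");
    return 0;
}
EOF
$ gcc -static gh.c -o gh
[...] warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking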

@lwillek
Author

lwillek commented Mar 15, 2025

So true 🤡: unsolvable when trying to compile statically with glibc. I learned that the hard way yesterday too. Yesterday's journey made me realize that I haven't done enough compiling myself in the last 10 years, but that's another story.

I ended up spending the night attempting to compile openssl-1.0.2.bad statically using musl libc and gcc < 14. Why, you may ask?

musl libc is designed to facilitate static linking and for compatibility with standard C functions, including getaddrinfo. So it seemed worth a try, and, if successful, also a good solution that addresses the underlying core issue. On distributions like Alpine Linux, musl is the default C library, so I ended up using alpine:3.18 for my tests; a sketch of the attempt follows below.
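For completeness, the rough shape of that build attempt (a sketch only; the package list and configure flags here are assumptions, not a verified recipe):

$ docker run -it --rm alpine:3.18 /bin/sh
/ # apk add build-base perl linux-headers
/ # # then, inside the openssl-1.0.2.bad source tree:
/ # ./config no-shared -static
/ # make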

...But then I stopped the attempt (and also yesterday evening's comment), as you found a solution. Workaround #2695 is fine. Thanks again.

@drwetter
Collaborator

You're welcome!

Nowadays we have Alpine Docker images with musl libc only in 3.0 (which is also where you encountered the issue). There were performance issues which got better when upgrading the Alpine version. Also, IIRC, there were issues when using glibc binaries, as other projects have encountered.

So to me musl libc could have been a solution, but OTOH we might run into other problems. Thus the easiest solution is, from >=3.3dev on, to use any vendor-supplied openssl binary as the primary one.

drwetter added a commit that referenced this issue Mar 15, 2025
One positive, one negative

This should detect failures in the future like in #2667, #2516
and #1275.
@drwetter
Collaborator

FYI: Fixed now. Also, there's a unit test which warns when there's a similar problem again.
