[ACME] Too many currently pending authorizations #905

Closed
Mika56 opened this issue Nov 27, 2016 · 16 comments

@Mika56

Mika56 commented Nov 27, 2016

Hi,

I'm migrating my infrastructure to Docker. I was running an Apache server on one virtual machine, with only a few websites protected with TLS.
I'm now running Traefik in a Docker Swarm cluster composed of three hosts, and want to go full TLS (Let's Encrypt).

At first it worked correctly, but now it seems to be behaving oddly.
My Traefik gets its configuration from Consul, and stores the ACME certificates there too. acme/onDemand used to be set to true, but I've now set it to false.
Even with onDemand set to false, my log file keeps growing, with these messages:

time="2016-11-27T16:37:40Z" level=error msg="map[portus.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]"
time="2016-11-27T16:37:40Z" level=error msg="Error getting ACME certificates [portus.something1.com] : Cannot obtain certificates map[portus.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]+v"
time="2016-11-27T16:37:40Z" level=error msg="map[jira.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. jira.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]"
time="2016-11-27T16:37:40Z" level=error msg="Error getting ACME certificates [jira.something1.com jira.something2.com] : Cannot obtain certificates map[jira.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. jira.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]+v"
time="2016-11-27T16:37:40Z" level=error msg="map[confluence.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. confluence.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]"
time="2016-11-27T16:37:40Z" level=error msg="Error getting ACME certificates [confluence.something1.com confluence.something2.com] : Cannot obtain certificates map[confluence.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. confluence.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]+v"
time="2016-11-27T16:37:40Z" level=error msg="map[stash.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. stash.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. bitbucket.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. bitbucket.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]"
time="2016-11-27T16:37:40Z" level=error msg="Error getting ACME certificates [stash.something1.com bitbucket.something1.com stash.something2.com bitbucket.something2.com] : Cannot obtain certificates map[stash.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. bitbucket.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. bitbucket.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. stash.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]+v"
time="2016-11-27T16:37:40Z" level=error msg="map[stats.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. stats.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]"
time="2016-11-27T16:37:40Z" level=error msg="Error getting ACME certificates [stats.something2.com stats.something1.com] : Cannot obtain certificates map[stats.something1.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. stats.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]+v"
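
A minimal sketch, in Python, of how log lines like the ones above could be reduced to the list of affected domains. The regular expression is an assumption derived only from the lines shown here (other Traefik versions may format the message differently), and `rate_limited_domains` is a hypothetical helper name.

```python
import re

# Matches "<domain>:acme: Error <status> - <urn>" fragments as they
# appear in the log lines quoted above (assumed format).
ACME_ERROR = re.compile(r"([\w.-]+):acme: Error (\d{3}) - (urn:acme:error:\w+)")

def rate_limited_domains(log_lines):
    """Return the sorted, de-duplicated domains that got a 429 rateLimited error."""
    domains = set()
    for line in log_lines:
        for domain, status, urn in ACME_ERROR.findall(line):
            if status == "429" and urn.endswith(":rateLimited"):
                domains.add(domain)
    return sorted(domains)

# Two lines shaped like the excerpt above.
sample = [
    'time="2016-11-27T16:37:40Z" level=error msg="map[portus.something1.com:acme: '
    'Error 429 - urn:acme:error:rateLimited - Error creating new authz :: '
    'Too many currently pending authorizations.]"',
    'time="2016-11-27T16:37:40Z" level=error msg="map[jira.something2.com:acme: '
    'Error 429 - urn:acme:error:rateLimited - Error creating new authz :: '
    'Too many currently pending authorizations. jira.something1.com:acme: '
    'Error 429 - urn:acme:error:rateLimited - Error creating new authz :: '
    'Too many currently pending authorizations.]"',
]

print(rate_limited_domains(sample))
```

Each failure is logged twice (once as the bare map, once in the "Error getting ACME certificates" line), so collecting into a set avoids double counting.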

So, obviously, Traefik is trying to get new certificates, but too many authorizations are pending.
I've opened Consul and read my acme/account/object value, which is where things get odd:

{
    "Email":"...",
    "Registration":{
        "body":{
            "resource":"reg",
            "id":0000,
            "key":{
            "kty":"RSA",
            "n":"...",
            "e":"AQAB"
            },
            "contact":[
                "mailto:..."
            ],
            "agreement":"https://letsencrypt.org/documents/LE-SA-v1.1.1-August-1-2016.pdf"
        },
        "uri":"https://acme-v01.api.letsencrypt.org/acme/reg/0000",
        "new_authzr_uri":"https://acme-v01.api.letsencrypt.org/acme/new-authz",
        "terms_of_service":"https://letsencrypt.org/documents/LE-SA-v1.1.1-August-1-2016.pdf"
    },
    "PrivateKey":"...",
    "DomainsCertificate":{
        "Certs":[
            {
                "Domains":{
                    "Main":"stash.something2.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"stash.something2.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"stash.something2.com",
                    "SANs":[
                        "bitbucket.something2.com"
                    ]
                },
                "Certificate":{
                    "Domain":"stash.something2.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"confluence.something2.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"confluence.something2.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"jira.something2.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"jira.something2.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"confluence.something2.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"confluence.something2.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"prtg.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"prtg.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"prtg.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"prtg.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"prtg.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"prtg.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"prtg.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"prtg.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"stats.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"stats.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"stats.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"stats.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"stats.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"stats.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"stats.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"stats.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"glados.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"glados.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"glados.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"glados.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"glados.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"glados.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            },
            {
                "Domains":{
                    "Main":"glados.something1.com",
                    "SANs":null
                },
                "Certificate":{
                    "Domain":"glados.something1.com",
                    "CertURL":"https://acme-v01.api.letsencrypt.org/acme/cert/...",
                    "CertStableURL":"",
                    "PrivateKey":"...",
                    "Certificate":"..."
                }
            }
        ]
    },
    "ChallengeCerts":{
        "bitbucket.something2.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        },
        "confluence.something2.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        },
        "jira.something2.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        },
        "prtg.something2.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        },
        "stash.something2.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        },
        "stash.something1.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        },
        "stats.something2.com":{
            "Certificate":"...",
            "PrivateKey":"..."
        }
    }
}

It seems that the same certificates appear multiple times under DomainsCertificate.
However, some of the duplicated domains work, while others don't.
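
To check for duplicates without reading the JSON by eye, something like the following Python sketch could be used. It assumes the acme/account/object value has already been exported as JSON text with the structure shown above; `duplicate_certs` and `entry` are hypothetical helper names, and the sample is a trimmed-down stand-in for the real value.

```python
import json
from collections import Counter

def duplicate_certs(account_json):
    """Return {(main, sans): count} for certificate entries stored more than once."""
    account = json.loads(account_json)
    certs = account["DomainsCertificate"]["Certs"]
    counts = Counter(
        # SANs is null in the stored JSON when there are none.
        (c["Domains"]["Main"], tuple(c["Domains"]["SANs"] or ()))
        for c in certs
    )
    return {key: n for key, n in counts.items() if n > 1}

def entry(main, sans=None):
    # Hypothetical helper: builds one trimmed-down Certs entry for the sample.
    return {"Domains": {"Main": main, "SANs": sans}}

sample = json.dumps({"DomainsCertificate": {"Certs": [
    entry("stash.something2.com"),
    entry("stash.something2.com", ["bitbucket.something2.com"]),
    entry("stats.something1.com"),
    entry("stats.something1.com"),
]}})

for (main, sans), n in duplicate_certs(sample).items():
    print(main, list(sans), n)
```

Entries are compared on both the main domain and the SAN list, so the two stash.something2.com entries (one with a SAN, one without) are not counted as duplicates of each other.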

I've also extracted one of the non-working certificates and analysed it. It looks to me like everything should work properly?

➜  traefik_tls openssl x509 -in stats.cert -text -noout
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            (...)
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=US, O=Let's Encrypt, CN=Let's Encrypt Authority X3
        Validity
            Not Before: Nov 23 10:33:00 2016 GMT
            Not After : Feb 21 10:33:00 2017 GMT
        Subject: CN=stats.something1.com
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (4096 bit)
                Modulus:
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier:
                75:D9:8C:ED:71:C5:D5:A1:93:29:A2:C3:A4:9A:AA:27:85:BA:68:0C
            X509v3 Authority Key Identifier:
                keyid:A8:4A:6A:63:04:7D:DD:BA:E6:D1:39:B7:A6:45:65:EF:F3:A8:EC:A1

            Authority Information Access:
                OCSP - URI:http://ocsp.int-x3.letsencrypt.org/
                CA Issuers - URI:http://cert.int-x3.letsencrypt.org/

            X509v3 Subject Alternative Name:
                DNS:stats.something1.com
            X509v3 Certificate Policies:
                Policy: 2.23.140.1.2.1
                Policy: 1.3.6.1.4.1.44947.1.1.1
                  CPS: http://cps.letsencrypt.org
                  User Notice:
                    Explicit Text: This Certificate may only be relied upon by Relying Parties and only in accordance with the Certificate Policy found at https://letsencrypt.org/repository/

    Signature Algorithm: sha256WithRSAEncryption
         (...)

So, my questions are:

  • How can I stop Traefik from requesting certificates from Let's Encrypt? From what I read, I need to stop sending authorization requests for a week before I can send requests again
  • Is it normal that the same certs appear multiple times under DomainsCertificate (at least with the same name, I haven't compared the certificates themselves)?
  • Can you add additional checks so Traefik does not try to generate certificates every minute when it is rate limited?
  • Do I need to clean my acme/account/object value to remove incorrect certificates?
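
For the third question, the kind of check being asked for could look like a capped exponential backoff. The Python sketch below is purely illustrative: it is not Traefik's code, and the function name, the 60-second base, the one-hour cap, and the jitter parameter are all assumptions.

```python
import random

def backoff_delays(base=60.0, factor=2.0, cap=3600.0, attempts=6,
                   jitter=0.1, rng=random.random):
    """Yield wait times in seconds between retries, growing up to a cap."""
    delay = base
    for _ in range(attempts):
        # Jitter stretches each wait by up to `jitter` (10% by default) so
        # that several instances do not all retry at the same moment.
        yield min(cap, delay) * (1.0 + jitter * rng())
        delay *= factor

# With jitter disabled the schedule is deterministic.
print(list(backoff_delays(jitter=0.0)))
```

With three instances sharing one ACME account, as in the setup described here, the jitter term matters: without it, all instances would retry in lockstep after a 429.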
@ralphtheninja

Which version of traefik are you running? (Just curious)

@emilevauge
Member

emilevauge commented Nov 28, 2016

Which version of traefik are you running? (Just curious)

I assume it's 1.1.0 as you store certs in Consul, right?
Another point: how many Traefik instances are deployed on your cluster?

@Mika56
Author

Mika56 commented Nov 28, 2016

Hi,

Sorry, I forgot to mention, I'm indeed running containous/traefik:v1.1.0.
My args are --consul.endpoint=consul:8500 --watch=true --web.address=:8081 --docker.endpoint=unix:///var/run/docker.sock --docker.exposedbydefault=false --docker.swarmmode=true --docker.watch=true.
My service is running in global mode, with a constraint of node.role == manager, which means all my hosts :)
My cluster is composed of three hosts, so there's no way I would have more than three instances.

@Mika56
Author

Mika56 commented Nov 30, 2016

It seems some of my certificates were generated, so the limits must have expired, but the problem might still be present.
Also, I keep getting this in my logs:

time="2016-11-30T11:23:42Z" level=error msg="Datastore sync error: Object lock value: expected 3e7f693f-7ecc-4ba2-ac0e-398cfd6300a3, got f41a46ab-525c-4022-89a8-290a605b195a, retrying in 416.67653ms"
time="2016-11-30T11:23:55Z" level=error msg="Datastore sync error: Object lock value: expected 4f58be80-3309-4696-a02d-40cb04fd13d1, got 3e7f693f-7ecc-4ba2-ac0e-398cfd6300a3, retrying in 636.040823ms"
time="2016-11-30T11:24:19Z" level=error msg="Datastore sync error: Object lock value: expected 0dcfbe58-5205-465b-b6ab-610058fb2872, got 7bac4f91-f2b6-446b-afef-978170bbdd7e, retrying in 301.330653ms"
time="2016-11-30T11:24:36Z" level=error msg="Datastore sync error: Object lock value: expected a5b7f463-e716-42ed-90f9-b2aa3867755b, got 5b32b165-6f82-4716-8dd4-f729db24b772, retrying in 271.043733ms"

I don't get this all the time (I didn't have it when I initially posted the issue, but I had seen it before), but I have it now and it keeps being logged every few minutes.

@Mika56
Author

Mika56 commented Dec 2, 2016

Any way I can help you track down this bug? This morning, all my remaining certificates were signed, so I started adding new services, but the same problem occurred again... :(

I've done two things to my Traefik:

  • Added a new backend and a new frontend in my KV store (I'm using the alias system)
  • Restarted one of my containers, at first without problem

Traefik then logged:

2016/12/02 12:26:29 server.go:2317: http: TLS handshake error from 10.255.0.3:55412: Cannot obtain certificates map[gitlab.something.com:acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce *someIDImnotsureifsecret*]+v

And only a few minutes later:

time="2016-12-02T12:32:37Z" level=error msg="map[jira.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. jira.something.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]"
time="2016-12-02T12:32:37Z" level=error msg="Error getting ACME certificates [jira.something.com jira.something2.com] : Cannot obtain certificates map[jira.something2.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations. jira.something.com:acme: Error 429 - urn:acme:error:rateLimited - Error creating new authz :: Too many currently pending authorizations.]+v"

On my end, both jira.something.com and jira.something2.com have valid certificates.
I wonder if Traefik started generating authorizations after the nonce error, then hit the limit six minutes later?

@emilevauge emilevauge added this to the 1.1 milestone Dec 2, 2016
@emilevauge
Member

@Mika56
It seems you are hitting multiple different issues here.

1/ JWS has invalid anti-replay nonce
This issue seems to be due to go-acme/lego#318 (comment).

2/ Duplicate certificates
Were you using an older version of Traefik (v1.0.x) before?
It seems older versions were sometimes generating duplicate certificates.
As a workaround, I will clean up the certificate list at startup in v1.1.2.

@emilevauge emilevauge added priority/P1 need to be fixed in next release level/advanced labels Dec 2, 2016
@Mika56
Author

Mika56 commented Dec 2, 2016

The issue you mentioned was closed a few days ago, but with no fix, shouldn't you reopen it?
As for my Traefik versions, I've never used any version older than 1.1 on this infrastructure, so that can't be related.

@emilevauge
Member

@Mika56 Please read the entire issue ;) This is due to using the same account with prod/staging LE.

@Mika56
Author

Mika56 commented Dec 2, 2016

Sorry, I didn't understand that when I first read it. Anyway, I'm not sure this applies to me: I don't think I've used the same private key in the staging environment, plus I was able to generate some certificates...
Is there any way to regenerate my account private key? Will that break my certificates?

@emilevauge
Member

emilevauge commented Dec 2, 2016

plus I was able to generate some certificates...

I know, me too :'(

Is there any way to regenerate my account private key? Will that break my certificates?

Remove your account info. Yes, sadly, you will have to generate your cert again.

It is really bad that LE allows the same key to be used from staging to production and then produces some random errors...

@Mika56
Author

Mika56 commented Dec 2, 2016

Is there any way to force Traefik to generate certificates in a given order? If I have to regenerate all my certificates, I'll hit many limits, and while I can disable automatic generation with onDemand=false, there is, as far as I can tell, no way to manually generate an ACME certificate.

@emilevauge
Member

emilevauge commented Dec 2, 2016

I suggest backing up all your ACME config first (account + certs). Then you can generate a new account by deleting your ACME account from the Traefik config. After that, delete your certs five at a time (to avoid rate limiting) and force the generation of new certificates by filling in acme.domains in the Traefik ACME config.
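
The "five at a time" step could be sketched as follows in Python. This is only an illustration under assumptions: the `[[acme.domains]]` table with `main` and `sans` keys is the Traefik 1.x TOML layout as far as I recall and should be checked against the documentation for the version in use, `batches` and `acme_domains_toml` are hypothetical helper names, and the domain list is taken from the stored account value quoted earlier in this thread.

```python
def batches(items, size=5):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def acme_domains_toml(certs):
    """Render one [[acme.domains]] entry per (main, sans) pair."""
    blocks = []
    for main, sans in certs:
        lines = ["[[acme.domains]]", f'  main = "{main}"']
        if sans:
            quoted = ", ".join(f'"{s}"' for s in sans)
            lines.append(f"  sans = [{quoted}]")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)

# (main, sans) pairs taken from the DomainsCertificate list shown earlier.
certs = [
    ("stash.something2.com", ["bitbucket.something2.com"]),
    ("confluence.something2.com", []),
    ("jira.something2.com", []),
    ("prtg.something1.com", []),
    ("stats.something1.com", []),
    ("glados.something1.com", []),
]

for number, batch in enumerate(batches(certs), start=1):
    print(f"# batch {number}")
    print(acme_domains_toml(batch))
    print()
```

Each printed batch would be added to the configuration in turn, waiting for the certificates in it to be issued before moving on to the next one.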

@emilevauge
Member

Fixed by #972

@shankie-codes

Hmm, I'm still getting the JWS has invalid anti-replay nonce error, even after using the latest version. I can force new certificates by changing the email address in my ACME account, but I have to do this for each new domain that I add (and in my case, I'm running a reverse proxy for about 30 client sites, so that's a bit tedious).

I've got the following in my traefik.toml, but it shouldn't make a difference:

onDemand = false
OnHostRule = true

@ldez ldez removed the priority/P1 need to be fixed in next release label May 30, 2017
@holms

holms commented Jun 13, 2018

@Mika56 how did you solve the Datastore sync error: Object lock value: problem?

@Mika56
Author

Mika56 commented Jun 13, 2018

I did not. My Traefik instances keep screaming that error. LE only renews certificates when I restart the service with docker service update --force traefik, and I generally have to restart it multiple times before every certificate is renewed properly.

@traefik traefik locked and limited conversation to collaborators Sep 1, 2019