Error JWS has invalid anti-replay nonce #318

emilevauge · 2016-11-15T09:21:46Z

Hello

As seen on the Gitter channel, lego should handle the "JWS has invalid anti-replay nonce" errors returned by Let's Encrypt.

@xenolf

> Well the easy fix is for me to add an error type for this nonce error and return that.

This would be better than nothing :)

> The more involved fix is to find why the nonces are invalid.
> I had a look around the boulder source and it would be nice to know which case is encountered here
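For illustration, the "easy fix" of a dedicated error type for the nonce error could look roughly like the sketch below. The names `RemoteError`, `NonceError`, `classify` and `IsNonceError` are assumptions made up for this example and are not taken from lego's source; the fields simply mirror what the log line in this issue shows (HTTP status, ACME error type, detail).

```go
package main

import (
	"errors"
	"fmt"
)

// badNonceType is the ACME error type seen in the logs of this issue.
const badNonceType = "urn:acme:error:badNonce"

// RemoteError is a hypothetical stand-in for an error returned by the ACME server.
type RemoteError struct {
	StatusCode int
	Type       string
	Detail     string
}

func (e RemoteError) Error() string {
	return fmt.Sprintf("acme: Error %d - %s - %s", e.StatusCode, e.Type, e.Detail)
}

// NonceError is a hypothetical dedicated type for badNonce responses,
// so that callers can detect the condition without parsing strings.
type NonceError struct {
	RemoteError
}

// classify returns a NonceError for badNonce responses and a plain
// RemoteError for everything else.
func classify(status int, typ, detail string) error {
	re := RemoteError{StatusCode: status, Type: typ, Detail: detail}
	if typ == badNonceType {
		return NonceError{RemoteError: re}
	}
	return re
}

// IsNonceError reports whether err is (or wraps) a NonceError.
func IsNonceError(err error) bool {
	var ne NonceError
	return errors.As(err, &ne)
}

func main() {
	err := classify(400, badNonceType, "JWS has invalid anti-replay nonce")
	fmt.Println(IsNonceError(err), err)
}
```

With a type like this, a caller such as Traefik could decide to retry on a nonce error instead of only logging it.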

Sadly, I don't know the reason why nonces are invalid in our case. We only get this log:

time="2016-10-25T15:34:59Z" level=error msg="Error renewing certificate: acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce UQ9kdNZTAFPRYEaNVmIEWCEBW3.............."


xenolf · 2016-11-18T13:34:35Z

@emilevauge Is it possible to have someone who has this problem test the code in the fix_nonce_error branch?

emilevauge · 2016-11-18T15:04:02Z

@xenolf thanks a lot for helping on this :) I will make some tests with your branch and give you some feedback.

tribou · 2016-11-27T18:03:40Z

@xenolf just wanted to report I'm getting the nonce error consistently the past couple days (haven't been able to get a cert using the prod ACME URL yet with numerous retries). I'm using https://github.com/PalmStoneGames/kube-cert-manager which returns errors like these:

2016/11/27 17:18:16 [INFO][example.com] acme: Obtaining bundled SAN certificate
2016/11/27 17:18:17 Error while processing certificate during sync: Error while obtaining certificate for new domain example.com: acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce nMkkF2_SoyGYBSUU7obiZ4h0612cH25ldAzL6Rcphow

2016/11/27 17:18:47 [INFO][example.com] acme: Obtaining bundled SAN certificate
2016/11/27 17:18:47 Error while processing certificate during sync: Error while obtaining certificate for new domain example.com: acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce bGRoCwSZ7ZSijBAuZpgEOhNZRUm_LYMw-MDsS4-4x0U

However, using the staging ACME URL works fine for the same domains. I'm guessing we won't know any more details until a test can be made with the fix_nonce_error branch?

mholt · 2016-11-27T18:28:34Z

If you're using the same account on the live endpoint as you are on staging, you'll get this error. I think.

cpu · 2016-11-28T13:51:55Z

> If you're using the same account on the live endpoint as you are on staging, you'll get this error. I think.

That's correct - nonces are not transportable between the two environments.

jipperinbham · 2016-11-28T16:28:19Z

I'm hitting this problem as well, and it seems to be limited to cases where we're issuing a certificate for a domain with a SAN. It always seems to throw the error when calling getChallenges for 1 of the 2 domains.

tribou · 2016-11-28T16:47:22Z

Thanks for the info so far, everyone!

To give some more backstory on my project, I'm currently trying to migrate from letsencrypt-express to kube-cert-manager. So one of the domains I attempted already has a working prod cert with letsencrypt-express. However, the second domain was just a one-off test that I only ran in kube-cert-manager; but I did use the same email initially to test staging and then prod for that one-off test.

For clarification, are the nonces stored on the Let's Encrypt side and associated by email? Or are they stored by the client library's implementation (boltdb used by kube-cert-manager), and perhaps I just need to wipe out the boltdb database and try again for just prod?

...or *gulp*, will I need to find some way to sync the letsencrypt-express account meta files with the kube-cert-manager meta info stored in boltdb?

tribou · 2016-11-28T17:39:56Z

Wiping out the existing staging meta data worked!

So for anyone hitting this issue in the future, this is what worked for me:

1. Test using https://acme-staging.api.letsencrypt.org/directory (staging certs) to make sure your LE implementation is correct.
2. Find where your LE library stores its account metadata and delete that file/directory (kube-cert-manager uses a data.db boltdb file).
3. Update to using the https://acme-v01.api.letsencrypt.org/directory prod URL, and redeploy/re-run your library's cert acquisition.

This should force the Let's Encrypt negotiation process to regenerate new account info for prod.

mholt · 2016-11-28T23:31:15Z

Sounds like your application was using the wrong account for the transaction ;) Glad you figured it out.

BusyBusinessCat · 2017-01-09T20:33:23Z

I have also encountered this issue, but it was a little different from what I've been reading here.

At the beginning, I issued a staging cert to check that my lego installation and app configuration were correct. As everything was OK, I switched to standard mode successfully: just removing the "--server staging-url" from my automatic lego call (and thus keeping the same email address) allowed me to get an LE certificate.

Things started to get strange right after the staging cert expired, at the next renewal of my cert.

I was getting the acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce error on each call. After reading this thread, I tried removing the old staging account info (and the standard one too), but it didn't help.

Changing the email to one that I had never used with LE worked immediately.

I can assume that, for a reason I don't know, we cannot use the same email in staging and in classic LE, but in that case I do not understand why it was working perfectly in the first place.

Maybe someone here can explain to me what happened? Or maybe I missed something?

cpu · 2017-01-09T20:52:15Z

Hi folks, just passing by & wanted to answer a few Q's since my last reply on-thread.

> I can assume that, for a reason I don't know, we cannot use the same email in staging and in classic LE

There's no constraint like this from the Let's Encrypt side. You can use the same email for staging and production without causing errors.

> For clarification, are the nonces stored on the Let's Encrypt side and associated by email? Or are they stored by the client library's implementation (boltdb used by kube-cert-manager)

Nonces aren't associated with an email/account on the server side. Roughly speaking, for a given environment (staging/prod) they are simply a number given to a client and noted on a list. There's no additional metadata. As mentioned earlier, they do not work across environments (e.g. a nonce from staging is unknown to the prod environment); each environment maintains an independent nonce list.

I'm not familiar with how Lego stores its nonces. Internally it could be using its own binding with the account email address - if so then it seems like it would be a bug if that nonce can end up reused across a switch from staging to prod.
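One way a client could rule out reusing a nonce across environments is to keep unused nonces in a separate pool per ACME directory URL. The sketch below only illustrates that idea; `noncePools` and its methods are hypothetical names and do not describe how lego actually stores nonces.

```go
package main

import (
	"fmt"
	"sync"
)

// noncePools keeps unused nonces in a separate stack per ACME directory
// URL, so a nonce handed out by staging can never be sent to prod.
// Hypothetical type for illustration only.
type noncePools struct {
	mu    sync.Mutex
	pools map[string][]string
}

func newNoncePools() *noncePools {
	return &noncePools{pools: make(map[string][]string)}
}

// Push stores a nonce that was received from the given directory URL.
func (p *noncePools) Push(directoryURL, nonce string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.pools[directoryURL] = append(p.pools[directoryURL], nonce)
}

// Pop removes and returns the most recently stored nonce for the given
// directory URL. ok is false when no nonce is available for that URL.
func (p *noncePools) Pop(directoryURL string) (nonce string, ok bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	s := p.pools[directoryURL]
	if len(s) == 0 {
		return "", false
	}
	nonce = s[len(s)-1]
	p.pools[directoryURL] = s[:len(s)-1]
	return nonce, true
}

func main() {
	const staging = "https://acme-staging.api.letsencrypt.org/directory"
	const prod = "https://acme-v01.api.letsencrypt.org/directory"

	pools := newNoncePools()
	pools.Push(staging, "nonce-from-staging")

	_, ok := pools.Pop(prod)
	fmt.Println("nonce available for prod:", ok)

	n, ok := pools.Pop(staging)
	fmt.Println("nonce available for staging:", ok, n)
}
```

When `Pop` finds nothing for an environment, the client would have to fetch a fresh nonce from that environment before signing the next request.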

BusyBusinessCat · 2017-01-09T21:03:19Z

Does lego store account info anywhere other than in the working folder (inside the "account" folder)?

I managed to remove everything related to accounts when I was trying to keep the same bogus mail, but I always ended up with the nonce error (it was a different nonce each time, by the way).

I can try to reproduce it, adding some extra logs where I can, if someone thinks it's useful.

@cpu Thanks for the clarification about how LE handles accounts. In fact that's what I was expecting from LE, so I really don't understand why changing the email made it work for me.

mithrandi · 2017-01-25T11:02:13Z

The nonces are only valid for an hour or so, as far as I know; if they are being stored for longer than that, or the server-side nonce store is purged (e.g. I think this happens when Boulder is restarted), the stored nonce will be invalid. I think the easiest way to handle this is to retry any request that fails with an "invalid nonce" error, using the new nonce returned along with the response; this should handle pretty much all of the common scenarios without any complicated logic required.
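A minimal sketch of that retry idea follows. It is not lego's implementation: `response`, `postFunc` and `postWithNonceRetry` are hypothetical names, and the actual signing and HTTP POST are hidden behind a callback so the example stays self-contained. It assumes the server returns a fresh nonce (the `Replay-Nonce` header) together with the badNonce error, as described above.

```go
package main

import (
	"errors"
	"fmt"
)

const badNonceType = "urn:acme:error:badNonce"

// response is a simplified, hypothetical view of an ACME server reply.
type response struct {
	Status      int
	ErrorType   string // empty on success
	ReplayNonce string // value of the Replay-Nonce response header
}

// postFunc signs and sends one request using the given nonce.
// In a real client this would build the JWS and perform the HTTP POST.
type postFunc func(nonce string) (response, error)

// postWithNonceRetry sends the request and, if the server answers with
// badNonce, retries with the fresh nonce from that error response.
// It gives up after maxAttempts tries.
func postWithNonceRetry(post postFunc, nonce string, maxAttempts int) (response, error) {
	var resp response
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		var err error
		resp, err = post(nonce)
		if err != nil {
			return resp, err // transport-level failure: do not retry here
		}
		if resp.ErrorType != badNonceType {
			return resp, nil // success, or an error unrelated to nonces
		}
		if resp.ReplayNonce == "" {
			return resp, errors.New("badNonce response carried no fresh nonce")
		}
		nonce = resp.ReplayNonce // retry with the nonce the server just issued
	}
	return resp, fmt.Errorf("still getting badNonce after %d attempts", maxAttempts)
}

func main() {
	// Fake server: rejects any nonce except "fresh-nonce".
	fake := func(nonce string) (response, error) {
		if nonce != "fresh-nonce" {
			return response{Status: 400, ErrorType: badNonceType, ReplayNonce: "fresh-nonce"}, nil
		}
		return response{Status: 201, ReplayNonce: "next-nonce"}, nil
	}
	resp, err := postWithNonceRetry(fake, "stale-nonce", 3)
	fmt.Println(resp.Status, err)
}
```

Capping the number of attempts keeps a persistent failure, such as the cross-environment case discussed earlier in this thread, from turning into an endless loop.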

emilevauge · 2017-01-25T11:29:37Z

> I think the easiest way to handle this is to retry any request that fails with an "invalid nonce" error, using the new nonce returned along with the response; this should handle pretty much all of the common scenarios without any complicated logic required.

I agree with this. We are still getting this error once in a while on Traefik and I would love lego to retry this kind of request ;)

xenolf · 2017-01-27T15:30:22Z

I will implement this over the weekend.

emilevauge · 2017-01-27T17:33:57Z

@xenolf awesome 😍

ubershmekel · 2017-02-19T02:54:04Z

@mholt or @xenolf can you reopen this issue? Has the retry been implemented? I got this nonce problem when using traefik. The problem was repeating itself until I added a dot to my gmail address to cause a fresh transaction to take place.

time="2017-02-19T02:13:17Z" level=error msg="map[www.example.com:acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce 91WN0.....nE0SindU]" 
time="2017-02-19T02:13:17Z" level=error msg="Error getting ACME certificates [www.example.com] : Cannot obtain certificates map[www.example.com:acme: Error 400 - urn:acme:error:badNonce - JWS has invalid anti-replay nonce 91WN0.....nE0SindU]+v"

mholt · 2017-02-19T03:38:15Z

I'm still not convinced this is not a cross-account reuse problem. Yes, of course you can use the same email address with different ACME CAs, but you cannot use the same account credentials between them.

@ubershmekel See, your error and how you fixed it seem to reinforce this idea. By changing your email address, the client created a new account with the server with new credentials instead of reusing an account created for another server.
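If the cross-account theory is right, one possible client-side safeguard is to key stored account data by CA as well as by email, so that an account registered with staging is never picked up when talking to prod. The sketch below assumes a simple directory layout; `accountDir` and the layout itself are hypothetical and are not taken from lego, Traefik or kube-cert-manager.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// accountDir returns a storage directory that is unique per (CA, email)
// pair by including the host name of the ACME directory URL in the path.
// Hypothetical helper for illustration only.
func accountDir(root, directoryURL, email string) (string, error) {
	u, err := url.Parse(directoryURL)
	if err != nil {
		return "", err
	}
	host := u.Hostname()
	if host == "" {
		return "", fmt.Errorf("directory URL %q has no host", directoryURL)
	}
	return filepath.Join(root, host, email), nil
}

func main() {
	urls := []string{
		"https://acme-staging.api.letsencrypt.org/directory",
		"https://acme-v01.api.letsencrypt.org/directory",
	}
	for _, u := range urls {
		dir, err := accountDir("accounts", u, "me@example.com")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(dir)
	}
}
```

With such a layout, switching the server URL makes the client look in a different directory, find no account there, and register a new one, which has the same effect as deleting the old metadata by hand.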

ubershmekel · 2017-02-19T19:02:10Z

It seems this may be an issue with Let's Encrypt itself.

> We just finished up reverting a CDN config change that was causing this problem. There was caching in some places where there should not be.
>
> ...
>
> I also found that I had to change email address to get letsencrypt to give me new certificates.

letsencrypt/boulder#1217

cpu · 2017-02-20T14:48:25Z

@ubershmekel I followed up on the linked Boulder #1217 thread from Jan 2016 - this is not a related issue with Let's Encrypt itself. As @mholt mentioned, changing contact information is not a fix but likely ends up producing a fresh nonce or somehow otherwise working around the true underlying issue.

tribou mentioned this issue Nov 27, 2016

Error: JWS has invalid anti-replay nonce PalmStoneGames/kube-cert-manager#23

Closed

mholt closed this as completed Nov 28, 2016
