
jwks: start attempting to refresh cache right away #264

Closed
wants to merge 1 commit

Conversation

@dlwyatt commented Jul 26, 2020

The existing code doesn't attempt to refresh the remote keyset cache until a call to Verify() fails against all of the currently cached keys. This can result in a situation where a new key has been present in the remote keyset for some time, but go-oidc doesn't know about it until the first time it encounters a JWT signed with that new key (even if the cache expired some time before that). If the HTTP request to refresh the cache fails at that point, the JWT validation also fails.

This small enhancement still returns successful validations immediately based on cached keys, but starts attempting to renew the cache in the background as soon as it goes stale. That way, even over a spotty network connection, the cache is likely to be renewed before the new key is actually used to sign a JWT.
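
To make the idea concrete, here's a rough sketch of the behavior I'm describing (the names below are made up for illustration; they aren't go-oidc's actual internals or this PR's diff):

```go
package jwkscache

import (
	"context"
	"sync"
	"time"
)

type cachedKeySet struct {
	mu         sync.Mutex
	keys       []string // stand-in for the parsed JWKs
	expiry     time.Time
	refreshing bool

	// fetchRemoteKeys stands in for the HTTP request to the jwks_uri.
	fetchRemoteKeys func(ctx context.Context) ([]string, time.Time, error)
}

// keysFor returns the cached keys immediately and, if the cache has expired,
// starts a single background refresh instead of waiting for a Verify failure.
func (s *cachedKeySet) keysFor() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	if time.Now().After(s.expiry) && !s.refreshing {
		s.refreshing = true
		go func() {
			keys, exp, err := s.fetchRemoteKeys(context.Background())
			s.mu.Lock()
			defer s.mu.Unlock()
			s.refreshing = false
			if err != nil {
				return // swallowed here; Verify still surfaces it on a cache miss
			}
			s.keys, s.expiry = keys, exp
		}()
	}
	return s.keys
}
```

The point is only that the refresh starts as soon as staleness is detected, while verification keeps using whatever keys are already cached.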

@ericchiang (Collaborator)

Can you take a look at #259, which I merged for the V3 branch and which removes the expiry logic?

I think doing more frequent refreshes would be good, but some issues that come to mind are:

  • We don't want to swallow errors
  • This package currently doesn't log anything; all errors are returned to the caller

Ideally NewRemoteKeySet() would just create a background goroutine that occasionally refreshed the key set (maybe once every minute?), but then we'd need to figure out how to surface errors to the caller and potentially back off. Maybe require the caller to pass a logger to create a remote key set?
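
Something like the following, purely as a sketch (none of these names exist in go-oidc, and the interval, fetch hook, and error callback are all assumptions):

```go
package keyset

import (
	"context"
	"time"
)

// RefreshingKeySet is illustrative only; go-oidc's RemoteKeySet has no such option.
type RefreshingKeySet struct {
	fetch func(ctx context.Context) error // stand-in for fetching the jwks_uri
}

// NewRefreshingKeySet starts a background refresh loop tied to ctx. onErr lets
// the caller decide whether to log, record a metric, or ignore failed refreshes.
func NewRefreshingKeySet(ctx context.Context, interval time.Duration,
	fetch func(context.Context) error, onErr func(error)) *RefreshingKeySet {

	ks := &RefreshingKeySet{fetch: fetch}
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return // cancelling the context ends the goroutine
			case <-ticker.C:
				if err := ks.fetch(ctx); err != nil && onErr != nil {
					onErr(err) // surfaced to the caller instead of swallowed
				}
			}
		}
	}()
	return ks
}
```

Tying the goroutine to a context answers the lifetime question, but it does mean more API for the caller to get right.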

As a hack, you can feed a token signed by an invalid key every now and again to force it to refresh the remote keys.

How are you using this package where you're seeing this issue? Is it through another project or are you using go-oidc directly?

@dlwyatt (Author) commented Jul 26, 2020

I'm just starting to use the package and reviewing the code; I'll be using go-oidc directly to verify tokens for an Azure AD app. I haven't actually run into any issues with this; the code just jumped out at me as a place where intermittent network blips hitting the jwks URI could cause easily avoidable errors. It's interesting that the OIDC spec recommends the same approach, though: "The verifier knows to go back to the jwks_uri location to re-retrieve the keys when it sees an unfamiliar kid value." Why wait that long to re-retrieve when the new keys have probably been published at the jwks_uri for a while before the provider actually started signing anything with them?

I agree that refreshing the cache in the background on a schedule would be better than having it driven by client requests. I wasn't sure whether that type of change would be accepted here, so what I did was a minimal modification of the existing code. Refreshing every minute is excessive, though; the refresh interval would typically be around once per day, but that may depend on the provider.

I don't personally mind swallowing errors on the background cache refreshes, because they're eventually going to surface if a caller encounters a signing key that isn't yet in the cache (the same as happens today). The background refresh attempts are just there to get ahead of that, so there are potentially multiple attempts to fetch the latest JWKS before a verification error eventually occurs. Logging is fine too, of course. :)
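
For what it's worth, here's a minimal sketch of why the errors still surface even when background refreshes fail silently (again, hypothetical names rather than go-oidc's internals): on an unfamiliar kid the cache is re-fetched synchronously, so a failing jwks_uri request turns into a verification error for the caller.

```go
package keyset

import (
	"context"
	"fmt"
)

type keyCache struct {
	keysByID map[string]string // kid -> key material (stand-in)
	fetch    func(ctx context.Context) (map[string]string, error)
}

// keyFor looks up a token's kid in the cache; on a miss it re-fetches the
// remote key set synchronously, so a failing jwks_uri request becomes an
// error returned to the caller.
func (c *keyCache) keyFor(ctx context.Context, kid string) (string, error) {
	if k, ok := c.keysByID[kid]; ok {
		return k, nil
	}
	keys, err := c.fetch(ctx)
	if err != nil {
		return "", fmt.Errorf("fetching remote keys: %w", err)
	}
	c.keysByID = keys
	if k, ok := c.keysByID[kid]; ok {
		return k, nil
	}
	return "", fmt.Errorf("no key with id %q in remote key set", kid)
}
```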

@ericchiang (Collaborator)

Good to hear that someone actually reviews their dependencies :)

This strategy was used because it's straightforward. We currently don't have to manage a goroutine's lifetime, which would require more API surface. If you stop using the remote key set, it gets garbage collected.

I think different users will have different opinions about unhandled errors. But if we're going to start making calls on the user's behalf, we probably want a way to surface those to the caller. Some will want to log them with their logging package of choice; others may want to expose them as monitoring metrics.

If this isn't something you're actually hitting, I'm inclined not to optimize it. I've got a limited amount of time to maintain open source projects, and I try to prioritize things that are actively blocking users.

@ericchiang closed this Jul 26, 2020