Duplicate key value violates unique constraint #1719
Never seen this before. Could you set the log level to debug? It would also be interesting to see if this happens with the latest version.
We will set the log level as requested and provide additional logs as soon as it happens again, thank you!
Is log level […]?
Yeah, that's a docs issue.
It happened again today. Below are the logs, enhanced with debug logging. The […]
This is still 1.0.0, right?
It appears that this is caused by two processes trying to refresh the access token at the same time; because the first transaction is not yet committed, the second one fails. It is actually intended that this results in an error condition (although this error isn't really a good symptom of it), because a refresh token is supposed to be redeemed only once. Otherwise one refresh token could be used to create two (three, four, five, …) tokens, and so on.
I've looked around some other projects. The general guideline is to have clients refresh tokens only once. This can be tricky, though, when there is no shared state between the components. It appears that some providers implement a grace period (e.g. 60 seconds) during which the old refresh token returns the same response. I don't think that would solve your problem, though, because your requests are apparently only nanoseconds apart - so fast that the first database transaction hasn't even completed yet. I'd suggest you add a random tick (e.g. -10 ±rnd(0,5)s) to your clients to prevent synchronization issues. Collisions can still occur, of course, but they should be much less common.
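For illustration, a minimal Go sketch of that kind of client-side jitter - the function name and the exact "expiry minus 10s, ±5s" window just mirror the example above; nothing here is prescribed by Hydra:

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// nextRefreshIn returns how long a client should wait before refreshing a
// token that expires in `lifetime`: roughly "expiry minus 10s, ±5s jitter",
// so parallel client instances don't all refresh at the same instant.
func nextRefreshIn(lifetime time.Duration) time.Duration {
	jitter := time.Duration(rand.Int63n(int64(10*time.Second))) - 5*time.Second
	d := lifetime - 10*time.Second + jitter
	if d < 0 {
		d = 0 // token is about to expire, refresh right away
	}
	return d
}

func main() {
	fmt.Println(nextRefreshIn(time.Hour)) // e.g. 59m52.3s
}
```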
Yes, it's still 1.0.0.
No problem, I'm pretty sure that's what's causing it :)
Yes, that sounds reasonable. Could using a newer version fix this issue?
I don't think so. I've also looked into this a bit more. We can't reliably echo the same response twice, because that would mean we'd have to cache the full valid access token somewhere (e.g. the database), which we have explicitly said we won't do. What you could obviously try is reducing the latency between Hydra and the DB - 130ms seems like a lot to me.
What I am stumped by is how two requests with the same request ID were proxied to your Hydra nodes - that should definitely not happen. What does your NGINX config look like for your Hydra routes? If I remember correctly, some old versions of NGINX would even retry non-idempotent upstream requests by default, although I doubt you are on one of those versions. With that said, I noticed you are not injecting UUID v4s, which would make it very unlikely to generate the same ID twice. Finally, double-check your LB config to make sure a client cannot add a request ID header to their request.
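As a hedged sketch of what "injecting UUID v4s" could look like in a Go middleware layer - the header name and the use of github.com/google/uuid are illustrative assumptions, not what this deployment actually runs:

```go
package main

import (
	"net/http"

	"github.com/google/uuid"
)

// requestID unconditionally overwrites any client-supplied X-Request-Id with
// a fresh UUID v4, so IDs are effectively collision-free and clients cannot
// smuggle their own values past the load balancer.
func requestID(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		r.Header.Set("X-Request-Id", uuid.NewString())
		next.ServeHTTP(w, r)
	})
}

func main() {
	http.ListenAndServe(":8080", requestID(http.DefaultServeMux))
}
```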
@aaslamin great to see you here again :) I checked the code; I think the problem is happening because we use the original request ID to track any refresh/access tokens over time. So what's happening is (roughly):
1. Request A arrives with refresh token T; Hydra opens a transaction, invalidates T, and starts persisting the new tokens under the original request ID.
2. Before that transaction commits, request B arrives with the same token T and still sees the old state.
3. Request A commits; request B then tries to insert its new tokens under the same request ID and hits the unique constraint.
What I'm curious about is that the transaction apparently isn't helping here. Instead of a conflict, it should give a 404 (refresh token not found). It might be because the first transaction hasn't committed yet when the second one reads. I'm not sure if there's really a way for us to fix this: if we generated a new request ID, you would end up with two tokens. The only thing we could do is figure out how to return 404 instead of 429, but that would definitely touch some internals, potentially even individually for every database backend.
This is where the request ID causes the 429: https://github.com/ory/fosite/blob/master/handler/oauth2/flow_refresh.go#L159-L160 This is the "fetch from store" I was referring to: https://github.com/ory/fosite/blob/master/handler/oauth2/flow_refresh.go#L135-L140 Actually wondering now if a rollback could cause the old refresh token to come back into the store...
Looks like this can be solved by changing the IsolationLevel: https://golang.org/pkg/database/sql/#IsolationLevel which can be set for transactions: https://golang.org/pkg/database/sql/#TxOptions So basically, by picking another isolation level, this wouldn't return 429 but instead 404 (or rather 401/403 as the appropriate error type).
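A minimal sketch of what that looks like with database/sql - the surrounding function is illustrative; the actual persistence code lives in Hydra's storage layer:

```go
package storage

import (
	"context"
	"database/sql"
)

// refreshTokenTx opens the token-refresh transaction at Repeatable Read.
// Under this isolation level, the second of two concurrent refreshes fails
// on its read/write against the already-updated row instead of surfacing a
// duplicate-key error from the unique index.
func refreshTokenTx(ctx context.Context, db *sql.DB) error {
	tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelRepeatableRead})
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	// ... invalidate the old refresh token and persist the new tokens ...

	return tx.Commit()
}
```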
Yeah, I think by setting the isolation level to […]
Last comment for now - marking this as a bug :)
🍻 cheers!
You mean a 409 (Conflict), right? It's currently returning a 5xx error.
I think you are right 🙏
• The default isolation level in MySQL using InnoDB as the storage engine (the most common, I believe) is Repeatable Read.
• The default isolation level in PostgreSQL is Read Committed.
The behavioral difference in this isolation mode is significant: under Repeatable Read, a transaction whose target row was concurrently updated is rolled back with a serialization error instead of silently seeing the new data.
Source: https://www.postgresql.org/docs/12/transaction-iso.html
For a fix, I would love to see:
• the refresh-token transaction opened with an appropriate isolation level, and
• a test case that redeems the same refresh token concurrently and asserts that only one request succeeds.
This brings up another point: ideally, any first-class supported database in Hydra should have full support for the required isolation levels.
Yeah, 409! :) I think Repeatable Read is an acceptable trade-off for refreshing the tokens, since that operation doesn't happen often for a specific row (which, if I understood correctly, will be locked - not the entire table). I think the test case makes sense; one refresh token should be enough, though. You can probably fire n goroutines that try to refresh the token concurrently, and you should definitely see the error (see the sketch below). Regarding isolation - this really is only required for things where a write depends on a previous read. I'm not sure if setting RepeatableRead for every transaction is performant? But then again, the performance impact is probably minimal while the consistency is much better. Maybe […]
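A rough sketch of such a test - the endpoint URL, client ID, and refresh token are placeholders, and a real test would run against a seeded Hydra instance with proper client authentication:

```go
package oauth2_test

import (
	"net/http"
	"net/url"
	"sync"
	"sync/atomic"
	"testing"
)

func TestConcurrentRefresh(t *testing.T) {
	const (
		n             = 10
		tokenEndpoint = "http://127.0.0.1:4444/oauth2/token" // placeholder
		refreshToken  = "seeded-refresh-token"               // placeholder
	)

	var (
		wg sync.WaitGroup
		ok int64
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// All goroutines redeem the *same* refresh token at once.
			resp, err := http.PostForm(tokenEndpoint, url.Values{
				"grant_type":    {"refresh_token"},
				"refresh_token": {refreshToken},
				"client_id":     {"test-client"}, // placeholder
			})
			if err != nil {
				return
			}
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				atomic.AddInt64(&ok, 1)
			}
		}()
	}
	wg.Wait()

	// Exactly one refresh may succeed; the rest must fail with a 4xx, not a 500.
	if ok != 1 {
		t.Fatalf("expected exactly one successful refresh, got %d", ok)
	}
}
```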
That's an interesting approach, although currently the storage interface doesn't accept transaction options directly. Perhaps, before calling into the storage, fosite could put a suggested isolation level into the request context under a well-known key. Finally, as a best practice, an authorization service that is fosite compliant should check for this key and configure its transaction accordingly. This way we can have flexibility in isolation levels for various flows instead of going all out w/ Repeatable Read everywhere.
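A hypothetical sketch of that handshake - the context key and helper names are invented here for illustration; fosite does not define them:

```go
package storage

import (
	"context"
	"database/sql"
)

type isolationKey struct{}

// WithSuggestedIsolation lets the caller (e.g. a fosite flow) suggest an
// isolation level for whatever transaction the storage opens next.
func WithSuggestedIsolation(ctx context.Context, lvl sql.IsolationLevel) context.Context {
	return context.WithValue(ctx, isolationKey{}, lvl)
}

// beginTx honors the suggestion if present, falling back to the driver default.
func beginTx(ctx context.Context, db *sql.DB) (*sql.Tx, error) {
	opts := &sql.TxOptions{Isolation: sql.LevelDefault}
	if lvl, ok := ctx.Value(isolationKey{}).(sql.IsolationLevel); ok {
		opts.Isolation = lvl // e.g. sql.LevelRepeatableRead for the refresh flow
	}
	return db.BeginTx(ctx, opts)
}
```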
Yeah, I think that's OK. There's actually so much implicit knowledge already when it comes to the storage implementation that, to be honest, I gave up on imposing that in fosite. I think setting the appropriate isolation level in BeginTx in Hydra would be a good solution, plus having a failing test to verify the implementation.
…t handler (#402)

This commit provides the functionality required to address ory/hydra#1719 & ory/hydra#1735 by adding error checking to the RefreshTokenGrantHandler's PopulateTokenEndpointResponse method so it can deal with errors due to concurrent access. This allows the authorization server to render a better error to the user-agent.

The handler no longer returns fosite.ErrServerError in the event the storage no longer holds the refresh token. Instead, a wrapped fosite.ErrNotFound is returned when fetching the refresh token fails due to it no longer being present. This scenario occurs when the user sends two or more requests to refresh using the same token and one request gets into the handler just after the prior request finished and successfully committed its transaction.

Also adds unit test coverage for the transaction error handling logic added to the RefreshTokenGrantHandler's PopulateTokenEndpointResponse method.
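Conceptually, the change boils down to classifying the storage error before deciding what to hand back to the client. A simplified sketch of that idea - this is not the actual patch, and the wrapper function is invented here (the storage interface and error values are as in fosite):

```go
package refresh

import (
	"context"
	"errors"

	"github.com/ory/fosite"
	hoauth2 "github.com/ory/fosite/handler/oauth2"
)

// fetchRefreshSession wraps the storage lookup: if the refresh token row is
// already gone (a concurrent refresh committed first), surface a wrapped
// fosite.ErrNotFound so the server answers with a 4xx instead of a 500.
func fetchRefreshSession(ctx context.Context, store hoauth2.RefreshTokenStorage,
	signature string, session fosite.Session) (fosite.Requester, error) {
	requester, err := store.GetRefreshTokenSession(ctx, signature, session)
	switch {
	case err == nil:
		return requester, nil
	case errors.Is(err, fosite.ErrNotFound):
		// Consumed by a concurrent request that committed first.
		return nil, fosite.ErrNotFound.WithDebug(err.Error())
	default:
		return nil, fosite.ErrServerError.WithDebug(err.Error())
	}
}
```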
Describe the bug
For our project we are using Hydra with a PostgreSQL database. Sometimes, when trying to get an access token by calling the /oauth2/token endpoint, Hydra returns a 500 Internal Server Error and logs the following error message:
level=error msg="An error occurred" debug="pq: duplicate key value violates unique constraint \"hydra_oauth2_access_request_id_idx\": Unable to insert or update resource because a resource with that value exists already" description="The authorization server encountered an unexpected condition that prevented it from fulfilling the request" error=server_error
Hydra seems to try to add an access token request to the database with a request ID which already exists. What we could observe is that every time this happened, multiple (2-3) token requests had been received within a very short time period (<100ms).
Did you experience this at some point or do you have any hint what could be the root cause for it?
We are using an older version (v1.0.0) but I couldn't find anything regarding this in the changelog.
Reproducing the bug
We could not reproduce the bug, not even when sending ~500 token requests (almost) simultaneously.
Environment