SPIRE Server MySQL allowing duplicate entry creations #4329
Comments
We observed it at the following versions: v0.12.3, v1.0.4, v1.6.1
I will update this post ad hoc with the duplication rates we find over various periods of time. Total registrations are consistently in the millions, distributed across a couple dozen separate SPIRE clusters. Version 1.6.1: 2023.07.07 - 1 dupe over 2 hours
I've also observed this today, with PostgreSQL rather than MySQL. It seems fairly reliable to reproduce: start N instances of spire-server with a side-car that creates the same registration entry. This was with a recent HEAD, from sometime this week. I was able to reproduce the issue twice, so if there's a potential fix, I can test it to see if it helps. I should also be able to test MySQL, if that's needed.
I tested using
but for listing entries we build the query ourselves. I'm no database expert, but I'm not sure locking rows would help us much here, since the data is spread across multiple tables. From the database's point of view, all inserts affect independent rows with no relation to each other. Setting the serializable isolation level seems to work.
Using serializable isolation level seems like the right direction (also suggested in previous issue #3467), but we will need to do it in a way that is compatible with SQL databases that don't support serializable isolation level, e.g. Percona XtraDB Cluster (see #2587). Some thoughts that come to mind that could help us find a concrete approach:
I'll see if that works. I should have time for this later this week. Do we have a list of database engines that we support? I wouldn't have thought of Percona, but I can see why it's supported; it's supposed to look like MySQL.
This issue is stale because it has been open for 365 days with no activity.
This issue was closed because it has been inactive for 30 days since being marked as stale.
In my org we have noticed for a while now that duplicate registrations sometimes exist in SPIRE with a MySQL database. At our scale, this is maybe a few dozen duplications a month across 2M+ entries, so not a show-stopping issue but definitely still a bug.
We had some time to investigate deeply today, and we believe we've root-caused it. Credit to @rturner3 as partner in the investigation.
Relevant code: https://github.com/spiffe/spire/blob/727fd183530abc02c8ead0799afccc4d0f667266/pkg/server/datastore/sqlstore/sqlstore.go#L367C2-L367C2
There is a race condition if multiple requesting processes try to make a similar entry in SPIRE.
Normally, a lookup is first done to see if a similar entry (by SPIFFE ID, Parent ID, and Selectors) already exists. If so, we create nothing and just return that entry.
However, since we are using `withWriteTx`, no sufficient lock is acquired for the lookup. Thus, concurrent requests can both be told that no similar entry exists, and both proceed with creation.
We suspect that `withReadModifyWriteTx` is needed instead.
My org intends to try to schedule time to test this fix in our internal fork, observing whether there are any performance hits and whether the issue fully resolves.
However, we're not yet committed to a timeframe here and welcome community members to try to reproduce the issue and test out the fix themselves.
Observed on multiple versions up to our current of 1.6.1. If needed I can attempt to get exact versions :)
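To illustrate the race described above, here is a minimal in-memory sketch. It is not SPIRE's sqlstore code: `store`, `createUnsafe`, `createSerialized`, and `runConcurrent` are hypothetical names, and a mutex stands in for whatever lock or isolation level the database would provide. The unsafe path uses a barrier to force every creator to finish its lookup before any insert happens, which is the interleaving we believe produces the duplicates.

```go
package main

import (
	"fmt"
	"sync"
)

// entryKey is a hypothetical stand-in for (SPIFFE ID, Parent ID, Selectors).
type entryKey struct {
	SpiffeID, ParentID, Selectors string
}

// store is a toy in-memory datastore, not SPIRE's sqlstore.
type store struct {
	mu   sync.Mutex // guards rows for individual reads and writes
	txMu sync.Mutex // held across lookup+insert in the serialized path
	rows []entryKey
}

func (s *store) exists(k entryKey) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	for _, r := range s.rows {
		if r == k {
			return true
		}
	}
	return false
}

func (s *store) insert(k entryKey) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows = append(s.rows, k)
}

// createUnsafe looks up and then inserts with no lock spanning both steps.
// afterLookup lets the demo force the problematic interleaving.
func (s *store) createUnsafe(k entryKey, afterLookup func()) {
	found := s.exists(k)
	afterLookup()
	if !found {
		s.insert(k)
	}
}

// createSerialized holds one lock across the lookup and the insert.
func (s *store) createSerialized(k entryKey) {
	s.txMu.Lock()
	defer s.txMu.Unlock()
	if !s.exists(k) {
		s.insert(k)
	}
}

// runConcurrent starts n creators for the same entry and returns how many
// rows end up stored.
func runConcurrent(n int, serialized bool) int {
	s := &store{}
	k := entryKey{"spiffe://example.org/workload", "spiffe://example.org/agent", "unix:uid:1000"}
	var lookups, done sync.WaitGroup
	lookups.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			if serialized {
				s.createSerialized(k)
				return
			}
			s.createUnsafe(k, func() {
				lookups.Done()
				lookups.Wait() // wait until every creator has done its lookup
			})
		}()
	}
	done.Wait()
	return len(s.rows)
}

func main() {
	fmt.Println("without lock:", runConcurrent(2, false)) // prints "without lock: 2"
	fmt.Println("with lock:", runConcurrent(2, true))     // prints "with lock: 1"
}
```

In a real database the equivalent of `txMu` would have to come from the transaction itself (for example a locking read or a stricter isolation level), since separate spire-server processes cannot share an in-process mutex.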