SPIRE Server fails to connect Percona XtraDB Cluster(PXC) #2587

Closed
hiyosi opened this issue Oct 6, 2021 · 6 comments · Fixed by #2605

Comments

@hiyosi
Contributor

hiyosi commented Oct 6, 2021

  • Version: 1.0.2
  • Platform: x86_64 GNU/Linux
  • Subsystem: server, datastore

When using PXC (server version: 5.7.33-36-57-log) as the datastore, SPIRE Server fails to connect to the datastore and cannot start.

time="2021-10-05T13:29:04+09:00" level=error msg="Unable to rotate X509 CA" error="datastore-sql: Error 1105: Percona-XtraDB-Cluster doesn't recommend using SERIALIZABLE isolation with pxc_strict_mode = ENFORCING" 

cf. https://www.percona.com/doc/percona-xtradb-cluster/LATEST/features/pxc-strict-mode.html#pxc-strict-mode

rel: #2150

@hiyosi
Contributor Author

hiyosi commented Oct 7, 2021

If pxc_strict_mode is set to PERMISSIVE, SPIRE Server starts successfully.

However, the documentation says:

It is recommended to keep PXC Strict Mode set to ENFORCING, because in this case whenever Percona XtraDB Cluster encounters a tech preview feature or an unsupported operation, the server will deny it. This will force you to re-evaluate your Percona XtraDB Cluster configuration without risking the consistency of your data.
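
To check which mode a cluster is actually running in before changing anything, the variable can be read from a client. The following is a minimal sketch, not SPIRE code: `strictMode` and `isEnforcing` are hypothetical helper names, and the caller is assumed to supply a `*sql.DB` opened with a MySQL driver. It only distinguishes ENFORCING (the mode that produced the error above) from everything else.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
	"strings"
)

// strictMode reads the pxc_strict_mode server variable.
// SHOW VARIABLES returns two columns: the variable name and its value.
// On a server without this variable the query yields no rows, so the
// caller sees sql.ErrNoRows.
func strictMode(ctx context.Context, db *sql.DB) (string, error) {
	var name, value string
	err := db.QueryRowContext(ctx, "SHOW VARIABLES LIKE 'pxc_strict_mode'").Scan(&name, &value)
	if err != nil {
		return "", err
	}
	return value, nil
}

// isEnforcing reports whether the given pxc_strict_mode value is ENFORCING,
// the mode named in the error message reported in this issue.
// Other modes are deliberately not classified here.
func isEnforcing(value string) bool {
	return strings.EqualFold(strings.TrimSpace(value), "ENFORCING")
}

func main() {
	// No database is needed to exercise the pure helper.
	fmt.Println(isEnforcing("ENFORCING"), isEnforcing("PERMISSIVE"))
}
```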

@hiyosi
Contributor Author

hiyosi commented Oct 11, 2021

We want to use the recommended configuration as far as possible.

azdagron added a commit to azdagron/spire that referenced this issue Oct 26, 2021
We recently transitioned our read-modify-write operations to use the
SERIALIZABLE isolation level for MySQL due to MySQL's weaker guarantees
(relative to PostgreSQL and SQLite3) for the REPEATABLE READ isolation
level, which was causing data loss.

However, doing so broke Percona XtraDB Cluster, which does not support
explicit row-locking (including the SERIALIZABLE isolation level) except
experimentally (you have to turn off "strict" mode).

This change restores the isolation level for read-modify-write
operations to REPEATABLE READ but changes the "reads" to do a
SELECT..FOR UPDATE, which provides the implicit row-level locking we
require to protect read-modify-write transactions from data loss.

This is only done for MySQL, even though PostgreSQL supports it, since
the isolation level alone is sufficient there.

Fixes: spiffe#2587

Signed-off-by: Andrew Harding <aharding@vmware.com>
@azdagron
Member

As far as I can tell, "SELECT ... FOR UPDATE" is supported and is a viable alternative to SERIALIZABLE. I have a fix in #2605. I tested it against a Percona XtraDB Cluster container, but it would be good to have somebody else confirm!
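
For readers unfamiliar with the pattern, here is a rough sketch of a read-modify-write transaction that stays at REPEATABLE READ and takes its lock through the read itself, as the commit message describes. It is an illustration only, not the code in #2605: the `counters` table, the `forUpdate` and `incrementCounter` names, and the `databaseType` string are all assumptions made for the example, and the `?` placeholders are MySQL-style.

```go
package main

import (
	"context"
	"database/sql"
	"fmt"
)

// forUpdate appends a row-locking clause to a read query when the backend
// is MySQL. Other backends get the query back unchanged, mirroring the
// commit message's note that the change is applied for MySQL only.
func forUpdate(query, databaseType string) string {
	if databaseType == "mysql" {
		return query + " FOR UPDATE"
	}
	return query
}

// incrementCounter is a hypothetical read-modify-write operation.
// The transaction runs at REPEATABLE READ; on MySQL the initial SELECT
// locks the row so a concurrent transaction cannot overwrite the update.
func incrementCounter(ctx context.Context, db *sql.DB, databaseType string, id int64) error {
	tx, err := db.BeginTx(ctx, &sql.TxOptions{Isolation: sql.LevelRepeatableRead})
	if err != nil {
		return err
	}
	// After a successful Commit this returns sql.ErrTxDone, which is ignored.
	defer tx.Rollback()

	var value int64
	read := forUpdate("SELECT value FROM counters WHERE id = ?", databaseType)
	if err := tx.QueryRowContext(ctx, read, id).Scan(&value); err != nil {
		return err
	}
	if _, err := tx.ExecContext(ctx, "UPDATE counters SET value = ? WHERE id = ?", value+1, id); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	// Show the query text that would be sent to a MySQL backend.
	fmt.Println(forUpdate("SELECT value FROM counters WHERE id = ?", "mysql"))
}
```

Keeping the locking decision in a small helper like `forUpdate` makes the per-backend difference easy to see and to test without a database connection.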

@hiyosi
Contributor Author

hiyosi commented Oct 27, 2021

Thank you very much for fixing it.
I'll confirm the fix as soon as possible!

azdagron added a commit that referenced this issue Oct 27, 2021
@hiyosi
Contributor Author

hiyosi commented Oct 28, 2021

@azdagron
I've confirmed that the spire-server runs correctly in my environment as well! Thanks for the great fix.

@azdagron
Member

Awesome! Thank you for confirming, @hiyosi!
