Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Silent mode on 1.19.0+ and MySQL may fail #849

Closed
robotdan opened this issue Sep 5, 2020 · 18 comments
Closed

Silent mode on 1.19.0+ and MySQL may fail #849

robotdan opened this issue Sep 5, 2020 · 18 comments
Assignees
Labels
bug Something isn't working superseded

Comments

@robotdan
Copy link
Member

robotdan commented Sep 5, 2020

Silent mode on 1.19.0+ and MySQL may fail

Description

It is unclear if this is a bug in FusionAuth or the db configuration.But a user expererienced the following after updating to 1.19.1.

java.lang.IllegalStateException: Unable to capture database lock. This indicates that the database either doesn't support locks or is misconfigured

In version >= 1.19.0 we support silent mode db upgrades even when using runtime mode production. This allows FusionAuth optionally handle the db schema updates even in a production multi-node environment.

In MySQL we do this with a db lock, obtained using the following statement:

SELECT GET_LOCK('fusionauth-maintenance-mode', -1);

This MySQL function has been around at least since 5.6, I didn't look at older documentation. In this case the user is on version 8.0.21.

https://dev.mysql.com/doc/refman/8.0/en/locking-functions.html#function_get-lock

We throw the IllegalStateException because the SELECT statement did not return a result, or the result was not equal to 1.

Affects versions

1.19.0+

Steps to reproduce

Using MySQL

Update from pre 1.19.0 to 1.19.0 or 1.19.1 using silent mode in runtime mode for development or production. If in production runtime mode you'll need to also explicitly enable silent mode by using environment variable FUSIONAUTH_APP.SILENT_MODE=true or adding fusionaut-app.silent-mode=true to your fusionauth.properties file.

Expected behavior

No error, for the db lock to work.

Platform

MySQL version 8.0.21

Additional context

Add any other context about the problem here.

@robotdan robotdan added the bug Something isn't working label Sep 5, 2020
@markschmid
Copy link

markschmid commented Sep 6, 2020

Here's just a little more context:

FA and the database are both run as Docker containers.

If I manually connect to the fusionauth database using the same credentials used in the docker compose files (DATABASE_USERNAME and DATABASE_PASSWORD) and run SELECT GET_LOCK('fusionauth-maintenance-mode', -1);, I get 1 as a response. If I run SELECT RELEASE_LOCK('fusionauth-maintenance-mode');, I get 1 as well. If I run the release command again, I get NULL. So this locking business seems to work in general, on this database instance.

@robotdan
Copy link
Member Author

robotdan commented Sep 8, 2020

@markschmid is this a single node configuration of FusionAuth?

@markschmid
Copy link

@robotdan Yes, it's a single node configuration on Docker. These are the env vars that are used:

    environment:
      DATABASE_URL: jdbc:mysql://database:3306/fusionauth?serverTimezone=CET
      DATABASE_ROOT_USER: root
      DATABASE_ROOT_PASSWORD: <rootpw>
      DATABASE_USERNAME: fusionauth
      DATABASE_PASSWORD: <fusionauthpw>
      FUSIONAUTH_APP_MEMORY: 256M
      FUSIONAUTH_APP_URL: http://fusionauth:9011
      FUSIONAUTH_APP_RUNTIME_MODE: development
      SEARCH_TYPE: database

@robotdan
Copy link
Member Author

robotdan commented Sep 12, 2020

I think this is likely the same issue as #857.

If it is, the lock failure is a side effect of the driver not being loaded in time.

Here is the stack trace from issue #857

2020-09-12 9:58:56.701 AM ERROR com.inversoft.maintenance.db.DatabaseSilentConfigurationWorkflowTask - Encountered an error while running silent mode
java.lang.IllegalStateException: Unable to capture database lock. This indicates that the database either doesn't support locks or is misconfigured.
        at com.inversoft.maintenance.db.JDBCMaintenanceModeDatabaseService.lockDatabase(JDBCMaintenanceModeDatabaseService.java:284)
        at com.inversoft.maintenance.db.DatabaseSilentConfigurationWorkflowTask.perform(DatabaseSilentConfigurationWorkflowTask.java:43)
        at com.inversoft.maintenance.DefaultMaintenanceModeSilentConfigurationWorkflow.perform(DefaultMaintenanceModeSilentConfigurationWorkflow.java:40)
        at com.inversoft.maintenance.servlet.MaintenanceModePrimeServletContextListener.contextInitialized(MaintenanceModePrimeServletContextListener.java:53)
        at io.fusionauth.app.primeframework.FusionAuthAppPrimeServletContextListener.contextInitialized(FusionAuthAppPrimeServletContextListener.java:28)
        at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4701)
        at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5167)
        at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
        at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1412)
        at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1402)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
        at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: java.sql.SQLException: No suitable driver found for jdbc:mysql://something.ap-southeast-1.rds.amazonaws.com:3306/mysql
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:702)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:228)
        at com.inversoft.maintenance.db.JDBCMaintenanceModeDatabaseService.lockDatabase(JDBCMaintenanceModeDatabaseService.java:267)
        ... 13 common frames omitted

I was able to recreate this one, and there is a case where we attempt to lock the database before we have downloaded the MySQL driver. So the root cause is the driver is not found.

@robotdan
Copy link
Member Author

Closing for now, I think this is the same root issue as #857 . If we encounter it again, we can re-open.

@markschmid
Copy link

I can confirm: it now works. Upgraded from 1.18.8 to 1.19.6 without any hassle. Thanks.

@acatalina
Copy link

acatalina commented Oct 19, 2020

hey @robotdan, sorry to comment on a closed issue but just triple checking... I am having the same issue (Unable to capture database lock) on 1.19.8, the only difference is that I am trying to connect to a postgresql instance. Just wondering if it is possible that the fix was only for mysql? No idea really how it was fixed in the end. Thanks!

@robotdan
Copy link
Member Author

@acatalina not sure, please open a new issue and provide any debug information you have. Thanks!

@acatalina
Copy link

Oh, cool, cheers. I'll investigate a little bit further and create an issue. Thanks!

@whizzygeeks
Copy link

Facing same issue with Postgresql as JDBC connection

2021-11-25 12:12:55.513 PM ERROR com.inversoft.maintenance.db.DatabaseSilentModeWorkflowTask - Encountered an error while running silent mode
java.lang.IllegalStateException: Unable to capture database lock. This indicates that the database either doesn't support locks or is misconfigured.
at com.inversoft.maintenance.db.JDBCMaintenanceModeDatabaseService.lockDatabase(JDBCMaintenanceModeDatabaseService.java:302)
at com.inversoft.maintenance.db.DatabaseSilentModeWorkflowTask.perform(DatabaseSilentModeWorkflowTask.java:43)
at com.inversoft.maintenance.DefaultMaintenanceModeWorkflow.performSilentConfiguration(DefaultMaintenanceModeWorkflow.java:47)
at com.inversoft.maintenance.servlet.MaintenanceModePrimeServletContextListener.contextInitialized(MaintenanceModePrimeServletContextListener.java:52)
at io.fusionauth.app.primeframework.FusionAuthAppPrimeServletContextListener.contextInitialized(FusionAuthAppPrimeServletContextListener.java:26)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4705)
at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5171)
at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:183)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1412)
at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1402)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
at java.base/java.lang.Thread.run(Thread.java:832)
Caused by: org.postgresql.util.PSQLException: FATAL: password authentication failed for user "postgres"
at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:613)
at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:161)
at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:213)
at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:51)
at org.postgresql.jdbc.PgConnection.(PgConnection.java:223)
at org.postgresql.Driver.makeConnection(Driver.java:465)
at org.postgresql.Driver.connect(Driver.java:264)
at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:677)
at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:228)
at com.inversoft.maintenance.db.JDBCMaintenanceModeDatabaseService.lockDatabase(JDBCMaintenanceModeDatabaseService.java:284)
... 13 common frames omitted

@robotdan
Copy link
Member Author

Since we closed this one out and it was originally for MySQL, @whizzygeeks can you open a new issue please.

Then add as many details as possible, such as what version of Postgres, what platform (Linux, Windows, macOS, AWS, Azure, etc), more details the better. Thanks!

@Mistic92
Copy link

Mistic92 commented Dec 7, 2021

I have the same issue using latest docker image and trying to connect to PostgreSQL 12 on GCP Cloud SQL

@robotdan
Copy link
Member Author

robotdan commented Dec 7, 2021

Thanks for the update @Mistic92 , re-opening.

Still not sure how to debug this one though. We are essentially using the same strategy that Google Flyway (now managed by RedGate it appears) uses for db cross node locking. Perhaps we need to see if they are any fixes or changes in their strategy since we first built this.

https://github.com/flyway/flyway/

It think this is the PostgreSQL locking code in FlyWay.
https://github.com/flyway/flyway/blob/master/flyway-core/src/main/java/org/flywaydb/core/internal/database/postgresql/PostgreSQLAdvisoryLockTemplate.java

Here is essentially what we are doing in MySQL and PostgreSQL respectively.

// MySQL
SELECT GET_LOCK('fusionauth-maintenance-mode', -1);
// PosgreSQL
SELECT PG_ADVISORY_LOCK(9833609);

If we take an exception in either of these cases, we'll throw this:

throw new IllegalStateException("Unable to capture database lock. This indicates that the database either doesn't support locks or is misconfigured.", e);

@voidmain
Copy link
Member

voidmain commented Dec 9, 2021

The stack trade from @whizzygeeks looks like a different issue. It has this line:

Caused by: org.postgresql.util.PSQLException: FATAL: password authentication failed for user "postgres"

This implies that the password for the "root" user is invalid. If you have defined a "root" user, then we will use that when attempting to capture the lock of the entire database. If you don't define a "root" user, then we will use the "ordinary" user to attempt to lock the database.

Can you all provide snippets of your configuration files and environment variables specifically around the database configuration? This will help us identify if the issue is configuration or implementation.

@endyrocket
Copy link

Facing the same issue as @whizzygeeks. Got this exception during upgrade from 1.30.2 to 1.32.1. Rolling back to 1.30.2 fix the issue.

@robotdan
Copy link
Member Author

@endyrocket can you provide the stack trace of you have it, the version of Postgres and where it is running (local, GCP, RDS, etc)

Thanks!

@KCJwhite
Copy link

KCJwhite commented Jun 17, 2022

Moved to new issue: #1762

Linking the 2 since many people seem to be having the same problem.

@bwang221
Copy link

This still happens, Postgres 14.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working superseded
Projects
None yet
Development

No branches or pull requests

9 participants