Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Warning about incorrect database collation/ctype should have a link on how to fix it #6831

Closed
turt2live opened this issue Feb 3, 2020 · 11 comments

Comments

@turt2live
Copy link
Member

[homeserver_1] 2020-02-03 12:13:06,654 - synapse.storage.engines.postgres - 65 - WARNING - None- Database has incorrect collation of 'en_US.UTF-8'. Should be 'C'
[homeserver_1] 2020-02-03 12:13:06,655 - synapse.storage.engines.postgres - 70 - WARNING - None- Database has incorrect ctype of 'en_US.UTF-8'. Should be 'C'                        

These unfortunately don't say how I should be fixing it :(

@richvdh
Copy link
Member

richvdh commented Feb 3, 2020

@udaybansal19
Copy link
Contributor

Can i work on this?

@richvdh
Copy link
Member

richvdh commented Feb 24, 2020

If you like, sure.

@turt2live
Copy link
Member Author

I'm sure you read https://github.com/matrix-org/synapse/blob/develop/UPGRADE.rst#upgrading-to-v1100 carefully?

Not carefully, but did skim it and thought it would be fine. Updated Synapse, restarted, got the message, opened the issue. A bit of goldfish brain ._.

https://github.com/matrix-org/synapse/blob/develop/docs/postgres.md#fixing-incorrect-collate-or-ctype was the thing I was looking for

@udaybansal19
Copy link
Contributor

Should i add https://github.com/matrix-org/synapse/blob/develop/docs/postgres.md#fixing-incorrect-collate-or-ctype to the warning. I have just added docs/postgres.md
logger.warning( "Database has incorrect collation of %r. Should be 'C'\n" "See docs/postgres.md for more information.", collation )

@jakehemmerle
Copy link

Looking for first issues. I assume this should be closed?

@turt2live
Copy link
Member Author

Looks like it, yea

@r4dh4l
Copy link

r4dh4l commented Jul 8, 2022

Sorry for bumping but upgrading an 1.47.1 installation I stumbled over the collation/ctype problem.

Unfortunately I'm not enough advanced to solve the problem with the information provided by https://matrix-org.github.io/synapse/latest/postgres.html#fixing-incorrect-collate-or-ctype ("dump the database and recreate it with the correct locale parameter (as shown above)").
Could someone provide an example like the one provided in https://matrix-org.github.io/synapse/latest/upgrade#adding-a-new-unique-index-to-state_group_edges-could-fail-if-your-database-is-corrupted (-> "Expand this section for steps to resolve this problem")?

@DMRobertson
Copy link
Contributor

I don't have an explicit command for you, but here are some clues:

  1. Recreate it with the correct locale: it's referring to the instructions here https://matrix-org.github.io/synapse/latest/postgres.html#set-up-database
  2. There is a good description of dumping and restoring a database here: https://www.postgresql.org/docs/current/backup-dump.html
  3. There's also https://www.postgresql.org/docs/current/app-pgdump.html for details on dumping the database.

I'd suggest:

  • dump the database to a plain text sql file using pg_dump
  • make a new database called synapse2 following the instructions from (1) above.
  • feed the dump file to psql, pointing at the database synapse2.

For more advice, I think your best bet would be the #synapse-admins:matrix.org room.

@r4dh4l
Copy link

r4dh4l commented Aug 7, 2022

Here is a "walkthrough" for others who had to do such a fix for the first time. It is not exactly what @DMRobertson suggested but it worked for me:

How to fix incorrect database collation/ctype for upgrade to Synapse 1.56.0

Check the Synapse database status

Example for right collation/ctype

root@vm-synapse:~$ su - postgres
Password: 
postgres@vm-synapse:~$ psql --dbname=synapse --username=synapse
Password for user synapse: 
psql (11.16 (Debian 11.16-0+deb10u1))
Type "help" for help.

synapse=> SELECT datcollate AS collation FROM pg_database WHERE datname = current_database();
 collation 
-----------
 C
(1 row)

synapse=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 synapse   | synapse  | UTF8     | C           | C           | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

synapse=> 

Because the SQL query lists "C" for column "Collate" and "Ctype" in the row "synapse" there is nothing to fix!

Example for wrong collation/ctype

root@vm-synapse:~$ su - postgres
Password: 
postgres@vm-synapse:~$ psql --dbname=synapse --username=synapse
Password for user synapse: 
psql (11.16 (Debian 11.16-0+deb10u1))
Type "help" for help.

synapse=> SELECT datcollate AS collation FROM pg_database WHERE datname = current_database();
  collation  
-------------
 en_US.UTF-8
(1 row)

synapse=> \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 synapse   | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | synapse=CTc/postgres
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

synapse=> 

Because the SQL query lists not "C" for "Collate" and column "Ctype" in the row "synapse" you need to recreate the database with correct C type.

Fix the Synapse database

Stop Synapse

systemctl stop synapse.service

systemctl status synapse.service

(and other processes accessing the Synapse database)

Backup the Synapse database

Check size of the Synapse databse:

su - postgres
psql --dbname=synapse --username=synapse
synapse=> \l+ synapse
                                                   List of databases
  Name   |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace | Description 
---------+----------+----------+-------------+-------------+-----------------------+---------+------------+-------------
 synapse | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +| 3799 MB | pg_default | 
         |          |          |             |             | postgres=CTc/postgres+|         |            | 
         |          |          |             |             | synapse=CTc/postgres  |         |            | 
(1 row)

synapse=> 

As root acter confirming you have enough free disk space in the backup directory:

df -h

pg_dump -U postgres synapse > "/path/to/backupdir/synapse-db.dump2022-08-07.sql"

Delete the Synapse database

As user postgres:

su - postgres
psql -l
dropdb synapse;
psql -l

Example:

postgres@vm-synapse:~$ psql -l
Password for user postgres: 
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 synapse   | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | synapse=CTc/postgres
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres@vm-synapse:~$ dropdb synapse;
Password: 
postgres@vm-synapse:~$ psql -l
Password for user postgres: 
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(3 rows)

postgres@vm-synapse:~$ 

Recreate the Synapse database

As user postgres:

createdb --encoding=UTF8 --locale=C --template=template0 --owner=synapse synapse

psql -l

Example:

postgres@vm-synapse:~$ createdb --encoding=UTF8 --locale=C --template=template0 --owner=synapse synapse
Password: 
postgres@vm-synapse:~$ psql -l
Password for user postgres: 
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | 
 synapse   | synapse  | UTF8     | C           | C           | 
 template0 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

postgres@vm-synapse:~$ 

Import the dump of the old Synapse database to the new created synapse database

As user postgres:

psql --dbname=synapse --username=synapse < "/path/to/backupdir/synapse-db.dump2022-08-07.sql"

psql --dbname=synapse --username=synapse

synapse=> \l+ synapse

\q

Example:

postgres@vm-synapse:~$ psql --dbname=synapse --username=synapse < "/path/to/backupdir/synapse-db.dump2022-08-07.sql"
...
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
ALTER TABLE
postgres@vm-synapse:~$ psql --dbname=synapse --username=synapse
Password for user synapse: 
psql (11.16 (Debian 11.16-0+deb10u1))
Type "help" for help.

synapse=> \l+ synapse
                                            List of databases
  Name   |  Owner  | Encoding | Collate | Ctype | Access privileges |  Size   | Tablespace | Description 
---------+---------+----------+---------+-------+-------------------+---------+------------+-------------
 synapse | synapse | UTF8     | C       | C     |                   | 2943 MB | pg_default | 
(1 row)

synapse=> \q
postgres@vm-synapse:~$ 

The sizes listed by psql were different in my case but comparing a dump of the new databse with the former one showed the same size so I think everything went well.

@lamoboos223
Copy link

you should create the database with the proper definition

psql -U postgres
CREATE DATABASE synapse WITH LC_CTYPE = 'C' LC_COLLATE='C' TEMPLATE='template0';

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants