-
Notifications
You must be signed in to change notification settings - Fork 136
PostgresqlAndCollation
With drop of support for RHEL7 we had to upgrade our DB hosts to Alma9. Here I share notes on issues we (Fermilab) have run into.
Fermilab runs 3 separate dCache instances (2 CMS and 1 "public" - multu-VO instance. Public being the largest). Prior to June 30, 2024 all databases were running on SL7 host, postgresql version 11. The database hosts had to be upgraded to Alma9 with simultaneous upgrade of postgresql to version 15. In some cases switching to new hardware.
The current host running chimera DB is slated to be replaced with new hardware. To minimize downtime the following plan was executed:
- short downtime to run
pg_upgrade
from 11 to 15 - setup stream replication to new Alma9 host
- one day just promote replica on new Alma9 host to be the master
This worked seemingly well, but on attempt to connect to replica db the following warning was seen:
# psql -U postgres chimera
WARNING: database "chimera" has a collation version mismatch
DETAIL: The database was created using collation version 2.17, but the operating system provides version 2.34.
HINT: Rebuild all objects in this database that use the default collation and run ALTER DATABASE chimera REFRESH COLLATION VERSION, or build PostgreSQL with the right library version.
psql (15.7)
Type "help" for help.
chimera=#
The replication seem to be working (as evidenced by comparing master and replica and observing data catching up). The issue is due to change in glibc collation between EL7 and EL8 (and possibly 9). The problem is summarized here:
https://wiki.postgresql.org/wiki/Locale_data_changes
Apparently it affects indexed string columns.
The same issue will be encountered if you just upgrade your DB host from SL7 to EL{8,9}.
So, there are three ways of handling postgresql when upgrading OS:
- do dump before OS upgrade / restore after OS upgrade (may take a long time).
- REINDEX the indexes after running OS upgrade. This is faster than dump/restore.
- use logical replication
We used the following script to reindex indexes:
#!/bin/bash
QUERY="SELECT DISTINCT \
indrelid::regclass::text, \
indexrelid::regclass::text, \
collname, pg_get_indexdef(indexrelid) \
FROM (SELECT indexrelid, indrelid, indcollation[i] coll \
FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s \
JOIN pg_collation c ON coll=c.oid \
WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX')"
psql -t -F ' ' -A -U postgres chimera -c "${QUERY}" 2>/dev/null | while read table index collname ref
do
t0=`date +"%s"`
psql -t -F ' ' -A -U postgres chimera -c "REINDEX INDEX ${index}" 2>/dev/null
t1=`date +"%s"`
dt=$((t1-t0))
echo "${db} ${index} Took ${dt} seconds"
psql -t -F ' ' -A -U postgres chimera -c "ALTER DATABASE $db REFRESH COLLATION VERSION"
done
The indexes on chimera DB that need to be reindexed:
chimera=# SELECT DISTINCT indexrelid::regclass::text as name, collname, pg_get_indexdef(indexrelid)
FROM (SELECT indexrelid, indrelid, indcollation[i] coll FROM pg_index, generate_subscripts(indcollation, 1) g(i)) s
JOIN pg_collation c ON coll=c.oid
WHERE collprovider IN ('d', 'c') AND collname NOT IN ('C', 'POSIX');
name | collname | pg_get_indexdef
--------------------------------------+----------+----------------------------------------------------------------------------------------------------------------------
i_locationinfo_trash_itype_ilocation | default | CREATE INDEX i_locationinfo_trash_itype_ilocation ON public.t_locationinfo_trash USING btree (itype, ilocation)
i_xattr_pkey | default | CREATE UNIQUE INDEX i_xattr_pkey ON public.t_xattr USING btree (inumber, ikey)
pk_databasechangelog | default | CREATE UNIQUE INDEX pk_databasechangelog ON public.databasechangelog USING btree (id, author, filename)
t_dirs_pkey | default | CREATE UNIQUE INDEX t_dirs_pkey ON public.t_dirs USING btree (iparent, iname)
t_inodes_pkey | default | CREATE UNIQUE INDEX t_inodes_pkey ON public.t_inodes USING btree (ipnfsid)
t_labels_labelname_key | default | CREATE UNIQUE INDEX t_labels_labelname_key ON public.t_labels USING btree (labelname)
t_locationinfo_pkey | default | CREATE UNIQUE INDEX t_locationinfo_pkey ON public.t_locationinfo USING btree (inumber, itype, ilocation)
t_locationinfo_trash_pkey | default | CREATE UNIQUE INDEX t_locationinfo_trash_pkey ON public.t_locationinfo_trash USING btree (ipnfsid, itype, ilocation)
t_tags_pkey | default | CREATE UNIQUE INDEX t_tags_pkey ON public.t_tags USING btree (inumber, itagname)
(9 rows)
These indexes can be reindexed in parallel. In our case (1.2B entries in t_inodes
table running REINDEX in parallel took about 2 hours.