This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
psycopg2.errors.UniqueViolation: could not create unique index "receipts_graph_unique_index" when upgrading from <1.68.0 to >=1.70.0 #14406
Labels
A-Background-Updates
Filling in database columns, making the database eventually up-to-date
A-Database
DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db
O-Occasional
Affects or can be seen by some users regularly or most users rarely
S-Minor
Blocks non-critical functionality, workarounds exist.
T-Defect
Bugs, crashes, hangs, security vulnerabilities, or other reported issues.
X-Release-Blocker
Must be resolved before making a release
Comments
Next steps
|
squahtx added the S-Minor, T-Defect, A-Database, A-Background-Updates and O-Occasional labels on Nov 10, 2022
It's possible I should have done this in #13760. I removed |
squahtx pushed a commit that referenced this issue on Nov 15, 2022:
Before creating the `receipts_graph_unique_index` and `receipts_linearized_unique_index` unique indexes, we have to clean up any duplicate receipts that may have crept in due to #14406.

Signed-off-by: Sean Quah <seanq@matrix.org>
This was referenced Nov 15, 2022
squahtx pushed a commit that referenced this issue on Nov 16, 2022:
To perform an emulated upsert into a table safely, we must either:
 * lock the table,
 * be the only writer upserting into the table
 * or rely on another unique index being present.

When the 2nd or 3rd cases were applicable, we previously avoided locking the table as an optimization. However, as seen in #14406, it is easy to slip up when adding new schema deltas and corrupt the database.

Since #13760, Synapse has required SQLite >= 3.27.0, which has support for native upserts. This means that we now only perform emulated upserts while waiting for background updates to add unique indexes.

Since emulated upserts are far less frequent now, let's remove the option to skip locking tables, so that we don't shoot ourselves in the foot again.

Signed-off-by: Sean Quah <seanq@matrix.org>
squahtx added a commit that referenced this issue on Nov 28, 2022:
To perform an emulated upsert into a table safely, we must either:
 * lock the table,
 * be the only writer upserting into the table
 * or rely on another unique index being present.

When the 2nd or 3rd cases were applicable, we previously avoided locking the table as an optimization. However, as seen in #14406, it is easy to slip up when adding new schema deltas and corrupt the database.

The only time we lock when performing emulated upserts is while waiting for background updates on postgres. On sqlite, we do no locking at all.

Let's remove the option to skip locking tables, so that we don't shoot ourselves in the foot again.

Signed-off-by: Sean Quah <seanq@matrix.org>
H-Shay pushed a commit that referenced this issue on Dec 13, 2022, with the same commit message as the Nov 28, 2022 commit above.
Seen as #14123
When upgrading Synapse from a version older than 1.68.0, the `receipts_graph_unique_index` background update may fail with a `psycopg2.errors.UniqueViolation` error: could not create unique index "receipts_graph_unique_index".

The series of schema deltas involved are:
 * `21/receipts.sql`
   Creates the `receipts_graph(room_id, receipt_type, user_id, event_ids, data)` table with `CONSTRAINT receipts_graph_uniqueness UNIQUE (room_id, receipt_type, user_id)`, i.e. there is one receipt for each user for each room.
 * `72/07thread_receipts.sql.postgres` (1.68.0)
   Adds a `thread_id` column to `receipts_graph`, along with `CONSTRAINT receipts_graph_uniqueness_thread UNIQUE (room_id, receipt_type, user_id, thread_id);`, i.e. there is one receipt for each user for each thread in a room. At this point the existing `(room_id, receipt_type, user_id)` constraint is too restrictive.
 * `72/08thread_receipts.sql` (1.68.0)
   Adds the `receipts_graph_unique_index` background update, which adds a `(room_id, receipt_type, user_id)` constraint `WHERE thread_id IS NULL`, i.e. there is one non-thread receipt for each user for each room.
 * `73/08thread_receipts_non_null.sql.postgres` (1.70.0)
   Drops the `receipts_graph_uniqueness` constraint, allowing thread receipts to work.

sqlite takes a similar, equally confusing path.
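A separate index for non-thread receipts is needed because a `UNIQUE` constraint that includes `thread_id` treats `NULL` values as distinct from one another (this is the behaviour in SQLite and the default in PostgreSQL). The following is a minimal sketch of that effect using Python's `sqlite3` module. The table is a simplified stand-in for `receipts_graph`, and the function name and sample values are made up for illustration; none of this is Synapse code.

```python
import sqlite3


def count_null_thread_duplicates() -> int:
    """Insert the same non-thread receipt twice and count the resulting rows."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for receipts_graph once only the four-column
    # constraint remains (assumption: column types are illustrative).
    conn.execute(
        """
        CREATE TABLE receipts_graph (
            room_id TEXT NOT NULL,
            receipt_type TEXT NOT NULL,
            user_id TEXT NOT NULL,
            thread_id TEXT,
            event_ids TEXT NOT NULL,
            CONSTRAINT receipts_graph_uniqueness_thread
                UNIQUE (room_id, receipt_type, user_id, thread_id)
        )
        """
    )
    # Hypothetical non-thread receipt: thread_id is NULL.
    row = ("!room:example.org", "m.read", "@alice:example.org", None, '["$event"]')
    for _ in range(2):
        # Both inserts succeed, because two NULL thread_ids are never
        # considered equal by the UNIQUE constraint.
        conn.execute("INSERT INTO receipts_graph VALUES (?, ?, ?, ?, ?)", row)
    (count,) = conn.execute("SELECT COUNT(*) FROM receipts_graph").fetchone()
    conn.close()
    return count
```

Calling `count_null_thread_duplicates()` shows that the four-column constraint alone lets two identical non-thread receipts coexist.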
The window where there is no unique constraint
Since background updates are run asynchronously, the `receipts_graph_unique_index` background update may run after the last schema delta, leaving a window where there is no unique constraint on `(room_id, receipt_type, user_id)` for `NULL` `thread_id`s.

Unsafe upserts
But that isn't the bug. We have logic to deal with this window. See synapse/synapse/storage/database.py, lines 90 to 100 in c3a4780.

When one of these background updates is in progress, all our `simple_upsert*` operations are done manually without relying on unique constraints. And we don't upsert into `receipts_graph` with handwritten SQL anywhere.

Emulated upsert internals
The emulated upsert first tries an `UPDATE`, then an `INSERT` if the `UPDATE` modified 0 rows.

The default isolation level in Synapse is REPEATABLE READ, which does not prevent the race where two upserts try to insert the same row at the same time.
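The race can be sketched deterministically by interleaving the two halves of the upsert by hand. This is only an illustration, assuming a simplified table with no unique constraint: it uses a single in-memory SQLite connection rather than two concurrent PostgreSQL transactions, and the helper names (`try_update`, `insert`, `emulated_upsert`) and sample values are hypothetical, not Synapse's.

```python
import sqlite3

# Hypothetical key identifying a single non-thread receipt.
KEY = ("!room:example.org", "m.read", "@alice:example.org")


def make_table() -> sqlite3.Connection:
    """Create a simplified receipts_graph with no unique constraint at all."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE receipts_graph ("
        "room_id TEXT, receipt_type TEXT, user_id TEXT, "
        "thread_id TEXT, event_ids TEXT)"
    )
    return conn


def try_update(conn: sqlite3.Connection, event_ids: str) -> int:
    """First half of the emulated upsert: returns the number of rows updated."""
    cur = conn.execute(
        "UPDATE receipts_graph SET event_ids = ? "
        "WHERE room_id = ? AND receipt_type = ? AND user_id = ? "
        "AND thread_id IS NULL",
        (event_ids, *KEY),
    )
    return cur.rowcount


def insert(conn: sqlite3.Connection, event_ids: str) -> None:
    """Second half of the emulated upsert: insert a new non-thread receipt."""
    conn.execute(
        "INSERT INTO receipts_graph "
        "(room_id, receipt_type, user_id, thread_id, event_ids) "
        "VALUES (?, ?, ?, NULL, ?)",
        (*KEY, event_ids),
    )


def emulated_upsert(conn: sqlite3.Connection, event_ids: str) -> None:
    """UPDATE first, then INSERT only if the UPDATE modified 0 rows."""
    if try_update(conn, event_ids) == 0:
        insert(conn, event_ids)


def count_rows(conn: sqlite3.Connection) -> int:
    (count,) = conn.execute("SELECT COUNT(*) FROM receipts_graph").fetchone()
    return count


def sequential_upserts() -> int:
    """Two upserts, one after the other: the second one updates in place."""
    conn = make_table()
    emulated_upsert(conn, '["$a"]')
    emulated_upsert(conn, '["$b"]')
    return count_rows(conn)


def racing_upserts() -> int:
    """Two upserts interleaved so both UPDATEs run before either INSERT."""
    conn = make_table()
    a_updated = try_update(conn, '["$a"]')
    b_updated = try_update(conn, '["$b"]')
    # Both writers saw 0 updated rows, so both go on to insert.
    if a_updated == 0:
        insert(conn, '["$a"]')
    if b_updated == 0:
        insert(conn, '["$b"]')
    return count_rows(conn)
```

Run one after the other, the two upserts leave a single row. Interleaved, they leave two rows for the same `(room_id, receipt_type, user_id)`, which is the kind of duplicate that a later unique index cannot be built over.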
But we've already thought of this and lock the entire table when doing the emulated upsert: see synapse/synapse/storage/database.py, lines 1301 to 1302 in c3a4780.

Except the locking is controlled by a parameter... and we've left it as `False`: see synapse/synapse/storage/databases/main/receipts.py, lines 856 to 868 in 7d59a51.
In summary, there's a window where there is no non-thread unique constraint on `receipts_graph` and a race where we try to insert two new rows at the same time for the same `(room_id, receipt_type, user_id)`.

The same probably applies to `receipts_linearized`.
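To make the failure mode concrete, here is a rough sketch of why duplicates break the background update and why they have to be removed before the index can be created. It uses SQLite through Python's `sqlite3` module instead of PostgreSQL, a simplified table, and an arbitrary rule for which duplicate to keep (the row with the highest `rowid`); these are assumptions for illustration, not the cleanup Synapse performs.

```python
import sqlite3
from typing import Tuple

# Partial unique index mirroring the shape of receipts_graph_unique_index.
CREATE_INDEX = (
    "CREATE UNIQUE INDEX receipts_graph_unique_index "
    "ON receipts_graph (room_id, receipt_type, user_id) "
    "WHERE thread_id IS NULL"
)


def create_index_after_cleanup() -> Tuple[bool, int]:
    """Return (whether the first index attempt failed, rows left after cleanup)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE receipts_graph ("
        "room_id TEXT, receipt_type TEXT, user_id TEXT, "
        "thread_id TEXT, event_ids TEXT)"
    )
    # Two hypothetical non-thread receipts for the same room, type and user,
    # as left behind by a racing emulated upsert.
    conn.executemany(
        "INSERT INTO receipts_graph VALUES (?, ?, ?, NULL, ?)",
        [
            ("!room:example.org", "m.read", "@alice:example.org", '["$a"]'),
            ("!room:example.org", "m.read", "@alice:example.org", '["$b"]'),
        ],
    )
    conn.commit()

    try:
        conn.execute(CREATE_INDEX)
        first_attempt_failed = False
    except sqlite3.DatabaseError:
        # The database refuses to build a unique index over duplicate rows.
        first_attempt_failed = True

    # Illustrative cleanup: keep one row per key (highest rowid), drop the rest.
    conn.execute(
        "DELETE FROM receipts_graph "
        "WHERE thread_id IS NULL AND rowid NOT IN ("
        "  SELECT MAX(rowid) FROM receipts_graph "
        "  WHERE thread_id IS NULL "
        "  GROUP BY room_id, receipt_type, user_id"
        ")"
    )
    # With the duplicates gone, the same statement now succeeds.
    conn.execute(CREATE_INDEX)
    (remaining,) = conn.execute("SELECT COUNT(*) FROM receipts_graph").fetchone()
    conn.close()
    return first_attempt_failed, remaining
```

Which duplicate to keep is a real design decision for receipts; the `MAX(rowid)` rule here is only a placeholder to keep the sketch short.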