Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

[sqlite] can't launch after upgrade due to failing database migration #14333

Closed
fuji246 opened this issue Oct 31, 2022 · 9 comments
Closed

[sqlite] can't launch after upgrade due to failing database migration #14333

fuji246 opened this issue Oct 31, 2022 · 9 comments
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@fuji246
Copy link

fuji246 commented Oct 31, 2022

Description

can't launch after upgrade, looks like db migration issue: table receipts_linearized_new already exists. I've tried to remove that table but still having the same error.

Steps to reproduce

  1. pip install --upgrade matrix-synapse
  2. `synctl start --no-daemonize

Homeserver

matrix.lomorage.com

Synapse Version

$ curl http://localhost:8008/_synapse/admin/v1/server_version {"server_version":"1.37.1","python_version":"3.7.3"}

Installation Method

pip (from PyPI)

Platform

arm64

Relevant log output

Server log below:

2022-10-30 23:27:06,063 - root - 345 - WARNING - main - ***** STARTING SERVER *****
2022-10-30 23:27:06,063 - root - 349 - WARNING - main - Server /home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/app/homeserver.py version 1.70.1
2022-10-30 23:27:06,064 - root - 351 - INFO - main - Server hostname: cloud2.lomorage.com
2022-10-30 23:27:06,064 - root - 352 - INFO - main - Instance name: master
2022-10-30 23:27:06,064 - root - 353 - INFO - main - Twisted reactor: EPollReactor
2022-10-30 23:27:06,064 - synapse.app.homeserver - 372 - INFO - main - Setting up server
2022-10-30 23:27:06,065 - synapse.server - 307 - INFO - main - Setting up.
2022-10-30 23:27:06,066 - synapse.storage.databases - 66 - INFO - main - [database config 'master']: Checking database server
2022-10-30 23:27:06,066 - synapse.storage.databases - 72 - INFO - main - [database config 'master']: Preparing for databases ['main', 'state']
2022-10-30 23:27:06,067 - synapse.storage.prepare_database - 115 - INFO - main - ['main', 'state']: Checking existing schema version
2022-10-30 23:27:06,081 - synapse.storage.prepare_database - 123 - INFO - main - ['main', 'state']: Existing schema is 73 (+9 deltas)
2022-10-30 23:27:06,082 - synapse.storage.databases.main - 311 - INFO - main - Checking database for consistency with configuration...
2022-10-30 23:27:06,084 - synapse.storage.prepare_database - 411 - INFO - main - Applying schema deltas for v73
2022-10-30 23:27:06,090 - synapse.storage.prepare_database - 527 - INFO - main - Applying engine-specific schema 73/08thread_receipts_non_null.sql.sqlite
2022-10-30 23:27:07,473 - synapse.app._base - 205 - ERROR - main - Exception during startup
Traceback (most recent call last):
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/app/homeserver.py", line 375, in setup
    hs.setup()
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/server.py", line 309, in setup
    self.datastores = Databases(self.DATASTORE_CLASS, self)
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/databases/__init__.py", line 78, in __init__
    databases=database_config.databases,
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/prepare_database.py", line 141, in prepare_database
    databases=databases,
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/prepare_database.py", line 528, in _upgrade_existing_database
    database_engine.execute_script_file(cur, absolute_path)
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/engines/_base.py", line 145, in execute_script_file
    cls.executescript(cursor, f.read())
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/engines/sqlite.py", line 140, in executescript
    cursor.executescript(script)
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py", line 395, in executescript
    self._do_execute(self.txn.executescript, sql)  # type: ignore[attr-defined]
  File "/home/lomoware/synapse/env/lib/python3.7/site-packages/synapse/storage/database.py", line 436, in _do_execute
    return func(sql, *args, **kwargs)
sqlite3.OperationalError: table receipts_linearized_new already exists


### Anything else that would be useful to know?

_No response_
@DMRobertson DMRobertson added S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Oct 31, 2022
@reivilibre
Copy link
Contributor

How did you try to remove the table?

Please can you give the output of

SELECT * FROM applied_schema_deltas;

?

@fuji246
Copy link
Author

fuji246 commented Oct 31, 2022

I just deleted the table in "homeserver.db".

Here is output:

55|55/access_token_expiry.sql
55|55/track_threepid_validations.sql
55|55/users_alter_deactivated.sql
56|56/add_spans_to_device_lists.sql
56|56/current_state_events_membership.sql
56|56/current_state_events_membership_mk2.sql
56|56/delete_keys_from_deleted_backups.sql
56|56/destinations_failure_ts.sql
56|56/device_stream_id_insert.sql
56|56/devices_last_seen.sql
56|56/drop_unused_event_tables.sql
56|56/event_expiry.sql
56|56/event_labels.sql
56|56/event_labels_background_update.sql
56|56/fix_room_keys_index.sql
56|56/hidden_devices.sql
56|56/hidden_devices_fix.sql.sqlite
56|56/nuke_empty_communities_from_db.sql
56|56/public_room_list_idx.sql
56|56/redaction_censor.sql
56|56/redaction_censor2.sql
56|56/redaction_censor4.sql
56|56/remove_tombstoned_rooms_from_directory.sql
56|56/room_key_etag.sql
56|56/room_membership_idx.sql
56|56/room_retention.sql
56|56/signing_keys.sql
56|56/signing_keys_nonunique_signatures.sql
56|56/state_group_room_idx.sql
56|56/stats_separated.sql
56|56/unique_user_filter_index.py
56|56/user_external_ids.sql
56|56/users_in_public_rooms_idx.sql
57|57/delete_old_current_state_events.sql
57|57/device_list_remote_cache_stale.sql
57|57/local_current_membership.py
57|57/remove_sent_outbound_pokes.sql
57|57/rooms_version_column.sql
57|57/rooms_version_column_2.sql.sqlite
57|57/rooms_version_column_3.sql.sqlite
58|58/00background_update_ordering.sql
58|58/02remove_dup_outbound_pokes.sql
58|58/03persist_ui_auth.sql
58|58/06dlols_unique_idx.py
58|58/07add_method_to_thumbnail_constraint.sql.sqlite
58|58/07persist_ui_auth_ips.sql
58|58/08_media_safe_from_quarantine.sql.sqlite
58|58/09shadow_ban.sql
58|58/10_pushrules_enabled_delete_obsolete.sql
58|58/10drop_local_rejections_stream.sql
58|58/10federation_pos_instance_name.sql
58|58/11dehydration.sql
58|58/11fallback.sql
58|58/11user_id_seq.py
58|58/12room_stats.sql
58|58/13remove_presence_allow_inbound.sql
58|58/14events_instance_name.sql
58|58/15_catchup_destination_rooms.sql
58|58/15unread_count.sql
58|58/16populate_stats_process_rooms_fix.sql
58|58/17_catchup_last_successful.sql
58|58/18stream_positions.sql
58|58/19txn_id.sql
58|58/20instance_name_event_tables.sql
58|58/20user_daily_visits.sql
58|58/21as_device_stream.sql
58|58/21drop_device_max_stream_id.sql
58|58/22puppet_token.sql
58|58/22users_have_local_media.sql
58|58/23e2e_cross_signing_keys_idx.sql
58|58/24drop_event_json_index.sql
58|58/25user_external_ids_user_id_idx.sql
58|58/26access_token_last_validated.sql
58|58/27local_invites.sql
58|58/28drop_last_used_column.sql.sqlite
59|59/01ignored_user.py
59|59/02shard_send_to_device.sql
59|59/04_event_auth_chains.sql
59|59/04drop_account_data.sql
59|59/05cache_invalidation.sql
59|59/06chain_cover_index.sql
59|59/06shard_account_data.sql
59|59/07shard_account_data_fix.sql
59|59/08delete_pushers_for_deactivated_accounts.sql
59|59/08delete_stale_pushers.sql
59|59/09rejected_events_metadata.sql
59|59/10delete_purged_chain_cover.sql
59|59/11add_knock_members_to_stats.sql
59|59/12account_validity_token_used_ts_ms.sql
59|59/12presence_stream_instance.sql
59|59/13users_to_send_full_presence_to.sql
59|59/15locks.sql
59|59/16federation_inbound_staging.sql
59|59/14refresh_tokens.sql
61|61/01insertion_event_lookups.sql
61|61/02drop_redundant_room_depth_index.sql
61|61/03recreate_min_depth.py
62|62/01insertion_event_extremities.sql
63|63/01create_registration_tokens.sql
63|63/02delete_unlinked_email_pushers.sql
63|63/02populate-rooms-creator.sql
63|63/03session_store.sql
63|63/04add_presence_stream_not_offline_index.sql
64|64/01msc2716_chunk_to_batch_rename.sql.sqlite
65|65/01msc2716_insertion_event_edges.sql
65|65/03remove_hidden_devices_from_device_inbox.sql
65|65/04_local_group_updates.sql
65|65/05_remove_room_stats_historical_and_user_stats_historical.sql
65|65/06remove_deleted_devices_from_device_inbox.sql
65|65/07_arbitrary_relations.sql
65|65/08_device_inbox_background_updates.sql
65|65/10_expirable_refresh_tokens.sql
65|65/11_devices_auth_provider_session.sql
67|67/01drop_public_room_list_stream.sql
68|68/01event_columns.sql
68|68/02_msc2409_add_device_id_appservice_stream_type.sql
68|68/03_delete_account_data_for_deactivated_accounts.sql
68|68/04_refresh_tokens_index_next_token_id.sql
68|68/04partial_state_rooms.sql
68|68/05_delete_non_strings_from_event_search.sql.sqlite
68|68/05partial_state_rooms_triggers.py
68|68/06_msc3202_add_device_list_appservice_stream_type.sql
69|69/01as_txn_seq.py
69|69/01device_list_oubound_by_room.sql
69|69/02cache_invalidation_index.sql
70|70/01clean_table_purged_rooms.sql
70|70/08_state_group_edges_unique.sql
71|71/01rebuild_event_edges.sql.sqlite
71|71/01remove_noop_background_updates.sql
71|71/02event_push_summary_unique.sql
72|72/01add_room_type_to_state_stats.sql
72|72/01event_push_summary_receipt.sql
72|72/02event_push_actions_index.sql
72|72/03bg_populate_events_columns.py
72|72/03drop_event_reference_hashes.sql
72|72/03remove_groups.sql
72|72/04drop_column_application_services_state_last_txn.sql.sqlite
72|72/05receipts_event_stream_ordering.sql
72|72/05remove_unstable_private_read_receipts.sql
72|72/06add_consent_ts_to_users.sql
72|72/06thread_notifications.sql
72|72/07force_update_current_state_events_membership.py
72|72/07thread_receipts.sql.sqlite
72|72/08thread_receipts.sql
72|72/09partial_indices.sql.sqlite
73|73/01event_failed_pull_attempts.sql
73|73/02add_pusher_enabled.sql
73|73/02room_id_indexes_for_purging.sql
73|73/03pusher_device_id.sql
73|73/03users_approved_column.sql
73|73/04partial_join_details.sql
73|73/04pending_device_list_updates.sql
73|73/05old_push_actions.sql.sqlite
73|73/06thread_notifications_thread_id_idx.sql

@fuji246
Copy link
Author

fuji246 commented Nov 3, 2022

Is this migration issue? Anything I can do to get out of this state, like manually create/remove tables.

@DMRobertson
Copy link
Contributor

I just deleted the table in "homeserver.db".

You should NOT manually make edits to the Synapse database at all. Doing so without knowing what you're doing is highly unsafe and we cannot support anyone who has done so without our guidance.

  • Which version did you upgrade from?
  • Which version did you upgrade to?

@giorgiocanale
Copy link

Same problem here, while upgrading.
I've done some check, looking also at the upgrade script (in my case the 73/80 one).

It seems as if the upgrade script assume that in tha DB there are 2 tables
receipts_linearized and receipts_graph, than it shoud create 2 new table with _new after the name, with new constraints, copy data from the original ones, drop those and rename new tables to remove the _new suffix.

In my case I had in my db both tables with _new name, with data, and did not have the tables without the _new prefix.

I did the following:

  • backup the dp
  • renamed both table with
    ```
    ALTER TABLE receipts_linearized_new RENAME TO receipts_linearized;
    ALTER TABLE receipts_graph_new RENAME TO receipts_graph;
- start the service, that was able to execute the upgrade script

@squahtx
Copy link
Contributor

squahtx commented Nov 7, 2022

Is this migration issue? Anything I can do to get out of this state, like manually create/remove tables.

What might have happened is that Synapse got killed or timed out while running 73/08thread_receipts_non_null.sql.sqlite for the first time. Since sqlite migrations aren't run in a transaction, the next time 73/08thread_receipts_non_null.sql.sqlite ran, it complained that receipts_linearized_new already exists.

@fuji246 To get the database back into a good state, we need to first figure out what its current state is. Can you post the output of the following commands?

sqlite> .tables receipts_%
sqlite> SELECT COUNT(*) FROM receipts_graph;
sqlite> SELECT COUNT(*) FROM receipts_graph_new;
sqlite> SELECT COUNT(*) FROM receipts_linearized;
sqlite> SELECT COUNT(*) FROM receipts_linearized_new;

@squahtx squahtx added the X-Needs-Info This issue is blocked awaiting information from the reporter label Nov 7, 2022
@squahtx
Copy link
Contributor

squahtx commented Nov 7, 2022

#14100 describes a related problem with long running migrations.

@fuji246
Copy link
Author

fuji246 commented Nov 9, 2022

oops, I already deleted the database and reset everything.

@squahtx squahtx changed the title can't launch after upgrade [sqlite] can't launch after upgrade Nov 9, 2022
@squahtx squahtx changed the title [sqlite] can't launch after upgrade [sqlite] can't launch after upgrade due to failing database migration Nov 9, 2022
@squahtx
Copy link
Contributor

squahtx commented Nov 9, 2022

oops, I already deleted the database and reset everything.

In that case I'll close the issue.

I've filed #14401 to track the general case of schema deltas timing out.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter
Projects
None yet
Development

No branches or pull requests

5 participants