Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Delete unreferenced state groups during purge #4006

Merged
merged 12 commits into from
Oct 30, 2018

Conversation

erikjohnston
Copy link
Member

No description provided.

@erikjohnston erikjohnston force-pushed the erikj/purge_state_groups branch 3 times, most recently from 56a33b0 to b07c0f6 Compare October 4, 2018 15:02
@erikjohnston erikjohnston requested a review from a team October 4, 2018 15:11
richvdh
richvdh previously requested changes Oct 9, 2018
def _is_state_group_referenced(self, txn, state_group):
"""Checks if a given state group is referenced, or is safe to delete.

A state groups is referenced if it or any of its descendants are
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a groups


A state groups is referenced if it or any of its descendants are
pointed at by an event. (A descendant is a group which has the given
state_group as a prev group)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

each state group only has one prev group, so this doesn't really make sense. "A descendant is a state_group whose chain of prev_groups includes the given state_group." ?

while state_groups_to_search:
state_group = state_groups_to_search.pop() # Next state group to check

is_referenced = self._simple_select_one_onecol_txn(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the problem here is that we might be about to delete a bunch of the events :/. I think you need a WHERE event_id NOT IN events_to_delete.

state_groups_searched = set()

while state_groups_to_search:
state_group = state_groups_to_search.pop() # Next state group to check
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we not do this for more than one state group at a time?

@@ -1082,6 +1133,12 @@ def __init__(self, db_conn, hs):
columns=["state_key"],
where_clause="type='m.room.member'",
)
self.register_background_index_update(
self.EVENT_STATE_GROUP_INDEX_UPDATE_NAME,
index_name="event_to_state_groups_sg_index",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

worth remembering that we already have this on matrix.org, so would ideally not spend ages building another copy of the index.

# updating them
continue

if not self._is_state_group_referenced(txn, sg):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you're going to reference methods in StateGroupWorkerStore, can you make EventsStore inherit from it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, yes. I just assumed we already were (since we use functions from there elsewhere I think).

@erikjohnston erikjohnston requested a review from a team October 19, 2018 15:21
@erikjohnston
Copy link
Member Author

I've shuffled things about a bit to checking of unreferenced state groups in batches, rather than one at a time, as well as simplifying the initial query.

@erikjohnston erikjohnston removed their assignment Oct 22, 2018
Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks generally sane to me. a few comments/questions

@@ -204,7 +205,8 @@ def f(self, *args, **kwargs):

# inherits from EventFederationStore so that we can call _update_backward_extremities
# and _handle_mult_prev_events (though arguably those could both be moved in here)
class EventsStore(EventFederationStore, EventsWorkerStore, BackgroundUpdateStore):
class EventsStore(StateGroupWorkerStore, EventFederationStore, EventsWorkerStore,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment on why we are inheriting from StateGroupWorkerStore mightn't hurt; otoh perhaps it's obvious enough.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its obvious enough, especially since we use it in a fair few places and not just in purge history

@@ -1041,6 +1041,86 @@ def _count_state_group_hops_txn(self, txn, state_group):

return count

def _find_unreferenced_groups(self, txn, state_groups):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

given this is now inherently tied to purging via the join on events_to_purge, I think it might make more sense to leave it alongside the other purge code.

next_to_search = set()
else:
lst = list(next_to_search)
current_search = set(lst[:100])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to do:

current_search = set(islice(next_to_search, 100))
next_to_search -= current_search

""" % (",".join("?" for _ in current_search),)
txn.execute(sql, list(current_search))

referenced = set(sg for sg, cnt in txn if cnt > 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

surely cnt will never be zero?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(so why not do SELECT DISTINCT state_group rather than the GROUP BY)


next_to_search.update(row["state_group"] for row in rows)
# We don't bother re-handling groups we've already seen
next_to_search -= state_groups_seen
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we not do this before adding to next_to_search?

prevs = set(row["state_group"] for row in rows)
prevs -= state_groups_seen
next_to_search |= prevs
state_groups_seen |= prevs

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@erikjohnston erikjohnston merged commit 0794504 into develop Oct 30, 2018
@rubo77
Copy link
Contributor

rubo77 commented Oct 30, 2018

Is this also deleting leftovers of the old Nuke Script?

hawkowl added a commit that referenced this pull request Nov 19, 2018
Features
--------

- Include flags to optionally add `m.login.terms` to the registration flow when consent tracking is enabled.
([\#4004](#4004), [\#4133](#4133),
[\#4142](#4142), [\#4184](#4184))
- Support for replacing rooms with new ones ([\#4091](#4091), [\#4099](#4099),
[\#4100](#4100), [\#4101](#4101))

Bugfixes
--------

- Fix exceptions when using the email mailer on Python 3. ([\#4095](#4095))
- Fix e2e key backup with more than 9 backup versions ([\#4113](#4113))
- Searches that request profile info now no longer fail with a 500. ([\#4122](#4122))
- fix return code of empty key backups ([\#4123](#4123))
- If the typing stream ID goes backwards (as on a worker when the master restarts), the worker's typing handler will no longer erroneously report rooms containing new
typing events. ([\#4127](#4127))
- Fix table lock of device_lists_remote_cache which could freeze the application ([\#4132](#4132))
- Fix exception when using state res v2 algorithm ([\#4135](#4135))
- Generating the user consent URI no longer fails on Python 3. ([\#4140](#4140),
[\#4163](#4163))
- Loading URL previews from the DB cache on Postgres will no longer cause Unicode type errors when responding to the request, and URL previews will no longer fail if
the remote server returns a Content-Type header with the chartype in quotes. ([\#4157](#4157))
- The hash_password script now works on Python 3. ([\#4161](#4161))
- Fix noop checks when updating device keys, reducing spurious device list update notifications. ([\#4164](#4164))

Deprecations and Removals
-------------------------

- The disused and un-specced identicon generator has been removed. ([\#4106](#4106))
- The obsolete and non-functional /pull federation endpoint has been removed. ([\#4118](#4118))
- The deprecated v1 key exchange endpoints have been removed. ([\#4119](#4119))
- Synapse will no longer fetch keys using the fallback deprecated v1 key exchange method and will now always use v2.
([\#4120](#4120))

Internal Changes
----------------

- Fix build of Docker image with docker-compose ([\#3778](#3778))
- Delete unreferenced state groups during history purge ([\#4006](#4006))
- The "Received rdata" log messages on workers is now logged at DEBUG, not INFO. ([\#4108](#4108))
- Reduce replication traffic for device lists ([\#4109](#4109))
- Fix `synapse_replication_tcp_protocol_*_commands` metric label to be full command name, rather than just the first character
([\#4110](#4110))
- Log some bits about room creation ([\#4121](#4121))
- Fix `tox` failure on old systems ([\#4124](#4124))
- Add STATE_V2_TEST room version ([\#4128](#4128))
- Clean up event accesses and tests ([\#4137](#4137))
- The default logging config will now set an explicit log file encoding of UTF-8. ([\#4138](#4138))
- Add helpers functions for getting prev and auth events of an event ([\#4139](#4139))
- Add some tests for the HTTP pusher. ([\#4149](#4149))
- add purge_history.sh and purge_remote_media.sh scripts to contrib/ ([\#4155](#4155))
- HTTP tests have been refactored to contain less boilerplate. ([\#4156](#4156))
- Drop incoming events from federation for unknown rooms ([\#4165](#4165))
@erikjohnston erikjohnston deleted the erikj/purge_state_groups branch December 12, 2018 15:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants