Separated Statistics [2/7ish] #5889

reivilibre · 2019-08-20T13:10:23Z

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

This PR is the second in a series of PRs replacing #5847. It does the following:

- Adds the schema for stats
- Adds the function for updating stats
  - (But does not include the function for performing old collection yet)

These PRs will be merged into an intermediate branch (#5879) as some features may be broken if not all the PRs are applied at once.


erikjohnston

I think mainly I'm a bit confused about the expected life cycle here? It feels like we're inserting dirty rows into the table and then updating them when we encounter them? Can you outline how this is expected to work please :)

synapse/storage/schema/delta/56/stats_separated1.sql

erikjohnston · 2019-08-20T13:28:41Z

synapse/storage/schema/delta/56/stats_separated2.py

+    room & user statistics.
+    """
+    _run_create_generic("room", cursor, database_engine)
+    _run_create_generic("user", cursor, database_engine)


I think it'd be a lot clearer to just have two schema files, *.sql.postgres and *.sql.sqlite, and list the create index clauses. This code generation is a lot longer than the eight lines of SQL it generates :)

It would be – but afaict there isn't a mechanism to do so.

Oh booo, it's only implemented for full schemas and not deltas.

@erikjohnston With #5911, this should be resolved, I hope?
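For readers following this thread, the idea of engine-specific delta files can be sketched roughly as below. This is only an illustration, assuming a loader that picks files by suffix; `select_delta_files` and the file names are hypothetical and are not taken from #5911 or from Synapse's actual schema-preparation code.

```python
from typing import List


def select_delta_files(filenames: List[str], engine: str) -> List[str]:
    """Return the delta files that apply to `engine`, in sorted order.

    Hypothetical helper: generic `.sql` and `.py` deltas always apply,
    while `.sql.<engine>` deltas apply only to the matching engine.
    """
    selected = []
    for name in sorted(filenames):
        if name.endswith(".sql") or name.endswith(".py"):
            # Engine-agnostic delta: always applied.
            selected.append(name)
        elif name.endswith(".sql." + engine):
            # Engine-specific delta for the engine in use.
            selected.append(name)
        # Anything else (e.g. a delta for another engine) is skipped.
    return selected


if __name__ == "__main__":
    files = [
        "stats_separated.sql.postgres",
        "stats_separated.sql.sqlite",
        "other_change.sql",
    ]
    print(select_delta_files(files, "sqlite"))
```

With this kind of selection, each engine only ever sees its own variant of a delta, so the create index clauses can be written out directly in plain SQL.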


erikjohnston

As per IRL let's change the tables slightly so we can just do upserts to simplify the logic.
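As a rough illustration of the upsert approach mentioned above, the sketch below adds deltas to a stats row and creates the row if it does not exist yet. It is not the PR's code: the table `room_stats_demo`, its columns, and the `ALLOWED_FIELDS` whitelist are made up for the example, and the upsert is emulated with an UPDATE followed by an INSERT so that it runs on any SQLite version.

```python
import sqlite3

# Hypothetical whitelist of additive columns; not the PR's real field list.
ALLOWED_FIELDS = {"joined_members", "total_events"}


def create_demo_table(conn):
    """Create the illustrative stats table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS room_stats_demo ("
        " room_id TEXT PRIMARY KEY,"
        " joined_members INTEGER NOT NULL,"
        " total_events INTEGER NOT NULL)"
    )


def upsert_stats_delta(conn, room_id, deltas):
    """Add `deltas` to the row for `room_id`, inserting the row if absent."""
    unknown = set(deltas) - ALLOWED_FIELDS
    if unknown:
        # Column names are interpolated into SQL below, so reject anything
        # that is not explicitly whitelisted.
        raise ValueError("unknown stats fields: %s" % sorted(unknown))

    # Fields without a delta are left unchanged (delta of 0).
    values = {field: deltas.get(field, 0) for field in sorted(ALLOWED_FIELDS)}
    cols = list(values)
    params = [values[c] for c in cols]

    with conn:  # run both statements in a single transaction
        set_clause = ", ".join("%s = %s + ?" % (c, c) for c in cols)
        cur = conn.execute(
            "UPDATE room_stats_demo SET %s WHERE room_id = ?" % set_clause,
            params + [room_id],
        )
        if cur.rowcount == 0:
            # No existing row: the deltas become the starting values.
            conn.execute(
                "INSERT INTO room_stats_demo (room_id, %s) VALUES (?, %s)"
                % (", ".join(cols), ", ".join("?" for _ in cols)),
                [room_id] + params,
            )
```

On an engine with native upsert support, the UPDATE/INSERT pair would typically collapse into a single statement; the two-step form here is only meant to show the logic.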


erikjohnston

Looks good. Mostly some code cleanup, but a few niggles:

- Can we use timestamps in milliseconds please, which is what the rest of the codebase uses (yes, it's space-inefficient for this, but consistency ftw).
- Can we keep the documentation until the last PR please, in case it changes and we forget about it.
- We'll need to figure out the background update TODO before this lands on develop, but that can wait until a future PR.
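For reference, the millisecond convention requested in the first point can be sketched as follows. The helper names `now_ms` and `sec_to_ms` are hypothetical and are not functions from the codebase.

```python
import time


def now_ms() -> int:
    """Current wall-clock time as integer milliseconds since the Unix epoch."""
    return int(time.time() * 1000)


def sec_to_ms(ts_sec: float) -> int:
    """Convert a timestamp in seconds to integer milliseconds."""
    return int(ts_sec * 1000)


if __name__ == "__main__":
    print(now_ms())
    print(sec_to_ms(1500))
```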

synapse/storage/stats.py

erikjohnston · 2019-08-27T09:22:06Z

synapse/storage/stats.py

+
+        additive_relatives = {
+            key: fields.get(key, 0)
+            for key in abs_field_names


Why are we looking at ABSOLUTE_STATS_FIELDS to figure out which ones are additive?

Added a comment which should hopefully clarify this – let me know if not.


reivilibre added 2 commits August 20, 2019 13:45

Add schema for Separated Statistics

d7675e7

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Add storage function for storing stats deltas

80a1c6e

Old collection is not included in this commit.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre requested a review from a team August 20, 2019 13:10

Ack, isort!

1819563

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

erikjohnston suggested changes Aug 20, 2019


reivilibre and others added 7 commits August 20, 2019 15:02

Update synapse/storage/stats.py

b5573c0

Co-Authored-By: Erik Johnston <erik@matrix.org>

Update synapse/storage/stats.py

4a97eef

Co-Authored-By: Erik Johnston <erik@matrix.org>

Add room and user statistics documentation.

6a19f7e

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Sanitise accepted fields in _update_stats_delta_txn

981c6cf

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Clarify _update_stats_delta_txn

977310e

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Unify name of 'stats regenerator' in schema comments.

eafa8d3

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Remove needless defaults.

18a4c03

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre requested a review from erikjohnston August 20, 2019 15:13

erikjohnston suggested changes Aug 22, 2019


reivilibre added 3 commits August 22, 2019 15:40

Simplify table structure

7b657f1

This obviates the need for old collection, but comes at the minor cost of not being able to track historical stats or per-slice fields until after the statistics regenerator is finished.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Fix up SQL schema delta

e8fc180

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Fix up historical stats support.

79252d1

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre requested a review from erikjohnston August 27, 2019 07:04

reivilibre self-assigned this Aug 27, 2019

reivilibre added 2 commits August 27, 2019 08:52

Allow schema deltas to be engine-specific

c3d2bf2

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Use engine-specific delta SQL files rather than delta written in Python.

1ecd1a6

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre force-pushed the rei/rss_inc2 branch from 1ce7b71 to 1ecd1a6 Compare August 27, 2019 08:50

erikjohnston suggested changes Aug 27, 2019


Merge branch 'rei/rss_target' into rei/rss_inc2

5043ef8

reivilibre force-pushed the rei/rss_inc2 branch from c44a8cf to 5043ef8 Compare August 27, 2019 12:18

reivilibre and others added 5 commits August 27, 2019 13:26

Apply suggestions from code review

4b7bf2e

Co-Authored-By: Erik Johnston <erik@matrix.org>

Clarify _update_stats_delta_txn by adding code comments and kwargs.

81c5289

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Apply minor suggestions from review

544ba2c

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Lock tables in upsert fall-backs.

a6c1020

Should not be too much of a performance concern as this code won't be hit on Postgres, which large deployments should be using.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Code formatting (Black)

736ac58

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre added 2 commits August 27, 2019 13:50

Switch to milliseconds in room/user stats for consistency.

09cbc3a

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Don't include the room & user stats docs in this PR.

c775f31

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre mentioned this pull request Aug 27, 2019

Separated Statistics [4/7ish] #5891

Closed

reivilibre added 4 commits August 27, 2019 14:19

Remove obsolete OldCollectionRequired as old collection is obsolete.

491eaf0

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Rename room_state table to room_stats_state

11c4e50

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Update _purge_room_txn to take account of separated stats tables

62b1250

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

Fix logic error.

324f21b

`absolute_fields` being None shouldn't preclude completion of a current stats row.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre requested a review from erikjohnston August 27, 2019 14:02

Clean up code with improved naming and hoist around functions.

1af7866

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

erikjohnston approved these changes Aug 27, 2019


reivilibre and others added 2 commits August 28, 2019 09:01

Update synapse/storage/stats.py

b9f1adc

Co-Authored-By: Erik Johnston <erik@matrix.org>

Code formatting (Black)

a344ad3

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>

reivilibre merged commit cc66cf1 into rei/rss_target Aug 28, 2019
