Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Separated Statistics [2/7ish] #5889

Merged
merged 30 commits into from
Aug 28, 2019
Merged

Separated Statistics [2/7ish] #5889

merged 30 commits into from
Aug 28, 2019

Conversation

reivilibre
Copy link
Contributor

Signed-off-by: Olivier Wilkinson (reivilibre) olivier@librepush.net

This PR is the second in a series of PRs replacing #5847, which does the following:

  • Adds the schema for stats
  • Adds the function for updating stats
    • (But does not include the function for performing old collection yet)

These PRs will be merged into an intermediate branch (#5879) as some features may be broken if not all the PRs are applied at once.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Old collection is not included in this commit

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
@reivilibre reivilibre requested a review from a team August 20, 2019 13:10
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think mainly I'm a bit confused about the expected life cycle here? It feels like we're inserting dirty rows into the table and then updating them when we encounter them? Can you outline how this is expected to work please :)

synapse/storage/schema/delta/56/stats_separated1.sql Outdated Show resolved Hide resolved
synapse/storage/schema/delta/56/stats_separated1.sql Outdated Show resolved Hide resolved
synapse/storage/schema/delta/56/stats_separated1.sql Outdated Show resolved Hide resolved
room & user statistics.
"""
_run_create_generic("room", cursor, database_engine)
_run_create_generic("user", cursor, database_engine)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be a lot clearer to just have two schema files *.sql.postges and *.sql.sqlite and list the create index clauses. This code generation is a lot longer than the eight lines of sql it generates :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be – but afaict there isn't a mechanism to do so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh booo, its only implemented for full schemas and not deltas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@erikjohnston With #5911, this should be resolved, I hope?

synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved
reivilibre and others added 7 commits August 20, 2019 15:02
Co-Authored-By: Erik Johnston <erik@matrix.org>
Co-Authored-By: Erik Johnston <erik@matrix.org>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per IRL let's change the tables slightly so we can just do upserts to simplify the logic.

This obviates the need for old collection, but comes at the minor cost
of not being able to track historical stats or per-slice fields until
after the statistics regenerator is finished.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
@reivilibre reivilibre self-assigned this Aug 27, 2019
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Copy link
Member

@erikjohnston erikjohnston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Mostly some code cleanup, but a few niggles:

  1. Can we use timestamps in milliseconds please, which is what the rest of the code base uses (yes, its space inefficient for this but consistency ftw).
  2. Can we keep the documentation until the last PR please, in case it changes and we forget about it.
  3. We'll need to figure out the background update TODO before this lands on develop, but that can wait until future PR

synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved
synapse/storage/stats.py Outdated Show resolved Hide resolved

additive_relatives = {
key: fields.get(key, 0)
for key in abs_field_names
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we looking at absolute ABSOLUTE_STATS_FIELDS to figure out which ones are additive?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a comment which should hopefully clarify this – let me know if not.

synapse/storage/stats.py Outdated Show resolved Hide resolved
reivilibre and others added 5 commits August 27, 2019 13:26
Co-Authored-By: Erik Johnston <erik@matrix.org>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Should not be too much of a performance concern as this code won't be
hit on Postgres, which large deployments should be using.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
`absolute_fields` being None shouldn't preclude completion of a current
stats row.

Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
synapse/storage/stats.py Outdated Show resolved Hide resolved
reivilibre and others added 2 commits August 28, 2019 09:01
Co-Authored-By: Erik Johnston <erik@matrix.org>
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
@reivilibre reivilibre merged commit cc66cf1 into rei/rss_target Aug 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants