Cassandra performance improvements #94
… cache a floor for the sequence number to reduce lookups. ref: #94
The pull request is in #95. Tested against a non-incremental repair with 1467 segments, it reduced (before activation of the repair) the number of CQL requests by over 8k.
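For illustration, a minimal sketch of the floor-caching idea (not the actual #95 patch): remember the last value seen per id_type locally, so the common path is a single conditional UPDATE rather than a SELECT followed by the UPDATE. It assumes the 3.x DataStax Java driver and the repair_id table quoted further down in the issue, with id stored as a bigint.

```java
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not the code in #95: cache a floor for each id_type so that
// most calls skip the initial SELECT against repair_id.
public class CachedFloorIdSource {

  private final Session session;
  private final PreparedStatement selectId;
  private final PreparedStatement casId;
  private final ConcurrentMap<String, Long> floor = new ConcurrentHashMap<>();

  public CachedFloorIdSource(Session session) {
    this.session = session;
    this.selectId = session.prepare("SELECT id FROM repair_id WHERE id_type = ?");
    this.casId = session.prepare("UPDATE repair_id SET id = ? WHERE id_type = ? IF id = ?");
  }

  public long getNewId(String idType) {
    Long cached = floor.get(idType);
    long current = cached != null ? cached : fetchCurrent(idType);
    while (true) {
      ResultSet rs = session.execute(casId.bind(current + 1, idType, current));
      if (rs.wasApplied()) {
        floor.put(idType, current + 1);
        return current + 1;
      }
      // Lost the race: the LWT response carries the actual current value of id.
      current = rs.one().getLong("id");
      floor.put(idType, current);
    }
  }

  private long fetchCurrent(String idType) {
    Row row = session.execute(selectId.bind(idType)).one();
    return row == null ? 0L : row.getLong("id");
  }
}
```

The trade-off is an occasional extra CAS retry after a lost race, instead of a read before every increment.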
The pull request in #96 addresses these three possibilities.
The collapsed repair_run table looks interesting and we would have to test partition sizes in case of big clusters using 256 vnodes. Just like you did for the generation of ids, we could hold a local cache of repair segments, removing finished ones and keeping the rest. Before processing a segment we could query the database to check if it's up to date and move on accordingly. This would bring it down to a single query for a single partition most of the time, with an overhead only on init. How does that sound, @michaelsembwever?
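A rough sketch of that local segment cache idea (class and method names are placeholders, not Reaper's actual domain objects): unfinished segments stay in memory, finished ones are evicted, and a single-partition read confirms a segment is still pending right before it is processed.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a local repair-segment cache; Segment and SegmentStore
// stand in for Reaper's domain objects and the IStorage lookup.
public class SegmentCache {

  enum State { NOT_STARTED, RUNNING, DONE }

  static final class Segment {
    final UUID id;
    volatile State state;
    Segment(UUID id, State state) { this.id = id; this.state = state; }
  }

  interface SegmentStore {
    Segment readSegment(UUID segmentId); // a single-partition CQL read in practice
  }

  private final Map<UUID, Segment> cache = new ConcurrentHashMap<>();
  private final SegmentStore store;

  SegmentCache(SegmentStore store) { this.store = store; }

  void put(Segment segment) { cache.put(segment.id, segment); }

  /** Refreshes the cached segment from the store and returns it only if still pending. */
  Segment nextIfPending(UUID segmentId) {
    if (!cache.containsKey(segmentId)) {
      return null; // unknown or already evicted
    }
    Segment fresh = store.readSegment(segmentId); // confirm the database agrees
    if (fresh == null || fresh.state == State.DONE) {
      cache.remove(segmentId); // finished elsewhere: evict and skip
      return null;
    }
    cache.put(segmentId, fresh);
    return fresh;
  }
}
```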
Adding more "local caches" would be a band-aid to the design of the persistence layer which has too small a granularity. I'd rather see IStorage re-assessed. Using the constraint that one IStorage object works against an owned shard this could simplify a number of things:
This would lose some durability of the data if the reaper process died. But I think this is irrelevant: Reaper can either restore transient state, or, in situations like an un-activated repair run, it doesn't care about durability.
What's the largest number of segments we've seen?
This has been further addressed (and superseded) by #99. The sequence number was a PostgreSQL (SQL) optimisation that imposed itself upon the code's design. By replacing all IDs in the codebase with UUIDs, we …
This reduces those 14k CQL requests to an even lower number, as the …
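For reference, time-based (version 1) UUIDs can be generated locally with the driver's UUIDs utility, so no table round-trip is needed to obtain an id; a short sketch:

```java
import com.datastax.driver.core.utils.UUIDs;
import java.util.UUID;

// Time-based UUIDs are generated client-side and map to Cassandra's timeuuid type,
// which sorts clustering rows by the embedded timestamp.
public class TimeBasedIds {
  public static void main(String[] args) {
    UUID runId = UUIDs.timeBased();
    System.out.println(runId + " created at " + UUIDs.unixTimestamp(runId) + " ms since epoch");
  }
}
```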
Repair runs and their segments are one unit of work in concept, and the persistence layer should be designed accordingly. Previously they were separated because the concern of sequence generation for IDs was exposed in the code. This is now encapsulated within storage implementations. This work allows the CassandraStorage to implement segments as clustering keys within the repair_run table. ref: #94, #101
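As an illustration of that layout (not the schema actually shipped in #101; keyspace, table and column names are placeholders), segments become clustering rows inside the run's partition:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Illustration only: a run partition that carries its segments as clustering rows,
// so creating or reading a run's segments touches a single partition.
// Keyspace, table and column names are placeholders, not Reaper's real schema.
public class CollapsedRepairRunSketch {
  public static void main(String[] args) {
    try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("reaper_db")) {
      session.execute(
          "CREATE TABLE IF NOT EXISTS repair_run_sketch ("
          + " id timeuuid,"           // run id (time-based UUID)
          + " segment_id timeuuid,"   // one clustering row per segment
          + " segment_state int,"
          + " start_token varint,"
          + " end_token varint,"
          + " PRIMARY KEY (id, segment_id))");
    }
  }
}
```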
This has been addressed in #102.
@michaelsembwever, I've tested with the row cache activated on the repair_segment table and we mostly hit the cache only:
Nodetool info:
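For context on what activating the row cache involves (the exact table and settings used for the test above aren't shown here): the row cache is a per-table caching option, and row_cache_size_in_mb must also be non-zero in cassandra.yaml. A hedged example against the hot repair_id table, with a placeholder keyspace name:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

// Illustration only: enable row caching for a table. The keyspace name is a
// placeholder, and row_cache_size_in_mb must be set > 0 in cassandra.yaml
// before the cache is actually used.
public class EnableRowCache {
  public static void main(String[] args) {
    try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("reaper_db")) {
      session.execute(
          "ALTER TABLE repair_id"
          + " WITH caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'}");
    }
  }
}
```

nodetool info then reports the row cache entries, size and hit rate, which is presumably the output referenced above.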
Makes the schema changes in a separate migration step, so that data in the repair_unit and repair_schedule tables can be migrated over. Move the new schema migration to 003 as 002 already exists in master. Recover Cassandra migration 002. Fix incorrect type used to get incremental repair value during schema migration. ref: #99, #94, #99 (comment)
* Cassandra performance: Replace sequence ids with time-based UUIDs. ref: #99, #94
* Simplify the creation of repair runs and their segments. Repair runs and their segments are one unit of work in concept and the persistence layer should be designed accordingly. Previously they were separated because the concern of sequence generation for IDs was exposed in the code. This is now encapsulated within storage implementations. This work allows the CassandraStorage to implement segments as clustering keys within the repair_run table. ref: #94, #101
* SQUASH ME: Makes the schema changes in a separate migration step, so that data in the repair_unit and repair_schedule tables can be migrated over. ref: #99 (comment)
* Fix file names and 002 migration file
Makes the schema changes in a separate migration step, so that data in the repair_unit and repair_schedule tables can be migrated over. ref: #99, #94, #99 (comment)
* Cassandra performance: Replace sequence ids with time-based UUIDs. Makes the schema changes in a separate migration step, so that data in the repair_unit and repair_schedule tables can be migrated over. ref: #99, #94, #99 (comment)
* Simplify the creation of repair runs and their segments. Repair runs and their segments are one unit of work in concept and the persistence layer should be designed accordingly. Previously they were separated because the concern of sequence generation for IDs was exposed in the code. This is now encapsulated within storage implementations. This work allows the CassandraStorage to implement segments as clustering keys within the repair_run table. ref: #94, #101
* In CassandraStorage implement segments as clustering keys within the repair_run table. Change required in IStorage so as to identify a segment both by runId and segmentId. ref: #94, #102
* Fix number of parallel repair computation. Downgrade to Dropwizard 1.0.7 and Guava 19.0 to fix dependency issues. Make repair manager schedule cycle configurable (was 30s hardcoded). ref: #108
* In CassandraStorage replace the table scan on `repair_run` with an async break-down of per cluster run-throughs of known run IDs. ref: #105
Performance issues were fixed with release 0.6.0. Closing.
Relevant to #85
Observations:
- The rest endpoints cluster and /repair_run/state, causing scheduled 10s repeated selects on repair_run, repair_unit, and repair_run_by_cluster.
- The rest endpoint /repair_run/state repeats the following every 10 seconds…

During the creation of a repair run, non-incremental with 1467 segments, ~14k requests were logged.
A distribution of these requests is as follows:
What stands out above is … `UPDATE repair_id SET id=N WHERE id_type = 'repair_segment' IF id = N;` and `INSERT INTO repair_id (id_type, id) VALUES('repair_segment', N) IF NOT EXISTS;`. These appear to be identical statements, the latter as a LWT.
Neither of them is used as a prepared statement in CassandraStorage. (The 8.2k select requests against the same table are also not prepared statements.)

Otherwise the multiple inserts and reads on repair_segment could be collapsed by making repair_segment_id a clustering key within the repair_run table. Furthermore, a small row cache (since reads against the one repair_id are particularly hot) would have significant impact.

Possibilities:
- row caches on cluster and repair_run_by_cluster (kinda expecting page cache to be doing enough already…) and on repair_run
- prepared statements on repair_run (…actually on all the tables); see the sketch after this list
- a row cache on repair_id and reducing the lookups in CassandraStorage.getNewRepairId(..)
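As referenced above, a minimal sketch of preparing the hot repair_id statements with the 3.x Java driver (the keyspace name and the bigint id column are assumptions, and the counter row is assumed to already exist):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.PreparedStatement;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

// Sketch only, not the CassandraStorage code: prepare the statements once so each
// execution sends bound values instead of the full CQL string.
public class PreparedRepairIdQueries {
  public static void main(String[] args) {
    try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
         Session session = cluster.connect("reaper_db")) { // keyspace name is a placeholder
      PreparedStatement selectId =
          session.prepare("SELECT id FROM repair_id WHERE id_type = ?");
      PreparedStatement casId =
          session.prepare("UPDATE repair_id SET id = ? WHERE id_type = ? IF id = ?");

      // Assumes the 'repair_segment' counter row already exists.
      long current = session.execute(selectId.bind("repair_segment")).one().getLong("id");
      ResultSet rs = session.execute(casId.bind(current + 1, "repair_segment", current));
      System.out.println("CAS applied: " + rs.wasApplied());
    }
  }
}
```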
Maybe there's a possibility of collapsing queries by collapsing tables, by making repair_segment a clustering key within repair_run. In fact this might collapse three tables: repair_segment, repair_run, and repair_segment_by_run_id; together with the one table like: …

This could then leave the regular requests to …