In the CassandraStorage, remove repair run table scans #105

michaelsembwever
Member

In CassandraStorage, replace the table scan on repair_run with an async break-down of per-cluster run-throughs of known run IDs.
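
A minimal sketch of the idea, under stated assumptions: instead of scanning the whole repair_run table, look up the run IDs already known for each cluster and issue one asynchronous read per ID. The class `RepairRunLookupSketch`, its in-memory maps, and `fetchRunAsync` are hypothetical stand-ins invented for illustration. The diff under review issues the per-ID reads with `session.executeAsync(getRepairRunPrepStmt.bind(repairRunId))`; here that call is swapped for `CompletableFuture` so the example runs without a Cassandra cluster.

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Hypothetical in-memory stand-in; this is not the project's actual storage code.
class RepairRunLookupSketch {

  // Stand-in for an index table: cluster name -> known repair run ids.
  private final Map<String, List<UUID>> runIdsByCluster;
  // Stand-in for the repair_run table: run id -> run description.
  private final Map<UUID, String> repairRunsById;

  RepairRunLookupSketch(Map<String, List<UUID>> runIdsByCluster,
                        Map<UUID, String> repairRunsById) {
    this.runIdsByCluster = runIdsByCluster;
    this.repairRunsById = repairRunsById;
  }

  // Simulates an asynchronous read of a single run by its id.
  private CompletableFuture<String> fetchRunAsync(UUID runId) {
    return CompletableFuture.supplyAsync(() -> repairRunsById.get(runId));
  }

  // Issues one async lookup per known id, then gathers the results.
  List<String> getAllRepairRuns(Collection<String> clusterNames) {
    List<CompletableFuture<String>> pending = clusterNames.stream()
        // Grab all ids for the given cluster name
        .map(name -> runIdsByCluster.getOrDefault(name, List.of()))
        .flatMap(List::stream)
        // Start one asynchronous read per id
        .map(this::fetchRunAsync)
        // Collecting here starts every read before any result is awaited
        .collect(Collectors.toList());

    return pending.stream()
        .map(CompletableFuture::join)
        // Skip ids that no longer resolve to a run
        .filter(Objects::nonNull)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    UUID first = UUID.randomUUID();
    UUID second = UUID.randomUUID();
    RepairRunLookupSketch storage = new RepairRunLookupSketch(
        Map.of("cluster-a", List.of(first), "cluster-b", List.of(second)),
        Map.of(first, "run for cluster-a", second, "run for cluster-b"));
    System.out.println(storage.getAllRepairRuns(List.of("cluster-a", "cluster-b")));
  }
}
```

Collecting the futures into a list before joining them matters: Java streams are lazy, so joining inside the same pipeline would wait for each read to finish before the next one is issued.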

@michaelsembwever michaelsembwever force-pushed the mck/cassandra-remove-repair-run-table-scans branch from 64315d0 to 791c6a8 Compare May 26, 2017 04:58
michaelsembwever added a commit that referenced this pull request May 26, 2017
…nc break-down of per cluster run-throughs of known run IDs.

 ref: #105
@tlpsonarqube
Collaborator

SonarQube analysis reported 2 issues

  • MINOR 2 minor

Watch the comments in this conversation to review them.

return repairRuns;
return getClusters().stream()
// Grab all ids for the given cluster name
.map((cluster) -> getRepairRunIdsForCluster(cluster.getName()))

MINOR Remove the parentheses around the "cluster" parameter rule

// Grab repair runs asynchronously for all the ids returned by the index table
.flatMap((Collection<UUID> repairRunIds)
-> repairRunIds.stream()
.map((repairRunId) -> session.executeAsync(getRepairRunPrepStmt.bind(repairRunId))))

MINOR Remove the parentheses around the "repairRunId" parameter rule

@michaelsembwever michaelsembwever force-pushed the mck/cassandra-remove-batch-statements branch from b49248e to 9e26043 Compare May 28, 2017 05:25
@michaelsembwever michaelsembwever force-pushed the mck/cassandra-remove-repair-run-table-scans branch from 791c6a8 to 34a48f5 Compare May 28, 2017 05:26
michaelsembwever added a commit that referenced this pull request May 28, 2017
…nc break-down of per cluster run-throughs of known run IDs.

 ref: #105
@michaelsembwever michaelsembwever force-pushed the mck/cassandra-remove-batch-statements branch from 9e26043 to 4441ea7 Compare June 1, 2017 03:36
@michaelsembwever michaelsembwever changed the base branch from mck/cassandra-remove-batch-statements to mck/cassandra-improvements-94 June 1, 2017 06:58
@michaelsembwever michaelsembwever force-pushed the mck/cassandra-remove-repair-run-table-scans branch from 34a48f5 to 10f0e6b Compare June 1, 2017 07:52
michaelsembwever added a commit that referenced this pull request Jun 1, 2017
…nc break-down of per cluster run-throughs of known run IDs.

 ref: #105
@michaelsembwever michaelsembwever force-pushed the mck/cassandra-remove-repair-run-table-scans branch from 10f0e6b to 3cf5733 Compare June 1, 2017 07:55
@michaelsembwever michaelsembwever merged commit 3cf5733 into mck/cassandra-improvements-94 Jun 1, 2017
@michaelsembwever michaelsembwever force-pushed the mck/cassandra-improvements-94 branch from f2e57ce to 3cf5733 Compare June 1, 2017 09:22
@michaelsembwever michaelsembwever deleted the mck/cassandra-remove-repair-run-table-scans branch June 1, 2017 12:00
adejanovski added a commit that referenced this pull request Jun 1, 2017
* Cassandra performance: Replace sequence ids with time-based UUIDs
  Makes the schema changes in a separate migration step, so that data in the repair_unit and repair_schedule tables can be migrated over.

ref:
 - #99
 - #94
 - #99 (comment)

* Simplify the creation of repair runs and their segments.
 Repair runs and their segments are one unit of work in concept and the persistence layer should be designed accordingly.
 Previously they were separated because the concern of sequence generation for IDs was exposed in the code. This is now encapsulated within storage implementations.
 This work allows the CassandraStorage to implement segments as clustering keys within the repair_run table.

ref:
 - #94
 - #101

* In CassandraStorage implement segments as clustering keys within the repair_run table.
This requires a change in IStorage so that a segment is identified by both runId and segmentId.

ref:
 - #94
 - #102

* Fix the computation of the number of parallel repairs
Downgrade to Dropwizard 1.0.7 and Guava 19.0 to fix dependency issues
Make repair manager schedule cycle configurable (was 30s hardcoded)

ref: #108

* In CassandraStorage, replace the table scan on `repair_run` with an async break-down of per-cluster run-throughs of known run IDs.

 ref: #105
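
The IStorage change mentioned in the commit message above, identifying a segment by both runId and segmentId, might look roughly like the following. This is a sketch only: `SegmentStoreSketch` and its methods are hypothetical names, segments are plain strings, and nested maps stand in for a partition (the run) that contains rows keyed by a clustering column (the segment).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

// Hypothetical illustration; not the project's IStorage interface. Not thread-safe.
class SegmentStoreSketch {

  // Outer key plays the role of the partition key (run id);
  // inner key plays the role of the clustering key (segment id).
  private final Map<UUID, Map<UUID, String>> segmentsByRun = new HashMap<>();

  void addSegment(UUID runId, UUID segmentId, String segment) {
    segmentsByRun.computeIfAbsent(runId, id -> new HashMap<>()).put(segmentId, segment);
  }

  // A segment id alone is not enough: the run id selects the partition first.
  Optional<String> getSegment(UUID runId, UUID segmentId) {
    Map<UUID, String> segments = segmentsByRun.get(runId);
    return segments == null
        ? Optional.empty()
        : Optional.ofNullable(segments.get(segmentId));
  }

  // Counts the segments stored under one run.
  int countSegments(UUID runId) {
    Map<UUID, String> segments = segmentsByRun.get(runId);
    return segments == null ? 0 : segments.size();
  }

  public static void main(String[] args) {
    SegmentStoreSketch store = new SegmentStoreSketch();
    UUID runId = UUID.randomUUID();
    UUID segmentId = UUID.randomUUID();
    store.addSegment(runId, segmentId, "example segment");
    System.out.println(store.getSegment(runId, segmentId).isPresent());
  }
}
```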
adejanovski pushed a commit that referenced this pull request Jun 26, 2017
…nc break-down of per cluster run-throughs of known run IDs.

 ref: #105
michaelsembwever pushed a commit that referenced this pull request Jun 27, 2017