Collect protocol implementation #105
Comments
When servicing an …
Leader needs similar enforcement. This should be done in the …
Implements the helper's `aggregate_share` endpoint. The assumption is that the process of preparing input shares into output shares will create rows in `batch_aggregations` and update the `checksum` and `aggregate_share` columns as individual reports are prepared. Then, all the `/aggregate_share` handler has to do is sum the aggregate shares it finds. Note that this does not include support for the protocol changes in [1], nor does it include enforcement of a task's `max_batch_lifetime`. [1]: ietf-wg-ppm/draft-ietf-ppm-dap#224 Part of #105
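The summation step itself is straightforward once those per-unit rows exist. Below is a minimal sketch with hypothetical stand-in types (the real handler works over the VDAF's field elements and the datastore's rows): aggregate shares add element-wise, report counts add, and the per-unit checksums combine by XOR.

```rust
// Hypothetical stand-in for a batch_aggregations row.
struct BatchUnitAggregation {
    aggregate_share: Vec<u64>, // stand-in for the VDAF's field-element vector
    report_count: u64,
    checksum: [u8; 32],
}

fn sum_batch_units(units: &[BatchUnitAggregation]) -> (Vec<u64>, u64, [u8; 32]) {
    let mut share = vec![0u64; units.first().map_or(0, |u| u.aggregate_share.len())];
    let mut report_count = 0u64;
    let mut checksum = [0u8; 32];
    for unit in units {
        // Aggregate shares are additive, so the batch share is the sum of the
        // per-unit shares (field addition in the real code, not wrapping u64).
        for (acc, v) in share.iter_mut().zip(&unit.aggregate_share) {
            *acc = acc.wrapping_add(*v);
        }
        report_count += unit.report_count;
        // Per-unit checksums combine by XOR.
        for (acc, b) in checksum.iter_mut().zip(&unit.checksum) {
            *acc ^= *b;
        }
    }
    (share, report_count, checksum)
}
```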
`AggregateShareReq` now includes the aggregation parameter[1], which we now reflect in `messages::AggregateShareReq`. We store the encoded aggregation parameter in the datastore, and rename `batch_aggregations` to `batch_unit_aggregations` along the way, for clarity. [1]: ietf-wg-ppm/draft-ietf-ppm-dap#224 Part of #105
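For orientation, here is a rough sketch of the message shape after this change; the field names and types are simplified stand-ins, not the exact DAP wire encoding.

```rust
// Simplified stand-in for the DAP wire message; not the exact encoding.
struct Interval {
    start: u64,    // seconds since the UNIX epoch
    duration: u64, // seconds
}

struct AggregateShareReq {
    task_id: [u8; 32],
    batch_interval: Interval,
    aggregation_param: Vec<u8>, // opaque, VDAF-specific; the newly added field
    report_count: u64,
    checksum: [u8; 32],
}
```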
Adds a new database table `aggregate_share_jobs`, used by the helper to store the results of successfully serviced `AggregateShareReq`s. This allows the leader to retry an `AggregateShareReq` indefinitely, provided the parameters don't change. The leader's `collect_jobs` table now also has some nullable columns where it can cache the leader's and helper's aggregate shares, to similarly allow the collector to retry requests to the collect job URI. Part of #105
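The retry semantics this table enables look roughly like the following sketch, with hypothetical types; the real lookup is keyed on the request's task ID, batch interval, and aggregation parameter in the datastore.

```rust
// Hypothetical row type for the new aggregate_share_jobs table.
struct AggregateShareJob {
    report_count: u64,
    checksum: [u8; 32],
    helper_encrypted_aggregate_share: Vec<u8>, // what gets sent back to the leader
}

enum AggregateShareOutcome {
    /// A matching job exists: return the stored share again, without counting
    /// another use against the batch's lifetime.
    Cached(Vec<u8>),
    /// No matching job yet: service the request and insert a new row.
    ComputeAndStore,
    /// A job exists for these parameters but the request disagrees with it.
    BatchMismatch,
}

fn handle_repeated_request(
    existing: Option<&AggregateShareJob>,
    report_count: u64,
    checksum: [u8; 32],
) -> AggregateShareOutcome {
    match existing {
        None => AggregateShareOutcome::ComputeAndStore,
        Some(job) if job.report_count == report_count && job.checksum == checksum => {
            AggregateShareOutcome::Cached(job.helper_encrypted_aggregate_share.clone())
        }
        Some(_) => AggregateShareOutcome::BatchMismatch,
    }
}
```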
To enforce a task's `max_batch_lifetime`, we need to know how many times each batch unit in an `AggregateShareReq`'s `batch_interval` has been collected, that is, how many rows in `aggregate_share_jobs` have a `batch_interval` that contains the batch unit's interval. `datastore::Transaction::get_aggregate_share_job_count_by_batch_unit` is meant to be used with one batch unit interval at a time. I suspect this could be optimized into a single SQL query that checks multiple batch units at once. Part of #105
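A speculative sketch of that single-query form, written as the kind of SQL string the datastore module would prepare; the table and column names here are assumptions. It counts, per batch unit in the requested interval, how many existing aggregate share jobs cover that unit.

```rust
// Assumed schema: aggregate_share_jobs(id, task_id, batch_interval_start, batch_interval_end).
// $1 = task ID, $2 = first batch unit start, $3 = last batch unit start, $4 = batch unit duration.
const COUNT_COLLECTIONS_PER_BATCH_UNIT: &str = "
    SELECT units.unit_start, COUNT(aggregate_share_jobs.id) AS collect_count
    FROM generate_series($2::TIMESTAMP, $3::TIMESTAMP, $4::INTERVAL) AS units(unit_start)
    LEFT JOIN aggregate_share_jobs
      ON aggregate_share_jobs.task_id = $1
     AND aggregate_share_jobs.batch_interval_start <= units.unit_start
     AND aggregate_share_jobs.batch_interval_end >= units.unit_start + $4::INTERVAL
    GROUP BY units.unit_start";
```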
When servicing a collect request, the leader must generate a collect job URI relative to the public base URL from which the API is served and then stick that in a `Location` header. We now provide that base URL in the aggregator's config file. Part of #105
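A minimal sketch of that response, assuming the configured base URL ends with a trailing slash and that collect job IDs are UUIDs; the real handler also persists the new collect job before replying.

```rust
use url::Url;
use uuid::Uuid;
use warp::http::StatusCode;

// Sketch only: builds the collect job URI relative to the configured public
// base URL and returns it in a Location header on a 303 See Other reply.
fn collect_reply(public_base_url: &Url, collect_job_id: Uuid) -> impl warp::Reply {
    let location = public_base_url
        .join(&format!("collect_jobs/{collect_job_id}"))
        .expect("public base URL should accept a relative path");
    warp::reply::with_status(
        warp::reply::with_header(warp::reply(), "Location", location.as_str()),
        StatusCode::SEE_OTHER,
    )
}
```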
Refactors some existing code that supports the helper's `max_batch_lifetime` enforcement so that it can be re-used in the leader's `collect` endpoint. All the logic for that endpoint now moves into methods on `VdafOps`. Part of #105
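The dispatch shape that makes this sharing possible looks roughly like the sketch below; the variant names and method are illustrative only, not Janus's actual enum.

```rust
// Illustrative only: per-VDAF logic hangs off one enum so the /collect and
// /aggregate_share paths can call the same validation helpers.
enum VdafOps {
    Count,
    Sum { bits: u32 },
}

impl VdafOps {
    /// Example of a shared check: the requested batch interval must be at
    /// least min_batch_duration long and aligned to it (all values in seconds).
    fn batch_interval_is_valid(&self, start: u64, duration: u64, min_batch_duration: u64) -> bool {
        duration >= min_batch_duration
            && start % min_batch_duration == 0
            && duration % min_batch_duration == 0
    }
}
```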
Leader now consults the task parameters to determine what base URL to use when constructing collect job URIs. This assumes that a leader will serve collect jobs from the same base URL that it serves other endpoints like `/upload` or `/collect`. Part of #105
Adds a warp filter for path `/collect_jobs/{collect_job_id}` to the leader. Adds support for querying collect jobs from the datastore as well as updating them with helper and leader aggregate shares. The latter is currently only needed for tests, but will soon be used when running collect jobs. Part of #105
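The route shape is roughly the following sketch, which assumes collect job IDs are UUIDs in the path; the real filter also threads a datastore handle into the handler and replies 202 Accepted until both aggregate shares are available.

```rust
use uuid::Uuid;
use warp::Filter;

// Sketch only: matches GET /collect_jobs/{collect_job_id}.
fn collect_jobs_get() -> impl Filter<Extract = impl warp::Reply, Error = warp::Rejection> + Clone {
    warp::path!("collect_jobs" / Uuid)
        .and(warp::get())
        .map(|collect_job_id: Uuid| {
            // Placeholder body standing in for the datastore lookup.
            warp::reply::json(&collect_job_id.to_string())
        })
}
```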
Factors logic for enumerating tasks and creating per-task jobs out of `aggregation_job_creator` and into a new module. Also adds a skeleton of `collect_job_creator` to show how this is used across multiple binary targets. Part of #105
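The factored-out shape is roughly a generic loop over tasks with a per-task callback, as in the sketch below; the names are hypothetical, and the real module also handles scheduling and error reporting.

```rust
use std::future::Future;

/// Sketch only: runs a job-creation callback once per known task. Each binary
/// (aggregation job creator, collect job creator) supplies its own callback.
async fn run_for_each_task<F, Fut>(task_ids: Vec<[u8; 32]>, mut create_jobs_for_task: F)
where
    F: FnMut([u8; 32]) -> Fut,
    Fut: Future<Output = ()>,
{
    for task_id in task_ids {
        create_jobs_for_task(task_id).await;
    }
}
```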
Factors logic for discovering incomplete jobs out of `aggregation_job_driver` and into a new module. Adds a skeleton of `collect_job_creator` to show how this is used across multiple binary targets. Part of #105
Adds support for acquiring and releasing leases on collect jobs to the datastore module, which will soon be used by the collect job driver to drive jobs. Part of #105
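A speculative sketch of what lease acquisition can look like as a single SQL statement; the table and column names are assumptions. Jobs whose lease has expired become eligible again, `SKIP LOCKED` keeps concurrently running drivers from grabbing the same rows, and releasing a lease amounts to resetting the expiry.

```rust
// $1 = new lease expiry, $2 = current time, $3 = maximum number of jobs to acquire.
const ACQUIRE_COLLECT_JOB_LEASES: &str = "
    UPDATE collect_jobs SET lease_expiry = $1
    WHERE id IN (
        SELECT id FROM collect_jobs
        WHERE helper_aggregate_share IS NULL AND lease_expiry <= $2
        ORDER BY id
        LIMIT $3
        FOR UPDATE SKIP LOCKED
    )
    RETURNING id";
```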
Fleshes out the implementation of the Janus collect job driver. Some existing logic used for the helper's `/aggregate_share` handler is refactored into `mod aggregate_share` so it can be used in `collect_job_driver`. Additionally, the existing `update_collect_job_*` methods on `datastore::Transaction` are collapsed into a single method that sets the helper aggregate share, leader aggregate share, report count and checksum in a single operation. This simplifies the logic of the collect job driver since it doesn't have to deal with the case where the leader's aggregate share has been computed but the helper's is not yet known. The downside is that if a collect job fails because the helper failed to compute its aggregate share, then the leader will recompute its share "from scratch" the next time the collect job is run. If helpers fail often enough to make caching the leader aggregate share worthwhile, then we probably have bigger problems than this performance issue. Part of #105
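One step of the driver then reads roughly as follows, with the datastore and HTTP interactions abstracted into hypothetical futures supplied by the caller.

```rust
// Hypothetical result type: everything the collapsed datastore update writes at once.
struct CollectJobUpdate {
    leader_aggregate_share: Vec<u8>,
    helper_encrypted_aggregate_share: Vec<u8>,
    report_count: u64,
    checksum: [u8; 32],
}

async fn step_collect_job(
    compute_leader_share: impl std::future::Future<Output = (Vec<u8>, u64, [u8; 32])>,
    request_helper_share: impl std::future::Future<Output = Result<Vec<u8>, String>>,
) -> Result<CollectJobUpdate, String> {
    // The leader share is recomputed from batch unit aggregations on every
    // attempt, which is why a helper failure costs a recomputation next time.
    let (leader_aggregate_share, report_count, checksum) = compute_leader_share.await;
    let helper_encrypted_aggregate_share = request_helper_share.await?;
    // Both shares, the report count and the checksum are then written to the
    // collect job row in a single datastore operation.
    Ok(CollectJobUpdate {
        leader_aggregate_share,
        helper_encrypted_aggregate_share,
        report_count,
        checksum,
    })
}
```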
All this work is done, closing!
To implement the collect protocol, we need:

- `[leader]/collect` endpoint (Skeleton of the leader `/collect` endpoint #93, Store collect jobs in the database #97)
- `[helper]/aggregate_share` endpoint (Helper `/aggregate_share` endpoint #103, Store aggregate share requests serviced by helper #111)