Add broker-side job stream service API #11707
Labels
component/broker
kind/toil
Categorizes an issue or PR as general maintenance, i.e. cleanup, refactoring, etc.
version:8.2.0
Marks an issue as being completely or in parts released in 8.2.0
Comments
6 tasks
ghost pushed a commit that referenced this issue Mar 6, 2023
11945: Add StreamRegistry on broker side to keep track of open streams r=deepthidevaki a=deepthidevaki
## Description
Adds a StreamRegistry to keep track of open streams between gateway and broker. This is in preparation for #11707.
## Related issues
Related #11707
Co-authored-by: Deepthi Devaki Akkoorath <deepthidevaki@gmail.com>
ghost pushed a commit that referenced this issue Mar 6, 2023
11939: Adds broker job stream service r=npepinpe a=npepinpe
## Description
This PR adds a server endpoint for the job stream API, as well as the minimum communication protocol. The endpoint can receive requests to:
1. Add streams to the registry
2. Remove a specific stream from the registry
3. Remove all streams associated with a member from the registry

Additionally, when a member is removed from the cluster, all streams associated with it are removed. In the future we can think about handling `REACHABILITY_CHANGED` to temporarily disable certain streams without completely removing them.
The `LongPollingJobNotification` was replaced with scaffolding for the `JobGatewayStreamer`, but the functionality is largely the same. This will be expanded in a follow up PR.
## Related issues
related to #11707
Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
Co-authored-by: Nicolas Pepin-Perreault <43373+npepinpe@users.noreply.github.com>
Additionally:
@deepthidevaki once you've merged the last PR, as discussed, please add a gauge for the registered stream count, trace logging of every job pushed, and debug logging of the stream request handler (adding, removing, etc.)
ghost pushed a commit that referenced this issue Mar 9, 2023
11961: Push jobs to registered streams r=npepinpe a=npepinpe
## Description
Implements a job specific `GatewayStreamer` and `GatewayStream`, which can push jobs out to specific streams once they've been activated by the engine. The implementation is currently very simple:
1. Jobs are pushed out asynchronously
2. The target stream is picked randomly from the set of consumers
3. There is no retry; we try with one consumer only

The `GatewayStream` also expects the jobs (i.e. payload) to be immutable, and the `ErrorHandler` to be thread-safe.
This PR does not close the issue, as we're still missing adding metrics and some logging (which will then close the issue).
## Related issues
related to #11707
Co-authored-by: Nicolas Pepin-Perreault <nicolas.pepin-perreault@camunda.com>
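The push behaviour described in this PR (asynchronous, one randomly chosen consumer, no retry) could look roughly like the sketch below. It is not the code from the PR: `JobSender`, `PushErrorHandler`, and `RandomJobPusher` are hypothetical names, consumers are represented as plain member ID strings, and reporting an empty consumer list to the error handler is an assumption.

```java
import java.util.List;
import java.util.concurrent.Executor;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical transport abstraction; stands in for whatever sends a job to a gateway.
interface JobSender<P> {
  void send(String consumerId, P payload) throws Exception;
}

// Hypothetical error callback; it may run on the executor's thread, so it must be thread-safe.
interface PushErrorHandler<P> {
  void handleError(Throwable error, P payload);
}

final class RandomJobPusher<P> {
  private final JobSender<P> sender;
  private final Executor executor;

  RandomJobPusher(JobSender<P> sender, Executor executor) {
    this.sender = sender;
    this.executor = executor;
  }

  // The payload is expected to be immutable, since it is handed to another thread.
  void push(List<String> consumers, P payload, PushErrorHandler<P> errorHandler) {
    if (consumers.isEmpty()) {
      // Assumption: with nobody to push to, the failure is reported immediately.
      errorHandler.handleError(new IllegalStateException("no consumer registered"), payload);
      return;
    }
    // Pick exactly one target at random.
    String target = consumers.get(ThreadLocalRandom.current().nextInt(consumers.size()));
    // Send asynchronously; on failure, report once and do not try another consumer.
    executor.execute(() -> {
      try {
        sender.send(target, payload);
      } catch (Exception e) {
        errorHandler.handleError(e, payload);
      }
    });
  }
}
```

With a direct executor such as `Runnable::run` the push runs inline, which is convenient for testing; a real setup would presumably hand it to an actor or thread pool.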
This issue was closed.
Description
Job workers/streams will be managed in-memory by the broker. Since workers are always interested in all partitions, this can be a long-lived singleton instance on the broker side.
The external job activator will be able to query this service to activate jobs.
In the prototype, I used SWIM to register/unregister workers/streams. However, I wasn't differentiating between workers other than by type. Since we want to keep the same API and treat each worker instance differently based on their properties, we cannot do this here.
There will be a new API in the broker which can receive commands from the gateway (i.e. using the protocol), with three RPCs:
- `AddWorker`
- `RemoveWorker`
- `RemoveWorkers`

An `AddWorker` request should contain the job type, the worker's activation properties, and the gateway/recipient member ID where a job would be pushed.
Workers should be uniquely identifiable by their gateway (i.e. push recipient member ID) and their activation properties. Two workers with the same recipient and the same properties are logically equivalent and do not need to be registered multiple times.
Workers can be removed on demand via the `RemoveWorker` API, using their recipient and properties to identify them.
A `RemoveWorkers` request is received from a gateway (with its member ID as param), and all workers associated with it are removed. If a gateway dies (as detected via SWIM), the same should happen.
Note that we are not implementing pushing jobs here yet, just managing the lifecycle of workers. With this done, we can then provide a dummy external job activator implementation (unusable for production of course) which can activate jobs. We can then test that the jobs are activated with the right properties, and variables are collected, etc.
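To make the worker lifecycle above concrete, here is a minimal in-memory sketch of the add/remove semantics. Everything in it is an assumption for illustration: `ActivationProperties`, `WorkerId`, and `WorkerRegistry` are hypothetical names, the fields chosen for the activation properties are guesses, and grouping workers by job type is a design choice of the sketch rather than something the issue prescribes.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical activation properties; the real set of fields may differ.
record ActivationProperties(String workerName, long timeoutMillis, Set<String> fetchVariables) {
  ActivationProperties {
    // Defensive immutable copy so equality cannot change after registration.
    fetchVariables = Set.copyOf(fetchVariables);
  }
}

// A worker is identified by its push recipient (gateway member ID) and its properties.
record WorkerId(String recipientMemberId, ActivationProperties properties) {}

final class WorkerRegistry {
  // job type -> logically distinct workers for that type
  private final Map<String, Set<WorkerId>> workersByType = new ConcurrentHashMap<>();

  // AddWorker: returns false when an equivalent worker is already registered.
  boolean addWorker(String jobType, WorkerId worker) {
    return workersByType
        .computeIfAbsent(jobType, type -> ConcurrentHashMap.newKeySet())
        .add(worker);
  }

  // RemoveWorker: identifies the worker by recipient and properties.
  boolean removeWorker(String jobType, WorkerId worker) {
    Set<WorkerId> workers = workersByType.get(jobType);
    return workers != null && workers.remove(worker);
  }

  // RemoveWorkers: drops every worker of one gateway; also usable when SWIM reports it dead.
  void removeWorkers(String recipientMemberId) {
    workersByType
        .values()
        .forEach(ws -> ws.removeIf(w -> w.recipientMemberId().equals(recipientMemberId)));
  }

  // Snapshot of the workers registered for a job type, e.g. for an external job activator.
  Set<WorkerId> workersFor(String jobType) {
    return Set.copyOf(workersByType.getOrDefault(jobType, Set.of()));
  }
}
```

Because records derive `equals` and `hashCode` from their components, registering the same recipient with the same properties twice leaves a single entry, which matches the "logically equivalent" rule above.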
I would recommend further breaking down this issue, or at least completing it in multiple PRs.