Cleanup `handle_worker()` in preparation for #2815 (Stop generation early) (#3573)

In this PR, I clean up the `handle_worker()` method a bit so that I can later extend it (in a future PR). There are no functional changes in this PR.

Changes:

- collect the many variables in a new class `HandleWorkerContext` that also features methods for initialization and destruction
- collect methods to handle updating the session in new class `SessionManager`
- move management of futures into a new class `FuturesManager`
- extract the logic for handling a work request and a worker response from the main loop into their own respective functions

The last change is the most important one for my future changes. In the main loop of `handle_worker()`, we were already waiting for two different types of futures: newly dequeued work requests from the Redis work queue, and responses from the worker received over the websocket. I'll need to add a third type of future next that allows us to listen to requests to stop generating text (#2815).

The results of the different futures used to be differentiated based on their return type, which was very hard to read. I've created a decorator in `FuturesManager` that wraps the awaitable in another awaitable that returns a tuple, where the first entry is a `FutureType` enum value, and the second value is the result of awaiting the passed in awaitable. This makes it easy to distinguish what type of result was received.

I tested my changes by spinning up the inference server + worker with `docker compose`. Then I used the `text-client` to interface with the server.
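
To make the tagging idea concrete, here is a minimal sketch of wrapping an awaitable so that its result comes back as a `(FutureType, result)` tuple. It is not the code from this PR: the enum member names, the `tag_result` helper, and the two `fake_*` coroutines (stand-ins for the Redis dequeue and the websocket receive) are assumptions made for illustration, and the wrapper is written as a plain helper function rather than as a decorator inside `FuturesManager`.

```python
import asyncio
import enum
from typing import Any, Awaitable, List, Tuple


class FutureType(enum.Enum):
    # Member names are hypothetical; the PR only names the enum itself.
    WORK_REQUEST = "work_request"
    WORKER_RESPONSE = "worker_response"


async def tag_result(
    future_type: FutureType, awaitable: Awaitable[Any]
) -> Tuple[FutureType, Any]:
    """Await `awaitable` and pair its result with a tag describing its origin."""
    return future_type, await awaitable


async def fake_dequeue() -> dict:
    # Hypothetical stand-in for dequeuing a work request from the work queue.
    await asyncio.sleep(0.01)
    return {"prompt": "hello"}


async def fake_receive() -> str:
    # Hypothetical stand-in for receiving a worker response over the websocket.
    await asyncio.sleep(0.02)
    return "token"


async def collect_tagged_results() -> List[Tuple[FutureType, Any]]:
    """Wait on both kinds of futures and dispatch on the tag, not the result type."""
    pending = {
        asyncio.ensure_future(tag_result(FutureType.WORK_REQUEST, fake_dequeue())),
        asyncio.ensure_future(tag_result(FutureType.WORKER_RESPONSE, fake_receive())),
    }
    seen: List[Tuple[FutureType, Any]] = []
    while pending:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED
        )
        for fut in done:
            future_type, result = fut.result()
            # A real loop would branch here, e.g. on FutureType.WORK_REQUEST.
            seen.append((future_type, result))
    return seen


if __name__ == "__main__":
    for future_type, result in asyncio.run(collect_tagged_results()):
        print(future_type.name, result)
```

With this shape, adding a third kind of future (such as a stop request) would only need a new enum member and one more branch in the loop, instead of another check on the result's type.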