Implementation plan: Allow cancellation of prediction while running prompt #1

Open — wants to merge 1 commit into base: main
Conversation

0xfacade (Owner) commented May 25, 2023

Implementation plan for the backend part of "Allow cancellation of prediction while running prompt".

Planned changes: cancel prediction when the event source is closed in the UI.

Necessary steps:

  • In oasst_inference_server, which communicates directly with the UI, catch the asyncio.CancelledError that is raised when the client closes the event stream (see the example in the sse-starlette documentation); this signals that generation should be cancelled.
  • Propagate the cancellation by closing the stream to the worker defined in basic_hf_server.py.
  • Also catch the CancelledError in basic_hf_server.py; when this happens, set a flag indicating that inference should be stopped.
  • Add a stopping criterion to the model that checks whether the flag is set.
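The first three steps rely on asyncio's cancellation semantics: when the client disconnects, the task driving the event stream is cancelled, and a CancelledError surfaces inside the generator coroutine. The sketch below uses only the standard library to illustrate the pattern; the names (CancellationFlag, stream_tokens) are placeholders, not identifiers from the actual servers, and in the real code the same try/except would live inside the sse-starlette event generator.

```python
import asyncio

class CancellationFlag:
    """Shared flag that the inference loop can poll (hypothetical helper)."""
    def __init__(self):
        self.cancelled = False

async def stream_tokens(flag: CancellationFlag):
    # Stands in for the event-stream coroutine in oasst_inference_server.
    try:
        while True:
            await asyncio.sleep(0.01)  # pretend to wait for the next token
    except asyncio.CancelledError:
        # Raised when the client closes the event stream.
        flag.cancelled = True   # signal the worker that inference should stop
        raise                   # re-raise so asyncio finishes cancelling the task

async def main():
    flag = CancellationFlag()
    task = asyncio.create_task(stream_tokens(flag))
    await asyncio.sleep(0.05)
    task.cancel()  # simulates the client disconnecting
    try:
        await task
    except asyncio.CancelledError:
        pass
    return flag.cancelled

print(asyncio.run(main()))  # → True
```

Re-raising the CancelledError after setting the flag matters: swallowing it would leave the task in a half-cancelled state instead of letting asyncio tear it down cleanly.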

Check the diff in this MR for the exact lines where I would change something.
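For the last step, Hugging Face transformers lets generate() take a StoppingCriteriaList, where each criterion is called after every generated token and returning True halts generation. The sketch below shows the shape of such a criterion without importing transformers, so it stays self-contained; in basic_hf_server.py the class would subclass transformers.StoppingCriteria instead, and Flag is a stand-in for whatever object carries the cancellation state.

```python
class Flag:
    """Stand-in for the shared cancellation state (hypothetical)."""
    cancelled = False

class CancellationStoppingCriterion:
    """Stops generation once the shared flag is set.

    In the real server this would subclass transformers.StoppingCriteria
    and be passed via model.generate(stopping_criteria=...).
    """
    def __init__(self, flag):
        self.flag = flag  # object with a boolean `cancelled` attribute

    def __call__(self, input_ids=None, scores=None, **kwargs) -> bool:
        # Invoked after each token; True means "stop generating now".
        return self.flag.cancelled

flag = Flag()
criterion = CancellationStoppingCriterion(flag)
print(criterion())        # → False (generation continues)
flag.cancelled = True
print(criterion())        # → True (generation stops)
```

Because the criterion only reads a flag, the generation loop needs no other changes: the CancelledError handler flips the flag, and the model stops at the next token boundary.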
