
Changes to support input validation to match OpenAI behavior. #65

Conversation

@jroesch (Member) commented on Nov 15, 2023:

This PR exposes a per-request token validation callback, enabling us to validate requests post-tokenization. We mark the RequestState of a failing request as invalid and then effectively cancel it when we go to process it in the worker, which results in an error being returned to the caller of the generation API.

Internally, we use exception handling to trigger the validation error and propagate it back to the user.

cc @masahi @sunggg
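
A minimal sketch of the mechanism described above (the names RequestValidationError, validate, is_valid, and validation_err are illustrative assumptions, not the exact API in this PR):

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class RequestValidationError(Exception):
    """Hypothetical error raised when post-tokenization validation fails."""


@dataclass
class Request:
    request_id: str
    prompt: str
    # Optional per-request callback: receives the tokenized prompt and
    # raises RequestValidationError to reject the request.
    validate: Optional[Callable[[List[int]], None]] = None


@dataclass
class RequestState:
    request: Request
    token_ids: List[int]
    is_valid: bool = True
    validation_err: Optional[RequestValidationError] = None


def get_new_request_state(request: Request, tokenize) -> RequestState:
    token_ids = tokenize(request.prompt)
    state = RequestState(request=request, token_ids=token_ids)
    if request.validate is not None:
        try:
            request.validate(token_ids)
        except RequestValidationError as err:
            # Mark the state invalid; the worker later drops (effectively
            # cancels) it and the error is returned to the caller of the
            # generation API.
            state.is_valid = False
            state.validation_err = err
    return state


# Example callback: enforce a maximum context length, matching OpenAI's
# behavior of rejecting over-long prompts with a descriptive error.
def max_context_length(limit: int) -> Callable[[List[int]], None]:
    def check(token_ids: List[int]) -> None:
        if len(token_ids) > limit:
            raise RequestValidationError(
                f"maximum context length is {limit} tokens, but the "
                f"request has {len(token_ids)} tokens"
            )
    return check
```

Recording the error on the state, rather than raising at submission time, lets the worker drain invalid requests on its normal processing path and report the failure through the same channel as other per-request errors.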

@sunggg (Member) left a comment:

Thank you for the PR, @jroesch! This looks good to me.
I've just finished re-organizing the compile-time information dump (e.g., no more hf config dependency under models/) so that we maintain only the minimal set of info that mlc_serve needs, and this engine config was my next step.
Do you mind if I take it over? If that's okay, we can merge this quickly and iterate, or fold it into my PR.

Review thread on serve/mlc_serve/engine/base.py (outdated, resolved).
@jroesch force-pushed the jroesch/cto-53-expose-information-about-tokenization-from-mlc-serve-to-ollm branch from 8e50529 to 994accc on Nov 16, 2023 at 05:53.
@jroesch force-pushed the jroesch/cto-53-expose-information-about-tokenization-from-mlc-serve-to-ollm branch from 83829ad to 04f76fd on Nov 16, 2023 at 22:43.
@jroesch changed the title from "[WIP] Changes to support input validation to match OpenAI behavior." to "Changes to support input validation to match OpenAI behavior." on Nov 16, 2023.
@jroesch (Member Author) commented on Nov 16, 2023:

Okay, this is now cleaned up on top of Sung's previous changes.

@@ -102,7 +113,7 @@ def wait_for_request(self, timeout_seconds=None) -> bool:
         )

     def has_pending_requests(self) -> bool:
-        return self.queue or self.current_batch
+        return self.queue or self.current_batch or self.cancelled_requests
@jroesch (Member Author): This might be why cancellation was taking a while: we won't actually step forward to cancel requests if we don't submit new work to the queue or the current batch.

Reviewer (Member): Makes sense.
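
A minimal sketch of why the extra `or self.cancelled_requests` clause matters (the loop structure here is an assumption for illustration, not the actual mlc_serve engine code):

```python
def run_loop(engine) -> None:
    while True:
        # Before this change, has_pending_requests() only checked
        # engine.queue and engine.current_batch. A cancellation that
        # arrived while both were empty did not count as pending work,
        # so the loop kept waiting and the request lingered uncancelled.
        if not engine.has_pending_requests():
            engine.wait_for_request(timeout_seconds=1)
            continue
        # step() is where cancelled requests are actually drained and
        # their callers notified, so cancelled_requests must count as
        # pending work for step() to run promptly.
        engine.step()
```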

@@ -192,14 +194,18 @@ def step(self) -> InferenceStepResult:
     def _is_ready_to_serve(self) -> bool:
         return self.worker_process is not None and self.worker_process.is_alive()

-    def _get_new_request_state(self, request: Request) -> RequestState:
+    def _get_new_request_state(self, request: Request) -> Optional[RequestState]:
Reviewer (Member): The return value is never Optional, is it?

@jroesch (Member Author): Ah yes, this is a piece of bit rot from my previous attempt. Let me change that.
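
For context, a sketch of the corrected signature (assuming the exception-based flow from the PR description, where validation failures are recorded on the state rather than signaled by returning None; the class name here is hypothetical):

```python
class Engine:  # hypothetical container, stands in for the real engine class
    def _get_new_request_state(self, request: "Request") -> "RequestState":
        # A RequestState is always constructed; a failed validation marks
        # it invalid instead of returning None, so the return type is not
        # Optional.
        ...
```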

@masahi merged commit 43c8ebb into batch-serving on Nov 16, 2023. 10 checks passed.