Changes to support input validation to match OpenAI behavior. #65
Conversation
Thank you for the PR, @jroesch! This looks good to me.
I've just finished re-organizing the compile-time information dump (e.g., no more hf config dependency under models/) so that it keeps only the minimum set of info that mlc_serve needs, and this engine config was my next step.
Do you mind if I take over? If that is okay, we can merge this quickly and iterate, or combine it into my PR.
Force-pushed from 8e50529 to 994accc.
Force-pushed from 83829ad to 04f76fd.
Okay, this is now cleaned up on top of Sung's previous changes.
@@ -102,7 +113,7 @@ def wait_for_request(self, timeout_seconds=None) -> bool:
         )

     def has_pending_requests(self) -> bool:
-        return self.queue or self.current_batch
+        return self.queue or self.current_batch or self.cancelled_requests
This might be why cancellation was taking a while: we won't actually step forward to cancel requests if we don't submit new work to the queue or the current batch.
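To illustrate the point, here is a hypothetical sketch (not the actual mlc_serve engine; the surrounding structure and names like MiniEngine are assumptions, only the attribute names mirror the diff above) of a serving loop gated on has_pending_requests(). If cancelled requests are not counted as pending, a cancellation that arrives while the queue and batch are empty is never processed until some new request wakes the loop.

from collections import deque


class MiniEngine:
    """Illustrative only; assumes a loop that only steps while work is pending."""

    def __init__(self):
        self.queue = deque()               # requests waiting to be scheduled
        self.current_batch = {}            # requests currently being decoded
        self.cancelled_requests = deque()  # requests awaiting cancellation

    def has_pending_requests(self) -> bool:
        # Without the cancelled_requests term, a lone pending cancellation
        # never counts as work, so the loop below would not step to handle it.
        return bool(self.queue or self.current_batch or self.cancelled_requests)

    def step(self) -> None:
        # Handle cancellations first, so they are acted on promptly.
        while self.cancelled_requests:
            req = self.cancelled_requests.popleft()
            self.current_batch.pop(req, None)  # drop it from the running batch
        # ... scheduling from self.queue and running decode is elided here ...

    def run(self) -> None:
        while self.has_pending_requests():
            self.step()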
Makes sense.
@@ -192,14 +194,18 @@ def step(self) -> InferenceStepResult:
     def _is_ready_to_serve(self) -> bool:
         return self.worker_process is not None and self.worker_process.is_alive()

-    def _get_new_request_state(self, request: Request) -> RequestState:
+    def _get_new_request_state(self, request: Request) -> Optional[RequestState]:
The return value is never Optional, is it?
Ah yes this is a piece of bit rot from my previous attempt. Let me change that.
This PR exposes a per-request token validation callback, enabling us to validate requests post-tokenization. We mark RequestStates as invalid and then effectively cancel them when we go to process them in the worker. This results in an error being returned to the caller of the generation API.
Internally, we use exception handling to trigger a validation error and return it to the user.
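As a rough sketch of the idea (not the actual mlc_serve API; ValidationError, TokenValidator, the RequestState fields, and make_context_window_validator are all assumed names for illustration), a per-request callback could run right after tokenization, with a failure marking the state invalid so the worker cancels it and surfaces the error to the caller:

from dataclasses import dataclass
from typing import Callable, List, Optional


class ValidationError(Exception):
    """Raised when a request fails post-tokenization validation."""


@dataclass
class RequestState:
    request_id: str
    token_ids: List[int]
    validation_err: Optional[ValidationError] = None  # set when validation fails
    is_cancelled: bool = False


# Per-request callback: receives the token ids and raises ValidationError,
# e.g. when the prompt exceeds the context window, mirroring how OpenAI
# rejects over-long prompts.
TokenValidator = Callable[[List[int]], None]


def make_context_window_validator(max_tokens: int) -> TokenValidator:
    def validate(token_ids: List[int]) -> None:
        if len(token_ids) > max_tokens:
            raise ValidationError(
                f"prompt has {len(token_ids)} tokens, exceeds limit of {max_tokens}"
            )
    return validate


def get_new_request_state(request_id: str, token_ids: List[int],
                          validate: TokenValidator) -> RequestState:
    state = RequestState(request_id=request_id, token_ids=token_ids)
    try:
        validate(token_ids)
    except ValidationError as err:
        # Mark the state invalid; the worker treats it as cancelled and
        # returns the error to the caller of the generation API.
        state.validation_err = err
        state.is_cancelled = True
    return state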
cc @masahi @sunggg