Crash with KeyError due to missing key in states_to_token_maps #605

@viktor-ferenczi

Describe the issue as clearly as possible:

The states_to_token_maps generated in RegexFSM.__init__ is wrong for this specific regex. It is 100% reproducible.

Notice that state 8 transitions to state 9 on token 185, but no key 9 exists in the FSM, so generation crashes with a KeyError as soon as state 9 is looked up.

regex_string = '```\n(Program\\.cs\n)?```\n'

self.states_to_token_maps = {
 0: {63: 1, 4686: 2, 10252: 3},
 1: {63: 2, 4686: 3},
 2: {63: 3},
 3: {185: 4},
 4: {47: 5, 63: 6, 1426: 11, 4686: 7, 5959: 10, 10252: 8, 16097: 15},
 5: {81: 10, 295: 11, 12483: 12},
 6: {63: 7, 4686: 8},
 7: {63: 8},
 8: {185: 9},
 10: {78: 11, 493: 12, 18596: 15},
 11: {70: 12, 877: 13, 1644: 15, 16795: 14},
 12: {81: 13, 401: 14, 3477: 15},
 13: {64: 14, 302: 15},
 14: {76: 15},
 15: {13: 16},
 16: {66: 17, 5494: 18},
 17: {82: 18},
 18: {185: 19},
 19: {63: 6, 4686: 7, 10252: 8},
}
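
One quick way to confirm the gap is to scan the map for target states that never appear as keys. This is only a minimal sketch and not part of Outlines: `find_dangling_states` is a hypothetical helper, and the dictionary is copied from the dump above. The check cannot tell whether such a state is meant to be a final state or is a genuine omission; it only shows which lookups would fail.

```python
# states_to_token_maps copied from the dump in this issue.
states_to_token_maps = {
    0: {63: 1, 4686: 2, 10252: 3},
    1: {63: 2, 4686: 3},
    2: {63: 3},
    3: {185: 4},
    4: {47: 5, 63: 6, 1426: 11, 4686: 7, 5959: 10, 10252: 8, 16097: 15},
    5: {81: 10, 295: 11, 12483: 12},
    6: {63: 7, 4686: 8},
    7: {63: 8},
    8: {185: 9},
    10: {78: 11, 493: 12, 18596: 15},
    11: {70: 12, 877: 13, 1644: 15, 16795: 14},
    12: {81: 13, 401: 14, 3477: 15},
    13: {64: 14, 302: 15},
    14: {76: 15},
    15: {13: 16},
    16: {66: 17, 5494: 18},
    17: {82: 18},
    18: {185: 19},
    19: {63: 6, 4686: 7, 10252: 8},
}


def find_dangling_states(maps):
    """Return target states that have no entry of their own in `maps`.

    Hypothetical helper for illustration only; not an Outlines API.
    """
    targets = {dst for transitions in maps.values() for dst in transitions.values()}
    return targets - set(maps)


dangling = find_dangling_states(states_to_token_maps)
print(sorted(dangling))
```

For the map above, the only state reported is 9, which matches the `KeyError: 9` in the traceback.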

Steps/code to reproduce the bug:

Use the regex as a constraint on a prompt like:

Please list the all the filenames from the code block below in the same order.
Write your answer as a code block. Do not explain, do not apologize.
Write only the code block and nothing else.

Program.cs

Expected result:

Program.cs

Error message:

Exception raised while running such a query with vLLM 0.3.0 serving; the actual problem does not depend on vLLM at all:

INFO:     192.168.1.70:61435 - "POST /generate HTTP/1.1" 500 Internal Server Error
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 29, in _raise_exception_on_finish
    task.result()
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 409, in run_engine_loop
    has_requests_in_progress = await self.engine_step()
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 388, in engine_step
    request_outputs = await self.engine.step_async()
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 189, in step_async
    all_outputs = await self._run_workers_async(
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 276, in _run_workers_async
    all_outputs = await asyncio.gather(*coros)
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/worker/worker.py", line 213, in execute_model
    output = self.model_runner.execute_model(seq_group_metadata_list,
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 542, in execute_model
    output = self.model.sample(
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/model_executor/models/llama.py", line 314, in sample
    next_tokens = self.sampler(self.lm_head.weight, hidden_states,
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/model_executor/layers/sampler.py", line 74, in forward
    logits = _apply_logits_processors(logits, sampling_metadata)
  File "/home/viktor/dep/outlines-contrib/outlines/serve/vllm.py", line 35, in _patched_apply_logits_processors
    logits_row = logits_processor(token_ids, logits_row)
  File "/home/viktor/dep/outlines-contrib/outlines/serve/vllm.py", line 140, in __call__
    state = self.fsm.get_state_by_token_ids(tuple(input_ids))
  File "/home/viktor/dep/outlines-contrib/outlines/serve/vllm.py", line 93, in get_state_by_token_ids
    new_state = self.next_state(prev_state, last_token)
  File "/home/viktor/dep/outlines-contrib/outlines/fsm/fsm.py", line 178, in next_state
    last_token_to_end_state = self.states_to_token_maps[state]
KeyError: 9
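
The failing lookup can be replayed without vLLM by walking the map by hand. The sketch below makes some assumptions: `walk` is a hypothetical helper that only mirrors the direct `states_to_token_maps[state]` indexing seen in the last traceback frame, the map is trimmed to the states on one path that reaches state 9, and the trailing token id `0` is an arbitrary placeholder. The mapping of token ids to text depends on the tokenizer and is not assumed here.

```python
# Trimmed copy of the map from this issue: only the states on one path to state 9.
states_to_token_maps = {
    0: {63: 1, 4686: 2, 10252: 3},
    3: {185: 4},
    4: {47: 5, 63: 6, 1426: 11, 4686: 7, 5959: 10, 10252: 8, 16097: 15},
    8: {185: 9},
}


def walk(maps, token_ids, start=0):
    """Follow `token_ids` through `maps` using plain dict indexing.

    Hypothetical helper for illustration; it is not the Outlines implementation.
    """
    state = start
    for token_id in token_ids:
        state = maps[state][token_id]  # raises KeyError if `state` has no entry
    return state


# Four tokens lead from state 0 to state 9 (0 -> 3 -> 4 -> 8 -> 9).
path_to_nine = [10252, 185, 10252, 185]
reached = walk(states_to_token_maps, path_to_nine)

# Any further token forces a lookup of state 9, which has no entry.
try:
    walk(states_to_token_maps, path_to_nine + [0])  # 0 is an arbitrary placeholder
    missing_state = None
except KeyError as exc:
    missing_state = exc.args[0]

print(reached, missing_state)
```

The walk reaches state 9 after four tokens, and the next lookup fails on the missing key 9, the same key as in the traceback above.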

Outlines/Python version information:

Version information

```
Package                   Version
------------------------- ----------
aiohttp                   3.9.1
aiosignal                 1.3.1
annotated-types           0.6.0
anyio                     4.2.0
async-timeout             4.0.3
attrs                     23.1.0
certifi                   2023.11.17
charset-normalizer        3.3.2
cloudpickle               3.0.0
colorama                  0.4.6
coverage                  7.4.0
diskcache                 5.6.3
distro                    1.8.0
exceptiongroup            1.2.0
filelock                  3.13.1
frozenlist                1.4.1
fsspec                    2023.12.2
h11                       0.14.0
httpcore                  1.0.2
httpx                     0.25.2
huggingface-hub           0.20.2
idna                      3.6
interegular               0.3.3
Jinja2                    3.1.3
joblib                    1.3.2
jsonschema                4.21.1
jsonschema-specifications 2023.12.1
lark                      1.1.9
llvmlite                  0.42.0
lxml                      5.0.0
MarkupSafe                2.1.3
mpmath                    1.3.0
multidict                 6.0.4
nest-asyncio              1.6.0
networkx                  3.2.1
numba                     0.59.0
numpy                     1.26.2
openai                    1.5.0
outlines                  0.0.25
packaging                 23.2
pip                       23.3.2
protobuf                  4.25.1
pydantic                  2.5.2
pydantic_core             2.14.5
PyYAML                    6.0.1
referencing               0.33.0
regex                     2023.10.3
requests                  2.31.0
rpds-py                   0.17.1
safetensors               0.4.1
scipy                     1.12.0
sentencepiece             0.1.99
setuptools                68.2.0
sniffio                   1.3.0
sympy                     1.12
tokenizers                0.15.0
toml                      0.10.2
torch                     2.2.0
tqdm                      4.66.1
transformers              4.36.2
typing_extensions         4.9.0
urllib3                   2.1.0
vllm-client               0.2.7.2
wheel                     0.41.2
yarl                      1.9.4
```

Context for the issue:

I've just started using constrained generation with vLLM based on outlines.serve.vllm. I found this issue while working on #539, but it is unrelated to the vLLM adapter, so I created this new ticket.