Description
Describe the issue as clearly as possible:
The states_to_token_maps generated in RegexFSM.__init__ is wrong for this specific regex; the bug is 100% reproducible.
Notice that no key 9 exists in the FSM, even though state 9 is a transition target (8: {185: 9}), which causes a crash with a KeyError during generation.
regex_string = '```\n(Program\\.cs\n)?```\n'
self.states_to_token_maps = {
0: {63: 1, 4686: 2, 10252: 3},
1: {63: 2, 4686: 3},
2: {63: 3},
3: {185: 4},
4: {47: 5, 63: 6, 1426: 11, 4686: 7, 5959: 10, 10252: 8, 16097: 15},
5: {81: 10, 295: 11, 12483: 12},
6: {63: 7, 4686: 8},
7: {63: 8},
8: {185: 9},
10: {78: 11, 493: 12, 18596: 15},
11: {70: 12, 877: 13, 1644: 15, 16795: 14},
12: {81: 13, 401: 14, 3477: 15},
13: {64: 14, 302: 15},
14: {76: 15},
15: {13: 16},
16: {66: 17, 5494: 18},
17: {82: 18},
18: {185: 19},
19: {63: 6, 4686: 7, 10252: 8},
}

Steps/code to reproduce the bug:
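The inconsistency can be checked mechanically: collect every state that appears as a transition target and compare against the keys of the map. A minimal sketch using the dictionary above (which states outlines treats as final is an internal detail; the point is that 9 is the only target with no entry at all):

```python
# The (buggy) transition map reproduced from the report.
states_to_token_maps = {
    0: {63: 1, 4686: 2, 10252: 3},
    1: {63: 2, 4686: 3},
    2: {63: 3},
    3: {185: 4},
    4: {47: 5, 63: 6, 1426: 11, 4686: 7, 5959: 10, 10252: 8, 16097: 15},
    5: {81: 10, 295: 11, 12483: 12},
    6: {63: 7, 4686: 8},
    7: {63: 8},
    8: {185: 9},
    10: {78: 11, 493: 12, 18596: 15},
    11: {70: 12, 877: 13, 1644: 15, 16795: 14},
    12: {81: 13, 401: 14, 3477: 15},
    13: {64: 14, 302: 15},
    14: {76: 15},
    15: {13: 16},
    16: {66: 17, 5494: 18},
    17: {82: 18},
    18: {185: 19},
    19: {63: 6, 4686: 7, 10252: 8},
}

# Every state reachable as a transition target...
targets = {s for trans in states_to_token_maps.values() for s in trans.values()}
# ...minus the states that actually have an entry in the map.
dangling = targets - states_to_token_maps.keys()
print(dangling)  # {9}: reachable via 8 --185--> 9, but absent from the map
```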
Use the regex as a constraint on a prompt like:
Please list the all the filenames from the code block below in the same order.
Write your answer as a code block. Do not explain, do not apologize.
Write only the code block and nothing else.
Program.cs
Expected result:
Program.cs

Error message:
The exception below occurred while running such a query served with vLLM 0.3.0, but the actual problem does not depend on vLLM at all:
INFO: 192.168.1.70:61435 - "POST /generate HTTP/1.1" 500 Internal Server Error
ERROR: Exception in ASGI application
Traceback (most recent call last):
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 29, in _raise_exception_on_finish
task.result()
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 409, in run_engine_loop
has_requests_in_progress = await self.engine_step()
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 388, in engine_step
request_outputs = await self.engine.step_async()
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 189, in step_async
all_outputs = await self._run_workers_async(
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/engine/async_llm_engine.py", line 276, in _run_workers_async
all_outputs = await asyncio.gather(*coros)
File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/worker/worker.py", line 213, in execute_model
output = self.model_runner.execute_model(seq_group_metadata_list,
File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/worker/model_runner.py", line 542, in execute_model
output = self.model.sample(
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/model_executor/models/llama.py", line 314, in sample
next_tokens = self.sampler(self.lm_head.weight, hidden_states,
File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/viktor/env/outlines/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/home/viktor/env/outlines/lib/python3.10/site-packages/vllm/model_executor/layers/sampler.py", line 74, in forward
logits = _apply_logits_processors(logits, sampling_metadata)
File "/home/viktor/dep/outlines-contrib/outlines/serve/vllm.py", line 35, in _patched_apply_logits_processors
logits_row = logits_processor(token_ids, logits_row)
File "/home/viktor/dep/outlines-contrib/outlines/serve/vllm.py", line 140, in __call__
state = self.fsm.get_state_by_token_ids(tuple(input_ids))
File "/home/viktor/dep/outlines-contrib/outlines/serve/vllm.py", line 93, in get_state_by_token_ids
new_state = self.next_state(prev_state, last_token)
File "/home/viktor/dep/outlines-contrib/outlines/fsm/fsm.py", line 178, in next_state
last_token_to_end_state = self.states_to_token_maps[state]
KeyError: 9
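As a sanity check, the pattern itself is well-formed and the expected completion satisfies it under Python's re module, so the failure lies in the FSM construction rather than in the regex:

```python
import re

# The constraint regex from the report.
regex_string = '```\n(Program\\.cs\n)?```\n'

# Both the variant with the filename and the empty code block should match,
# since the (Program\.cs\n)? group is optional.
with_file = '```\nProgram.cs\n```\n'
empty = '```\n```\n'
print(bool(re.fullmatch(regex_string, with_file)))  # True
print(bool(re.fullmatch(regex_string, empty)))      # True
```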
Outlines/Python version information:
Context for the issue:
I've just started using constrained generation with vLLM based on outlines.serve.vllm. I found this issue while working on #539, but it is unrelated to the vLLM adapter, so I created this new ticket.
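Until the construction bug is fixed, a defensive lookup would at least fail with a clearer message than the bare KeyError above. This is a hypothetical sketch, not outlines' actual code; the function name and map shape merely mirror the traceback:

```python
def next_state(states_to_token_maps, state, token_id):
    """Hypothetical defensive version of the lookup that crashed above:
    report an inconsistent FSM map explicitly instead of a bare KeyError."""
    transitions = states_to_token_maps.get(state)
    if transitions is None:
        raise ValueError(
            f"FSM is inconsistent: state {state} is reachable "
            f"but has no entry in states_to_token_maps"
        )
    return transitions[token_id]

# Reproducing the failing step with a fragment of the buggy map:
fragment = {8: {185: 9}}
state = next_state(fragment, 8, 185)  # follows 8 --185--> 9
try:
    next_state(fragment, state, 185)
except ValueError as err:
    print(err)  # FSM is inconsistent: state 9 is reachable but has no entry ...
```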