+        """Validate sampling params for this logits processor.
+
+        Raise ValueError for invalid ones.
+        """
+        return None
+
 ```

 A vLLM logits processor must subclass `LogitsProcessor` and define (at minimum) the following methods:
@@ -279,6 +287,10 @@ A vLLM logits processor must subclass `LogitsProcessor` and define (at minimum)
 * Use the `BatchUpdate` members to update logits processor internal state
 * **Note:** batch update data structure may be `None`, signaling no change to the batch constituents. In this case, the LogitsProcessor might still want to update its state based on the updated `output_token_ids` lists that it could have retained when they were added.
+* Raise `ValueError` if `SamplingParams` has invalid arguments (especially custom arguments) used by the logits processor.
+* When a request is sent to the entrypoint, `validate_params()` validates `SamplingParams` and rejects requests with invalid arguments.
+
 ### `BatchUpdate` data structure

 The `BatchUpdate` abstraction models the persistent batch as a list of requests, supporting the following operations to change batch state (note that the order in which the operations are mentioned below reflects the order in which they should be processed in `update_state()`):
docs/features/custom_arguments.md: 3 additions & 0 deletions
@@ -4,6 +4,9 @@ You can use vLLM *custom arguments* to pass in arguments which are not part of t

 Custom arguments can be useful if, for example, you want to use a [custom logits processor](./custom_logitsprocs.md) without modifying the vLLM source code.

+!!! note
+    Make sure your custom logits processor implements `validate_params` for custom arguments. Otherwise, invalid custom arguments can cause unexpected behaviour.
+
 ## Offline Custom Arguments

 Custom arguments passed to `SamplingParams.extra_args` as a `dict` will be visible to any code which has access to `SamplingParams`:
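A minimal sketch of this visibility, under stated assumptions: `FakeSamplingParams` is a hypothetical stand-in for vLLM's `SamplingParams` (only the `extra_args` field is modelled), and `read_custom_arg` is a hypothetical helper, not part of vLLM. The argument name `target_token` is borrowed from the wrapper example elsewhere in this change.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class FakeSamplingParams:
    """Hypothetical stand-in for vLLM's SamplingParams (illustration only)."""

    # Custom arguments travel in a plain dict; None means none were supplied.
    extra_args: dict[str, Any] | None = None


def read_custom_arg(params: FakeSamplingParams, name: str) -> Any | None:
    """Return the custom argument `name`, or None if it was not supplied."""
    if not params.extra_args:
        return None
    return params.extra_args.get(name)


# Any code holding the params object can read the custom argument.
params = FakeSamplingParams(extra_args={"target_token": 67})
print(read_custom_arg(params, "target_token"))  # prints 67
print(read_custom_arg(FakeSamplingParams(), "target_token"))  # prints None
```

Treating both `None` and an empty dict as "no custom arguments" keeps callers from having to distinguish the two cases.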
+* Raise `ValueError` if `SamplingParams` has invalid arguments (especially custom arguments) used by the logits processor.
+* When a request is sent to the entrypoint, `validate_params()` validates `SamplingParams` and rejects requests with invalid arguments.
+* **Note:** it is important to implement `validate_params()` so that invalid parameters for a custom logits processor are rejected. Otherwise, requests with invalid parameters can cause unexpected behaviour in the custom logits processor.
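The validation bullets above can be sketched as follows. This is a hedged illustration, not vLLM's entrypoint code: `FakeSamplingParams`, `TargetTokenProcessor`, and `admit_request` are hypothetical names, and the way the entrypoint collects processor classes is an assumption. Only the `validate_params` classmethod shape and the `target_token` type check mirror the example in this change. The sketch reads the dict through `(params.extra_args or {})` so that an empty dict is treated as "no custom arguments".

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class FakeSamplingParams:
    """Hypothetical stand-in for vLLM's SamplingParams (illustration only)."""

    extra_args: dict[str, Any] | None = None


class TargetTokenProcessor:
    """Hypothetical logits processor that accepts a `target_token` custom argument."""

    @classmethod
    def validate_params(cls, params: FakeSamplingParams) -> None:
        # Missing dict, empty dict, and missing key all mean "not requested".
        target_token = (params.extra_args or {}).get("target_token")
        if target_token is not None and not isinstance(target_token, int):
            raise ValueError(f"target_token value {target_token} is not int")


def admit_request(params: FakeSamplingParams, processor_classes: list[type]) -> bool:
    """Assumed entrypoint behaviour: run every validator, refuse on ValueError."""
    try:
        for proc_cls in processor_classes:
            proc_cls.validate_params(params)
    except ValueError:
        return False
    return True


good = FakeSamplingParams(extra_args={"target_token": 5})
bad = FakeSamplingParams(extra_args={"target_token": "5"})
print(admit_request(good, [TargetTokenProcessor]))  # prints True
print(admit_request(bad, [TargetTokenProcessor]))  # prints False
```

Validating in a classmethod means a request can be rejected before any per-request processor state is created.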
 While request-level logits processors are explicitly *not* supported in the vLLM engine, vLLM *does* provide a convenient process to wrap an existing `Callable` request-level logits processor and create a batch-level logits processor that is compatible with vLLM. The `Callable` must conform to the type annotation above; if your request-level logits processor has a different interface, then in order to wrap it, you may need to modify it or implement an additional wrapper layer to comply with the interface specification above.

-You can wrap the request-level logits processor by subclassing `AdapterLogitsProcessor` as shown in the example below (in this example, `DummyPerReqLogitsProcessor` is a stand-in for your request-level logits processor which needs to be wrapped.) Override `AdapterLogitsProcessor.is_argmax_invariant(self)` to accurately reflect whether your request-level logits processor may impact which token has the highest-value logit. Override `AdapterLogitsProcessor.new_req_logits_processor(self,params)` to create a new request-level logits processor instance from a `SamplingParams` instance:
+You can wrap the request-level logits processor by subclassing `AdapterLogitsProcessor` as shown in the example below (in this example, `DummyPerReqLogitsProcessor` is a stand-in for your request-level logits processor which needs to be wrapped):
+
+* Override `AdapterLogitsProcessor.validate_params(cls, params)` to validate the request's sampling parameters.
+
+* Override `AdapterLogitsProcessor.is_argmax_invariant(self)` to accurately reflect whether your request-level logits processor may impact which token has the highest-value logit.
+
+* Override `AdapterLogitsProcessor.new_req_logits_processor(self, params)` to create a new request-level logits processor instance from a `SamplingParams` instance:

 ??? code "Example of Wrapping a Request-Level Logits Processor"

@@ -220,6 +241,16 @@ You can wrap the request-level logits processor by subclassing `AdapterLogitsPro
     """Example of wrapping a fake request-level logit processor to create a
     batch-level logits processor"""

+    @classmethod
+    def validate_params(cls, params: SamplingParams):
+        target_token: Any | None = params.extra_args and params.extra_args.get(
+            "target_token"
+        )
+        if target_token is not None and not isinstance(target_token, int):
+            raise ValueError(
+                f"target_token value {target_token} is not int"
+            )
+
     def is_argmax_invariant(self) -> bool:
         return False

@@ -240,18 +271,11 @@ You can wrap the request-level logits processor by subclassing `AdapterLogitsPro
         Returns:
             `Callable` request logits processor, or None
         """
-        target_token: Optional[Any] = params.extra_args and params.extra_args.get(
+        target_token: Any | None = params.extra_args and params.extra_args.get(
             "target_token"
         )
         if target_token is None:
             return None
-        if not isinstance(target_token, int):
-            logger.warning(
-                "target_token value %s is not int; not applying logits"