Your current environment
The output of `python collect_env.py`
Your output of above commands here
🐛 Describe the bug
In the latest upstream vLLM code, a new attribute `pooling_params: Optional[PoolingParams]` was added to the `CachedRequestState` dataclass in vllm/vllm/v1/worker/gpu_input_batch.py. A pull request is needed to add `sampling_params = new_req_data.sampling_params` and pass the correct variables when initializing `CachedRequestState`. In addition, the `ModelRunnerOutput` construction needs the `pooler_output=pooler_output` argument.
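To illustrate the kind of change being requested, here is a minimal, self-contained sketch. It does not import vLLM: `SamplingParams`, `PoolingParams`, `NewRequestData`, `CachedRequestState`, and `ModelRunnerOutput` below are simplified stand-ins that keep only the fields mentioned in this report, and `build_state` / `build_output` are hypothetical helper names. The real classes have more fields, so treat this as a sketch of the shape of the fix rather than the actual patch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SamplingParams:
    """Stand-in for vLLM's SamplingParams (assumption: fields omitted)."""
    temperature: float = 1.0


@dataclass
class PoolingParams:
    """Stand-in for vLLM's PoolingParams (assumption: fields omitted)."""


@dataclass
class NewRequestData:
    """Stand-in for the scheduler's per-request data."""
    req_id: str
    sampling_params: Optional[SamplingParams] = None
    pooling_params: Optional[PoolingParams] = None


@dataclass
class CachedRequestState:
    """Stand-in: upstream now requires pooling_params as well."""
    req_id: str
    sampling_params: Optional[SamplingParams]
    pooling_params: Optional[PoolingParams]


@dataclass
class ModelRunnerOutput:
    """Stand-in: pooler_output has no default, so callers must pass it."""
    req_ids: list
    pooler_output: list


def build_state(new_req_data: NewRequestData) -> CachedRequestState:
    # Read sampling_params from the request data, as the report suggests,
    # and forward both parameter objects to CachedRequestState.
    sampling_params = new_req_data.sampling_params
    return CachedRequestState(
        req_id=new_req_data.req_id,
        sampling_params=sampling_params,
        pooling_params=new_req_data.pooling_params,
    )


def build_output(req_ids: list, pooler_output: list) -> ModelRunnerOutput:
    # Pass pooler_output explicitly; omitting it raises TypeError.
    return ModelRunnerOutput(req_ids=req_ids, pooler_output=pooler_output)


state = build_state(NewRequestData(req_id="r1", sampling_params=SamplingParams()))
output = build_output(["r1"], pooler_output=[])
```

With these stand-ins, constructing `CachedRequestState` without `pooling_params`, or `ModelRunnerOutput` without `pooler_output`, fails with a `TypeError` for a missing required argument, which is the class of failure this issue describes.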