[Usage]: how to request a qwen2.5-VL-7B classify model served by vllm using openai SDK? #27413

@muziyongshixin

Description

Your current environment

The output of `python collect_env.py`

How would you like to use vllm

I launch a server with the following command to serve a Qwen2.5-VL-7B model fine-tuned for sequence classification (this model replaces the lm_head with a two-class score_head).

The launch command is :

vllm serve --model=//video_classification/qwenvl_7b_video_cls/v5-20251011-121851/2340_vllm_format --served_model_name Qwen2.5-7B-shenhe --task=classify --port=8080 --tensor-parallel-size=2

I don't know how to send requests to this server with the OpenAI SDK.
The code snippet shown below works well with pure text, but I get a 400 Bad Request when I put a video URL into the prompt.

This works well:

# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Example Python client for classification API using vLLM API server
NOTE:
    start a supported classification model server with `vllm serve`, e.g.
    vllm serve jason9693/Qwen2.5-1.5B-apeach
"""

import argparse
import pprint

import requests


def post_http_request(payload: dict, api_url: str) -> requests.Response:
    headers = {"User-Agent": "Test Client"}
    response = requests.post(api_url, headers=headers, json=payload)
    return response


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--host", type=str, default="localhost")
    parser.add_argument("--port", type=int, default=8000)
    parser.add_argument("--model", type=str, default="jason9693/Qwen2.5-1.5B-apeach")
    return parser.parse_args()


def main(args):
    host = args.host
    port = args.port
    model_name = args.model

    api_url = f"http://{host}:{port}/classify"
    prompts = [
        "Hello, my name is",
        "The president of the United States is",
        "The capital of France is",
        "The future of AI is",
    ]

    payload = {
        "model": model_name,
        "input": prompts,
    }

    classify_response = post_http_request(payload=payload, api_url=api_url)
    pprint.pprint(classify_response.json())


if __name__ == "__main__":
    args = parse_args()
    main(args)

But if I replace the prompts with multimodal data, the server returns the 400 error:

video_url = "https://js-ad.a.yximgs.com/bs2/ad_nieuwland-material/t2i2v/videos/3525031242883943515-140276939618048_24597237897733_v0_1759927515165406_3.mp4"

prompts = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "You are a professional video quality analyst. Carefully judge whether the video provided below has quality issues.\nQuality issues include but are not limited to:\n1. Poor picture quality, blurry frames, flickering brightness\n2. Blurry text in the frame\n3. Content that violates real-world physical logic, e.g. limbs or heads appearing out of nowhere, a wrong number of fingers or arms, unnatural legs\n4. Motion that violates physical laws, e.g. objects appearing out of nowhere, stuttering, shaking, jittering, or jumping frames\n\nReturn 0 if the video has issues, return 1 if it does not.\n## The video is as follows\n",
            },
            {"type": "video", "video": video_url},
        ],
    }
]
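For reference, here is a minimal sketch of how such a request body might be shaped. It assumes the /classify endpoint accepts OpenAI-chat-style "messages" with multimodal content parts, which is exactly the open question here (the 400 above suggests it may not for this route); the helper name `build_multimodal_classify_payload` and the example URL are hypothetical.

```python
# Sketch: build a chat-style payload for a classification request.
# ASSUMPTION: whether vLLM's /classify endpoint accepts "messages" (rather
# than the plain "input" list of strings) is not confirmed here -- the
# content-part layout mirrors the OpenAI chat completions multimodal format.


def build_multimodal_classify_payload(model: str, text: str, video_url: str) -> dict:
    """Return an OpenAI-chat-style payload mixing text and video parts."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": text},
                    # The exact video key ("video_url" with a nested
                    # {"url": ...} vs a bare "video" string) varies
                    # between servers; both variants are seen in the wild.
                    {"type": "video_url", "video_url": {"url": video_url}},
                ],
            }
        ],
    }


payload = build_multimodal_classify_payload(
    "Qwen2.5-7B-shenhe",
    "Return 0 if the video has quality issues, 1 otherwise.",
    "https://example.com/video.mp4",  # hypothetical URL for illustration
)
```

If /classify rejects this shape, comparing it against the request body the chat completions endpoint accepts for the same model is a quick way to isolate whether the problem is the endpoint or the content-part format.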

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
