Description
Your current environment
The output of `python collect_env.py`
How would you like to use vllm
I launch a server with the following command to serve a Qwen2.5-VL-7B model fine-tuned for sequence classification (the lm_head was replaced with a 2-class score_head).
The launch command is:
vllm serve --model=//video_classification/qwenvl_7b_video_cls/v5-20251011-121851/2340_vllm_format --served_model_name Qwen2.5-7B-shenhe --task=classify --port=8080 --tensor-parallel-size=2
I don't know how to query the server with the OpenAI SDK.
The code snippet shown below works well with pure-text input, but I get a 400 Bad Request when I put the video URL into the prompt.
This works well:
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright contributors to the vLLM project
"""Example Python client for classification API using vLLM API server
NOTE:
    start a supported classification model server with `vllm serve`, e.g.
    vllm serve jason9693/Qwen2.5-1.5B-apeach
"""

import argparse
import pprint

import requests


def post_http_request(payload: dict, api_url: str) -> requests.Response:
    headers = {"User-Agent": "Test Client"}
    response = requests.post(api_url, headers=headers, json=payload)
    return response


def parse_args():
    parse = argparse.ArgumentParser()
    parse.add_argument("--host", type=str, default="localhost")
    parse.add_argument("--port", type=int, default=8000)
    parse.add_argument("--model", type=str, default="jason9693/Qwen2.5-1.5B-apeach")
    return parse.parse_args()


def main(args):
    host = args.host
    port = args.port
    model_name = args.model

    api_url = f"http://{host}:{port}/classify"
    prompts = [
        "Hello, my name is",
        "The president of the United States is",
        "The capital of France is",
        "The future of AI is",
    ]

    payload = {
        "model": model_name,
        "input": prompts,
    }

    classify_response = post_http_request(payload=payload, api_url=api_url)
    pprint.pprint(classify_response.json())


if __name__ == "__main__":
    args = parse_args()
    main(args)
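Since the failing request returns 400 Bad Request, it can help to look at the response body before calling `.json()` on it, because the server usually explains there which field it rejected. The helper below is a minimal sketch of that idea; `describe_error` is a hypothetical name, and it makes no assumption about the shape of the server's error body beyond "possibly JSON".

```python
import json


def describe_error(status_code: int, body_text: str) -> str:
    """Summarize a non-2xx HTTP response; return '' for a 2xx response."""
    if 200 <= status_code < 300:
        return ""
    try:
        # Re-serialize JSON bodies so nested validation details stay on one line.
        body = json.loads(body_text)
    except json.JSONDecodeError:
        # Not JSON: fall back to the raw text, truncated to keep logs readable.
        return f"HTTP {status_code}: {body_text[:500]}"
    return f"HTTP {status_code}: {json.dumps(body, ensure_ascii=False)}"
```

With the client above, this could be called as `print(describe_error(classify_response.status_code, classify_response.text))` right after `post_http_request` returns.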
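But if I replace the prompts with multimodal data, the request fails:
video_url = "https://js-ad.a.yximgs.com/bs2/ad_nieuwland-material/t2i2v/videos/3525031242883943515-140276939618048_24597237897733_v0_1759927515165406_3.mp4"
# English translation of the prompt text below:
#   You are a professional video quality analyst. Carefully judge whether the
#   video provided below has quality problems.
#   Quality problems include but are not limited to:
#   1. Poor image quality, blurry frames, flickering brightness
#   2. Blurry text in the frame
#   3. Content that violates real-world physical logic, e.g. limbs or heads
#      appearing out of nowhere, a wrong number of fingers or arms, unnatural legs
#   4. Motion that violates the laws of physics, e.g. objects appearing out of
#      nowhere, stuttering, shaking, jittering, or jumping frames
#   Return 0 if the video has problems, and 1 if it does not.
#   ## The video content is as follows
prompts = [
    {"role": "user", "content": [
        {"type": "text", "text": "你是一个专业的视频质量分析师,请你仔细判断下方提供的视频是否存在质量问题\n质量问题包括但不限于:\n1.画面质量差,画面模糊,亮度闪烁\n2.画面中文字存在模糊问题\n3.视频画面不符合真实物理逻辑,例如凭空产生的人物肢体、头像、手指手臂数量不对,腿部不自然等问题\n4.画面运动不符合物理规律,例如凭空产生的物体,画面卡顿、晃动、抖动、跳动等\n\n如果视频存在问题请返回0,如果视频不存在问题请返回1。\n## 视频内容如下\n"},
        {"type": "video", "video": f"{video_url}"},
    ]
    }
]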
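For comparison, here is a sketch of how the same request might be expressed with the OpenAI-style `video_url` content part that vLLM's chat completions endpoint uses, instead of `{"type": "video", "video": ...}`. Two things here are assumptions, not confirmed behavior: that the `/classify` endpoint of the installed vLLM version accepts a chat-style `messages` field at all, and that it routes video parts the same way the chat endpoint does. `build_video_classify_payload` is a hypothetical helper name.

```python
def build_video_classify_payload(model: str, prompt_text: str, video_url: str) -> dict:
    """Build a chat-style request body with one text part and one video part.

    Assumption: the target endpoint accepts `messages`; this depends on the
    vLLM version and is not confirmed for /classify.
    """
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt_text},
                    # OpenAI-compatible shape used by vLLM chat completions.
                    {"type": "video_url", "video_url": {"url": video_url}},
                ],
            }
        ],
    }
```

The resulting dict would replace `payload` in the working client, e.g. `build_video_classify_payload("Qwen2.5-7B-shenhe", prompt_text, video_url)`, where `prompt_text` is the instruction string from the snippet above. If the server still answers with 400, the error body should indicate whether `messages` is unsupported on that endpoint.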
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.