With the latest code, running api.py (or webui.py) fails with: ImportError: cannot import name 'MultiModalData' from 'vllm.sequence' (/usr/local/lib/python3.10/dist-packages/vllm/sequence.py) #3645
Comments
The project requires vllm >= 0.4.0.
Thanks for the pointer.
When I install with pip install vllm, I get RuntimeError: No suitable kernel. h_in=8 h_out=18944 dtype=Float out_dtype=BFloat16 — this appeared when running API inference after bf16 LoRA training of Qwen2-7B. I then located the vllm source, added f(in_T, out_T, W_T, narrow, 18944) \, and rebuilt from source, after which I got cannot import name 'MultiModalData' from 'vllm.sequence'. Was training in bf16 a mistake from the start? Please advise, thanks.
vllm 0.5.0 with Python 3.10 raises this error.
vllm 0.5.0 with Python 3.11 also raises it.
vllm 0.5.0 with Python 3.9.18 also raises it.
Downgrading vllm from 0.5.0 to 0.4.3 (pip install vllm==0.4.3) fixes it — personally verified!
Does 0.4.3 support Qwen2? And if a 0.5.x version is required, is there a workaround?
Hi there, I'm struggling with vllm MultiModalData as well. I have tried vllm from 0.4.3 through 0.5.4; none of them work. I am using Python 3.10 and llama-factory 0.8.0.
MultiModalData is no longer needed in the 0.5.x releases. The "multi_modal_data" argument of llm.generate can be written as in the sketch below.
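(The original snippet did not survive extraction; what follows is a minimal sketch of the dict-based call used by recent vllm releases, assuming a vision model — the model name, prompt template, and image path are illustrative, not from the original comment.)

```python
from PIL import Image
from vllm import LLM

# vllm >= 0.5.x: pass multi-modal inputs as a plain dict instead of
# constructing a MultiModalData object.
llm = LLM(model="llava-hf/llava-1.5-7b-hf")  # illustrative vision model
image = Image.open("example.jpg")            # hypothetical local image

outputs = llm.generate({
    "prompt": "USER: <image>\nWhat is in this picture? ASSISTANT:",
    "multi_modal_data": {"image": image},
})
print(outputs[0].outputs[0].text)
```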
Earlier versions also hit this problem when training on DCU.
Updating to the latest 0.5 release resolves it.
@hiyouga vllm 0.6.3 still has this problem.
vllm 0.6.3 no longer has the ImagePixelData class, so llmtuner needs an update.
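Until llmtuner is updated, a version guard along these lines sidesteps the hard import failure (a minimal sketch, not llmtuner's actual code):

```python
# Sketch: tolerate vllm releases that removed MultiModalData (>= 0.5.0).
try:
    from vllm.sequence import MultiModalData  # only exists in vllm < 0.5.0
except ImportError:
    MultiModalData = None  # newer vllm: pass multi_modal_data as a dict instead
```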
Reminder
Reproduction
(base) root@I19c2837ff800901ccf:/hy-tmp/LLaMA-Factory-main/src# CUDA_VISIBLE_DEVICES=0,1,2,3 python3.10 api.py \
/usr/local/lib/python3.10/dist-packages/bitsandbytes/cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers, 8-bit multiplication, and GPU quantization are unavailable.
warn("The installed version of bitsandbytes was compiled without GPU support. "
/usr/local/lib/python3.10/dist-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cadam32bit_grad_fp32
Traceback (most recent call last):
  File "/hy-tmp/LLaMA-Factory-main/src/api.py", line 5, in <module>
    from llmtuner.api.app import create_app
  File "/hy-tmp/LLaMA-Factory-main/src/llmtuner/api/app.py", line 5, in <module>
    from ..chat import ChatModel
  File "/hy-tmp/LLaMA-Factory-main/src/llmtuner/chat/__init__.py", line 2, in <module>
    from .chat_model import ChatModel
  File "/hy-tmp/LLaMA-Factory-main/src/llmtuner/chat/chat_model.py", line 8, in <module>
    from .vllm_engine import VllmEngine
  File "/hy-tmp/LLaMA-Factory-main/src/llmtuner/chat/vllm_engine.py", line 14, in <module>
    from vllm.sequence import MultiModalData
ImportError: cannot import name 'MultiModalData' from 'vllm.sequence' (/usr/local/lib/python3.10/dist-packages/vllm/sequence.py)
Expected behavior
Run the Qwen1.5-72B-Chat model on multiple GPUs via webui.py.
What is very strange: the script does not include --infer_backend vllm, so why does the error point at the vllm import?
Note: this output was captured in no-GPU mode; when actually running on 4 GPUs the error is essentially the same, only the "compiled without GPU support" warning is absent (GPU-attached mode does have GPUs).
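A likely explanation, inferred from the traceback above rather than confirmed by the maintainers: chat_model.py imports VllmEngine at module import time, so the vllm import runs even when --infer_backend vllm is never passed. A hedged sketch of deferring that import (the attribute names and fallback branch are illustrative, not the project's actual code):

```python
class ChatModel:
    def __init__(self, args) -> None:
        if getattr(args, "infer_backend", None) == "vllm":
            # Import vllm only when the vllm backend is actually requested,
            # so a missing or incompatible vllm cannot break other backends.
            from llmtuner.chat.vllm_engine import VllmEngine  # hypothetical lazy import
            self.engine = VllmEngine(args)
        else:
            self.engine = None  # the HuggingFace backend would be built here
```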
System Info
(base) root@I19c2837ff800901ccf:/hy-tmp/LLaMA-Factory-main/src# python3.10 -m pip list
Package Version
accelerate 0.28.0
addict 2.4.0
aiofiles 23.2.1
aiohttp 3.9.3
aiosignal 1.3.1
aliyun-python-sdk-core 2.15.0
aliyun-python-sdk-kms 2.16.2
altair 5.2.0
annotated-types 0.6.0
anyio 4.3.0
async-timeout 4.0.3
attrs 23.2.0
auto_gptq 0.7.1
bitsandbytes 0.43.0
certifi 2019.11.28
cffi 1.16.0
chardet 3.0.4
charset-normalizer 3.3.2
click 8.1.7
cloudpickle 3.0.0
contourpy 1.2.0
crcmod 1.7
cryptography 42.0.5
cupy-cuda12x 12.1.0
cycler 0.12.1
datasets 2.18.0
dbus-python 1.2.16
deepspeed 0.14.0
dill 0.3.8
diskcache 5.6.3
distro 1.4.0
distro-info 0.23ubuntu1
docstring_parser 0.16
einops 0.7.0
exceptiongroup 1.2.0
fastapi 0.110.0
fastrlock 0.8.2
ffmpy 0.3.2
filelock 3.13.3
fire 0.6.0
fonttools 4.50.0
frozenlist 1.4.1
fsspec 2024.2.0
galore-torch 1.0
gast 0.5.4
gekko 1.0.7
gradio 4.10.0
gradio_client 0.7.3
h11 0.14.0
hjson 3.1.0
httpcore 1.0.4
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.22.0
idna 2.8
importlib_metadata 7.1.0
importlib_resources 6.4.0
interegular 0.3.3
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
jsonschema 4.21.1
jsonschema-specifications 2023.12.1
kiwisolver 1.4.5
lark 1.1.9
llvmlite 0.42.0
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.8.3
mdurl 0.1.2
modelscope 1.13.3
mpmath 1.3.0
msgpack 1.0.8
multidict 6.0.5
multiprocess 0.70.16
nest-asyncio 1.6.0
networkx 3.2.1
ninja 1.11.1.1
numba 0.59.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.18.1
nvidia-nvjitlink-cu12 12.4.99
nvidia-nvtx-cu12 12.1.105
orjson 3.9.15
oss2 2.18.4
outlines 0.0.37
packaging 24.0
pandas 2.2.1
peft 0.10.0
pillow 10.2.0
pip 24.0
platformdirs 4.2.0
prometheus_client 0.20.0
protobuf 5.26.0
psutil 5.9.8
py-cpuinfo 9.0.0
pyarrow 15.0.2
pyarrow-hotfix 0.6
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.4
pydantic_core 2.16.3
pydub 0.25.1
Pygments 2.17.2
PyGObject 3.36.0
pynvml 11.5.0
pyparsing 3.1.2
python-apt 2.0.1+ubuntu0.20.4.1
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-multipart 0.0.9
pytz 2024.1
PyYAML 6.0.1
ray 2.10.0
referencing 0.34.0
regex 2023.12.25
requests 2.31.0
requests-unixsocket 0.2.0
rich 13.7.1
rouge 1.0.1
rpds-py 0.18.0
safetensors 0.4.2
scipy 1.12.0
semantic-version 2.10.0
sentencepiece 0.2.0
setuptools 69.2.0
shellingham 1.5.4
shtab 1.7.1
simplejson 3.19.2
six 1.14.0
sniffio 1.3.1
sortedcontainers 2.4.0
sse-starlette 2.0.0
ssh-import-id 5.10
starlette 0.36.3
sympy 1.12
termcolor 2.4.0
tokenizers 0.15.2
tomli 2.0.1
tomlkit 0.12.0
toolz 0.12.1
torch 2.1.2
tqdm 4.66.2
transformers 4.39.1
triton 2.1.0
trl 0.8.1
typer 0.12.3
typing_extensions 4.10.0
tyro 0.7.3
tzdata 2024.1
unattended-upgrades 0.1
urllib3 2.2.1
uvicorn 0.29.0
uvloop 0.19.0
vllm 0.3.3
watchfiles 0.21.0
websockets 11.0.3
wheel 0.34.2
xformers 0.0.23.post1
xxhash 3.4.1
yapf 0.40.2
yarl 1.9.4
zipp 3.18.1
Others
No response