glm4-9b deployed on 4 × RTX 4090 GPUs: error when calling through the Dify API #315

Open
@he498

The following items must be checked before submission

  • Make sure you are using the latest code from the repository (git pull); some issues have already been addressed and fixed.
  • I have read the FAQ section of the project documentation and searched the existing issues / discussions without finding a similar problem or solution.

Type of problem

Model inference and deployment

Operating system

Linux

Detailed description of the problem

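My hardware environment is 4 × RTX 4090 with CUDA 12.1.
An error is raised when the model is called through Dify's external API. The Dify endpoint in use is the streaming one: /v1/chat-messages.
The model is deployed with the transformers-based method from this project's code.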
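For anyone trying to reproduce the call, here is a minimal sketch of a streaming request to /v1/chat-messages using only the Python standard library. The base URL, API key, and user id are placeholders (assumptions, not values from this report), and the request fields (`inputs`, `query`, `response_mode`, `conversation_id`, `user`) follow my understanding of the Dify chat API, so check them against your Dify version.

```python
import json
import urllib.request

# Hypothetical placeholders: replace with your own Dify host and app key.
DIFY_BASE_URL = "http://localhost/v1"
DIFY_API_KEY = "app-xxxx"


def build_chat_request(query, user="debug-user"):
    """Build (but do not send) a streaming POST to /chat-messages."""
    payload = {
        "inputs": {},
        "query": query,
        "response_mode": "streaming",  # the streaming mode described above
        "conversation_id": "",
        "user": user,  # hypothetical user id
    }
    return urllib.request.Request(
        DIFY_BASE_URL + "/chat-messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + DIFY_API_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def parse_sse_line(raw):
    """Decode one server-sent-event line; return its JSON dict or None."""
    line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    line = line.strip()
    if not line.startswith("data:"):
        return None
    try:
        return json.loads(line[len("data:"):].strip())
    except json.JSONDecodeError:
        return None


def stream_answer(query):
    """Send the request and yield answer chunks as they arrive."""
    with urllib.request.urlopen(build_chat_request(query)) as resp:
        for raw in resp:
            event = parse_sse_line(raw)
            if event and event.get("event") == "message":
                yield event.get("answer", "")
```

Calling `"".join(stream_answer("hello"))` against a running Dify instance would exercise the same path that triggers the error reported here.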

Dependencies

Package Version


accelerate 0.31.0
aiohttp 3.9.5
aiosignal 1.3.1
annotated-types 0.7.0
antlr4-python3-runtime 4.9.3
anyio 4.4.0
attrs 23.2.0
backoff 2.2.1
beautifulsoup4 4.12.3
bitsandbytes 0.42.0
certifi 2024.6.2
cffi 1.16.0
chardet 5.2.0
charset-normalizer 3.3.2
click 8.1.7
coloredlogs 15.0.1
contourpy 1.2.1
cpm-kernels 1.0.11
cryptography 42.0.8
cycler 0.12.1
dataclasses-json 0.6.7
dataclasses-json-speakeasy 0.5.11
Deprecated 1.2.14
distro 1.9.0
dnspython 2.6.1
effdet 0.4.1
einops 0.8.0
email_validator 2.1.2
emoji 2.12.1
et-xmlfile 1.1.0
fastapi 0.111.0
fastapi-cli 0.0.4
filelock 3.15.1
filetype 1.2.0
flatbuffers 24.3.25
fonttools 4.53.0
frozenlist 1.4.1
fsspec 2024.6.0
greenlet 3.0.3
h11 0.14.0
httpcore 1.0.5
httptools 0.6.1
httpx 0.27.0
huggingface-hub 0.23.4
humanfriendly 10.0
idna 3.7
iopath 0.1.10
Jinja2 3.1.4
joblib 1.4.2
jsonpatch 1.33
jsonpath-python 1.0.6
jsonpointer 3.0.0
kiwisolver 1.4.5
langchain 0.2.5
langchain-community 0.2.5
langchain-core 0.2.8
langchain-text-splitters 0.2.1
langdetect 1.0.9
langsmith 0.1.78
layoutparser 0.3.4
loguru 0.7.2
lxml 5.2.2
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
marshmallow 3.21.3
matplotlib 3.9.0
mdurl 0.1.2
mpmath 1.3.0
msg-parser 1.2.0
multidict 6.0.5
mypy-extensions 1.0.0
networkx 3.3
nltk 3.8.1
numpy 1.26.4
nvidia-cublas-cu12 12.1.3.1
nvidia-cuda-cupti-cu12 12.1.105
nvidia-cuda-nvrtc-cu12 12.1.105
nvidia-cuda-runtime-cu12 12.1.105
nvidia-cudnn-cu12 8.9.2.26
nvidia-cufft-cu12 11.0.2.54
nvidia-curand-cu12 10.3.2.106
nvidia-cusolver-cu12 11.4.5.107
nvidia-cusparse-cu12 12.1.0.106
nvidia-nccl-cu12 2.20.5
nvidia-nvjitlink-cu12 12.5.40
nvidia-nvtx-cu12 12.1.105
olefile 0.47
omegaconf 2.3.0
onnx 1.16.1
onnxruntime 1.15.1
openai 1.34.0
opencv-python 4.10.0.84
openparse 0.5.7
openpyxl 3.1.4
orjson 3.10.5
packaging 24.1
pandas 2.2.2
pdf2image 1.17.0
pdfminer.six 20231228
pdfplumber 0.11.1
peft 0.11.1
pikepdf 9.0.0
pillow 10.3.0
pillow_heif 0.16.0
pip 24.0
portalocker 2.8.2
protobuf 5.27.1
psutil 5.9.8
pyclipper 1.3.0.post5
pycocotools 2.0.8
pycparser 2.22
pydantic 2.7.4
pydantic_core 2.18.4
Pygments 2.18.0
PyMuPDF 1.24.5
PyMuPDFb 1.24.3
pypandoc 1.13
pyparsing 3.1.2
pypdf 4.2.0
pypdfium2 4.30.0
pytesseract 0.3.10
python-dateutil 2.9.0.post0
python-docx 1.1.2
python-dotenv 1.0.0
python-iso639 2024.4.27
python-magic 0.4.27
python-multipart 0.0.9
python-pptx 0.6.23
pytz 2024.1
PyYAML 6.0.1
rapidfuzz 3.9.3
rapidocr-onnxruntime 1.3.22
regex 2024.5.15
requests 2.32.3
rich 13.7.1
safetensors 0.4.3
scikit-learn 1.5.0
scipy 1.13.1
sentence-transformers 3.0.1
sentencepiece 0.2.0
setuptools 69.5.1
shapely 2.0.4
shellingham 1.5.4
six 1.16.0
sniffio 1.3.1
soupsieve 2.5
SQLAlchemy 2.0.30
sse-starlette 2.1.2
starlette 0.37.2
starlette-context 0.3.6
sympy 1.12.1
tabulate 0.9.0
tenacity 8.4.1
threadpoolctl 3.5.0
tiktoken 0.7.0
timm 1.0.3
tokenizers 0.19.1
torch 2.3.1
torchvision 0.18.1
tqdm 4.66.4
transformers 4.42.4
transformers-stream-generator 0.0.5
triton 2.3.1
typer 0.12.3
typing_extensions 4.12.2
typing-inspect 0.9.0
tzdata 2024.1
ujson 5.10.0
unstructured 0.13.2
unstructured-client 0.18.0
unstructured-inference 0.7.25
unstructured.pytesseract 0.3.12
urllib3 2.2.2
uvicorn 0.30.1
uvloop 0.19.0
watchfiles 0.22.0
websockets 12.0
wheel 0.43.0
wrapt 1.16.0
xlrd 2.0.1
XlsxWriter 3.2.0
yarl 1.9.4

Runtime logs or screenshots

Exception in thread Thread-4 (generate):
Traceback (most recent call last):
  File "/data/conda/aconda3/envs/glm4/lib/python3.11/threading.py", line 1045, in _bootstrap_inner
    self.run()
  File "/data/conda/aconda3/envs/glm4/lib/python3.11/threading.py", line 982, in run
    self._target(*self._args, **self._kwargs)
  File "/data/conda/aconda3/envs/glm4/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/data/conda/aconda3/envs/glm4/lib/python3.11/site-packages/transformers/generation/utils.py", line 1914, in generate
    result = self._sample(
             ^^^^^^^^^^^^^
  File "/data/conda/aconda3/envs/glm4/lib/python3.11/site-packages/transformers/generation/utils.py", line 2693, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: probability tensor contains either inf, nan or element < 0
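
The final frame shows `torch.multinomial` rejecting the `probs` tensor. As a rough illustration of the condition named in the message, the standard-library sketch below shows how a single infinite logit turns a whole softmax output into NaN, and how such entries can be detected. It is not PyTorch's implementation and does not identify the cause in this deployment; `softmax` and `invalid_entries` are hypothetical helper names.

```python
import math


def softmax(logits):
    """Plain softmax over a list of floats (max-subtracted for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def invalid_entries(probs):
    """Return indices of entries that are NaN, infinite, or negative."""
    return [
        i
        for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]


# Finite logits give a valid distribution with no invalid entries.
good = softmax([1.0, 2.0, 3.0])
print(invalid_entries(good))

# One infinite logit (for example after an overflow) makes inf - inf = NaN,
# and the NaN then spreads to every entry through the normalising sum.
bad = softmax([1.0, float("inf"), 3.0])
print(invalid_entries(bad))
```

In a real deployment, an equivalent check with `torch.isnan` and `torch.isinf` on the logits before sampling may help narrow down where the invalid values first appear.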
