xinference, version 0.15.1
docker run -d \
  --name xinference \
  -p 9997:9997 \
  xprobe/xinference:v0.15.1 \
  bash -c "xinference-local -H 0.0.0.0 --log-level debug & sleep 20 && xinference launch --model-name bge-reranker-large --model-type rerank --replica 2 && tail -f /dev/null"
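After starting the container with docker, the log shows that the model is being loaded. The model had already been downloaded earlier. It used to launch successfully, and I don't know why it no longer does.
Logs: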
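A side note on the start command: it waits a fixed `sleep 20` before calling `xinference launch`. A minimal sketch of polling the HTTP endpoint instead is shown here. The helper name `wait_until_ready` is hypothetical, and using `http://localhost:9997/v1/models` as the readiness URL is an assumption rather than something confirmed in this thread.

```python
import time
import urllib.request


def wait_until_ready(url, timeout_s=120.0, interval_s=1.0):
    """Poll url until it answers with HTTP 200; return True, or False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts and HTTP errors
            # (urllib's URLError and HTTPError are OSError subclasses).
            pass
        time.sleep(interval_s)
    return False


# Hypothetical usage (the URL is an assumption, not taken from the thread):
# wait_until_ready("http://localhost:9997/v1/models", timeout_s=120)
```

This only removes the guesswork around server startup; it does not address the hang during model loading itself.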
2024-10-08 19:48:58,643 xinference.core.supervisor 278 INFO Xinference supervisor 0.0.0.0:22390 started
2024-10-08 19:48:58,726 xinference.core.worker 278 INFO Starting metrics export server at 0.0.0.0:None
2024-10-08 19:48:58,728 xinference.core.worker 278 INFO Checking metrics export server...
2024-10-08 19:49:00,208 xinference.core.worker 278 INFO Metrics server is started at: http://0.0.0.0:43763
2024-10-08 19:49:00,209 xinference.core.worker 278 INFO Purge cache directory: /root/.xinference/cache
2024-10-08 19:49:00,210 xinference.core.supervisor 278 DEBUG [request 4ed14b44-856b-11ef-9106-00163e788596] Enter add_worker, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>,0.0.0.0:22390, kwargs:
2024-10-08 19:49:00,211 xinference.core.supervisor 278 DEBUG Worker 0.0.0.0:22390 has been added successfully
2024-10-08 19:49:00,211 xinference.core.supervisor 278 DEBUG [request 4ed14b44-856b-11ef-9106-00163e788596] Leave add_worker, elapsed time: 0 s
2024-10-08 19:49:00,211 xinference.core.worker 278 INFO Connected to supervisor as a fresh worker
2024-10-08 19:49:00,221 xinference.core.worker 278 INFO Xinference worker 0.0.0.0:22390 started
2024-10-08 19:49:00,224 xinference.core.supervisor 278 DEBUG Worker 0.0.0.0:22390 resources: {'cpu': ResourceStatus(usage=0.0, total=128, memory_used=37924220928, memory_available=2112589045760, memory_total=2164168032256), 'gpu-0': GPUStatus(mem_total=85899345920, mem_free=85168685056, mem_used=730660864)}
2024-10-08 19:49:03,639 xinference.core.supervisor 278 DEBUG Enter get_status, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>, kwargs:
2024-10-08 19:49:03,639 xinference.core.supervisor 278 DEBUG Leave get_status, elapsed time: 0 s
sleep finish
2024-10-08 19:49:04,771 xinference.api.restful_api 143 INFO Starting Xinference at endpoint: http://0.0.0.0:9997
2024-10-08 19:49:04,908 uvicorn.error 143 INFO Uvicorn running on http://0.0.0.0:9997 (Press CTRL+C to quit)
Launch model name: bge-reranker-large with kwargs: {}
2024-10-08 19:49:08,025 xinference.core.supervisor 278 DEBUG Enter launch_builtin_model, model_uid: bge-reranker-large, model_name: bge-reranker-large, model_size: , model_format: None, quantization: None, replica: 2, kwargs: {'trust_remote_code': True}
2024-10-08 19:49:08,025 xinference.core.worker 278 DEBUG Enter get_model_count, args: <xinference.core.worker.WorkerActor object at 0x7fe96d83b920>, kwargs:
2024-10-08 19:49:08,026 xinference.core.worker 278 DEBUG Leave get_model_count, elapsed time: 0 s
2024-10-08 19:49:08,026 xinference.core.worker 278 INFO [request 5379d170-856b-11ef-9106-00163e788596] Enter launch_builtin_model, args: <xinference.core.worker.WorkerActor object at 0x7fe96d83b920>, kwargs: model_uid=bge-reranker-large-2-0,model_name=bge-reranker-large,model_size_in_billions=None,model_format=None,quantization=None,model_engine=None,model_type=rerank,n_gpu=auto,request_limits=None,peft_model_config=None,gpu_idx=None,download_hub=None,model_path=None,trust_remote_code=True
2024-10-08 19:49:08,026 xinference.core.worker 278 DEBUG GPU selected: [0] for model bge-reranker-large-2-0
2024-10-08 19:49:12,385 xinference.model.rerank.core 278 DEBUG Rerank model bge-reranker-large found in ModelScope.
2024-10-08 19:50:06,123 xinference.core.supervisor 278 DEBUG [request 761ab046-856b-11ef-9106-00163e788596] Enter list_model_registrations, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>,LLM, kwargs: detailed=True
2024-10-08 19:50:06,219 xinference.core.supervisor 278 DEBUG [request 761ab046-856b-11ef-9106-00163e788596] Leave list_model_registrations, elapsed time: 0 s
2024-10-08 19:50:07,412 xinference.core.supervisor 278 DEBUG [request 76df73f4-856b-11ef-9106-00163e788596] Enter list_model_registrations, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>,rerank, kwargs: detailed=True
2024-10-08 19:50:07,414 xinference.core.supervisor 278 DEBUG [request 76df73f4-856b-11ef-9106-00163e788596] Leave list_model_registrations, elapsed time: 0 s
2024-10-08 19:50:12,033 xinference.core.supervisor 278 DEBUG [request 79a09a82-856b-11ef-9106-00163e788596] Enter list_models, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>, kwargs:
2024-10-08 19:50:12,034 xinference.core.worker 278 DEBUG [request 79a0a496-856b-11ef-9106-00163e788596] Enter list_models, args: <xinference.core.worker.WorkerActor object at 0x7fe96d83b920>, kwargs:
2024-10-08 19:50:12,034 xinference.core.worker 278 DEBUG [request 79a0a496-856b-11ef-9106-00163e788596] Leave list_models, elapsed time: 0 s
2024-10-08 19:50:12,034 xinference.core.supervisor 278 DEBUG [request 79a09a82-856b-11ef-9106-00163e788596] Leave list_models, elapsed time: 0 s
2024-10-08 19:50:14,236 xinference.core.supervisor 278 DEBUG [request 7af0a594-856b-11ef-9106-00163e788596] Enter list_model_registrations, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>,LLM, kwargs: detailed=True
2024-10-08 19:50:14,331 xinference.core.supervisor 278 DEBUG [request 7af0a594-856b-11ef-9106-00163e788596] Leave list_model_registrations, elapsed time: 0 s
2024-10-08 19:50:18,570 xinference.core.supervisor 278 DEBUG [request 7d85f480-856b-11ef-9106-00163e788596] Enter list_model_registrations, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>,embedding, kwargs: detailed=True
2024-10-08 19:50:18,578 xinference.core.supervisor 278 DEBUG [request 7d85f480-856b-11ef-9106-00163e788596] Leave list_model_registrations, elapsed time: 0 s
2024-10-08 19:50:24,968 xinference.core.supervisor 278 DEBUG Enter launch_builtin_model, model_uid: bce-embedding-base_v1, model_name: bce-embedding-base_v1, model_size: , model_format: None, quantization: None, replica: 1, kwargs: {}
2024-10-08 19:50:24,969 xinference.core.worker 278 DEBUG Enter get_model_count, args: <xinference.core.worker.WorkerActor object at 0x7fe96d83b920>, kwargs:
2024-10-08 19:50:24,969 xinference.core.worker 278 DEBUG Leave get_model_count, elapsed time: 0 s
2024-10-08 19:50:24,969 xinference.core.worker 278 INFO [request 8156756c-856b-11ef-9106-00163e788596] Enter launch_builtin_model, args: <xinference.core.worker.WorkerActor object at 0x7fe96d83b920>, kwargs: model_uid=bce-embedding-base_v1-1-0,model_name=bce-embedding-base_v1,model_size_in_billions=None,model_format=None,quantization=None,model_engine=None,model_type=embedding,n_gpu=auto,request_limits=None,peft_model_config=None,gpu_idx=None,download_hub=None,model_path=None
2024-10-08 19:50:24,970 xinference.core.worker 278 DEBUG GPU selected: [0] for model bce-embedding-base_v1-1-0
2024-10-08 19:50:29,341 xinference.model.embedding.core 278 DEBUG Embedding model bce-embedding-base_v1 found in ModelScope.
2024-10-08 19:52:59,824 xinference.core.supervisor 278 DEBUG [request dda361cc-856b-11ef-9106-00163e788596] Enter list_models, args: <xinference.core.supervisor.SupervisorActor object at 0x7fe96d83b970>, kwargs:
2024-10-08 19:52:59,824 xinference.core.worker 278 DEBUG [request dda36b22-856b-11ef-9106-00163e788596] Enter list_models, args: <xinference.core.worker.WorkerActor object at 0x7fe96d83b920>, kwargs:
2024-10-08 19:52:59,824 xinference.core.worker 278 DEBUG [request dda36b22-856b-11ef-9106-00163e788596] Leave list_models, elapsed time: 0 s
2024-10-08 19:52:59,824 xinference.core.supervisor 278 DEBUG [request dda361cc-856b-11ef-9106-00163e788596] Leave list_models, elapsed time: 0 s
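In the log above, the two worker-side `launch_builtin_model` requests (`5379d170-…` for the rerank model and `8156756c-…` for the embedding model) have an `Enter` line but no matching `Leave` line, which is where the launch appears to stall. A small sketch for spotting such requests in a debug log follows. The function name `unfinished_requests` is hypothetical, and the regular expression assumes the `[request <id>] Enter/Leave <function>` format seen in this particular log.

```python
import re

# Matches e.g. "[request 5379d170-...] Enter launch_builtin_model"
ENTER_OR_LEAVE = re.compile(
    r"\[request\s+(?P<rid>[0-9a-f-]+)\]\s+(?P<kind>Enter|Leave)\s+(?P<func>\w+)"
)


def unfinished_requests(log_text):
    """Return {request_id: function_name} for each Enter without a later Leave."""
    pending = {}
    for m in ENTER_OR_LEAVE.finditer(log_text):
        if m["kind"] == "Enter":
            pending[m["rid"]] = m["func"]
        else:
            # A Leave closes the request; ignore Leaves we never saw enter.
            pending.pop(m["rid"], None)
    return pending
```

Because it scans the whole text rather than individual lines, it also works on logs whose line breaks were lost when pasted.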
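Expected behavior: the model launches successfully, or, if something goes wrong during launch, it fails with an error right away instead of hanging.
Thanks.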
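Until the hang itself is understood, one client-side workaround is to put a deadline on the launch command so that a stall turns into an error. The sketch below is only that: `run_with_deadline` is a hypothetical helper built on Python's `subprocess.run`, not an Xinference feature, and the 300-second limit in the usage comment is an arbitrary choice.

```python
import subprocess
import sys


def run_with_deadline(cmd, timeout_s):
    """Run cmd and return its exit code; raise TimeoutError if it runs too long."""
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child process before re-raising.
        raise TimeoutError(f"{cmd[0]} did not finish within {timeout_s}s") from exc
    return completed.returncode


# Hypothetical usage with the launch command from this issue:
# run_with_deadline(
#     ["xinference", "launch", "--model-name", "bge-reranker-large",
#      "--model-type", "rerank", "--replica", "2"],
#     timeout_s=300,
# )
```

Note that this only stops the CLI client; whether the server-side launch request is also cancelled is not something this sketch can guarantee.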
Has this been resolved? I'm going to be using an A800 card as well, and I've already seen two people open issues about model launch problems.
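In my own testing the cause was mainly network problems. Passing an explicit model path when launching the model makes this much less likely, for example: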
xinference launch --model-name bge-reranker-large --model-type rerank --replica 3 --model_path /root/.cache/modelscope/hub/Xorbits/bge-reranker-large
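If the launch is scripted, it may help to confirm that the local model directory exists before invoking the CLI, so that a wrong path fails fast instead of falling back to a download. The following is a minimal sketch: `build_launch_command` is a hypothetical helper, and the `--model_path` spelling is simply copied from the command above.

```python
from pathlib import Path


def build_launch_command(model_name, model_type, replica, model_path):
    """Return the argv for `xinference launch`, refusing a missing model directory."""
    path = Path(model_path)
    if not path.is_dir():
        raise FileNotFoundError(f"model directory not found: {path}")
    return [
        "xinference", "launch",
        "--model-name", model_name,
        "--model-type", model_type,
        "--replica", str(replica),
        "--model_path", str(path),  # flag spelling as used in this thread
    ]


# Hypothetical usage with the path from the comment above:
# cmd = build_launch_command(
#     "bge-reranker-large", "rerank", 3,
#     "/root/.cache/modelscope/hub/Xorbits/bge-reranker-large",
# )
```

The returned list can be passed to `subprocess.run` as-is.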
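Oh, okay. I just asked around again, and it seems nobody else has run into this problem; everything works for them. I'm planning to deploy qwen2.5-72b-instruct with vllm on an A800.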
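If you are running a chat model like qwen2.5-72b-instruct, I'd suggest trying something like ollama, which is more mature; the model is large enough that a single A800 can only host one instance. In my use case, I use xinference mainly because it supports embedding/rerank models well.
Just for reference.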
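To put rough numbers on the model-size point, here is a back-of-the-envelope estimate of the weight footprint. It assumes that "72b" means about 72 billion parameters and counts the weights only, ignoring KV cache, activations and runtime overhead, so real usage will be higher. The GPU total is the `mem_total` value from the log in this issue.

```python
GIB = 1024 ** 3

# Total GPU memory reported in the issue's log (GPUStatus.mem_total), in bytes.
GPU_TOTAL_BYTES = 85899345920


def weight_gib(params_billion, bits_per_param):
    """Approximate size of the model weights alone, in GiB."""
    return params_billion * 1e9 * bits_per_param / 8 / GIB


for bits in (16, 8, 4):
    size = weight_gib(72, bits)
    fits = size < GPU_TOTAL_BYTES / GIB
    print(f"{bits}-bit weights: {size:.1f} GiB, fits on one 80 GiB card: {fits}")
```

Under these assumptions, 16-bit weights come to roughly 134 GiB, which is more than the 80 GiB reported for the card, while 8-bit and 4-bit weights come to roughly 67 GiB and 34 GiB.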