MING-7B HTTP API timeout #28

Open
coldy1992 opened this issue May 20, 2024 · 3 comments

Comments

@coldy1992

After successfully starting the controller, the worker, and the API server (the API listens on port 21003), I called the /v1/chat/completions endpoint:
curl --location --request POST 'http://localhost:21003/v1/chat/completions' --header 'Content-Type: application/json' --data-raw '{
"model": "MING-7B",
"messages": [
{"role": "user", "content": "Hello!"}
],
"n": 1
}'
The call fails with what looks like a timeout. Traceback:
Traceback (most recent call last):
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_transports/default.py", line 69, in map_httpcore_exceptions
yield
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
raise exc from None
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
response = await connection.handle_async_request(
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/connection.py", line 101, in handle_async_request
return await self._connection.handle_async_request(request)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/http11.py", line 143, in handle_async_request
raise exc
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/http11.py", line 113, in handle_async_request
) = await self._receive_response_headers(**kwargs)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/http11.py", line 186, in _receive_response_headers
event = await self._receive_event(timeout=timeout)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_async/http11.py", line 224, in _receive_event
data = await self._network_stream.read(
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_backends/anyio.py", line 37, in read
return b""
File "/data/miniconda3/envs/ming-7b/lib/python3.9/contextlib.py", line 137, in __exit__
self.gen.throw(typ, value, traceback)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ReadTimeout

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 411, in run_asgi
result = await app( # type: ignore[func-returns-value]
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 69, in __call__
return await self.app(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__
await super().__call__(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__
await self.middleware_stack(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__
raise exc
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__
await self.app(scope, receive, _send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__
await self.middleware_stack(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/routing.py", line 776, in app
await route.handle(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle
await self.app(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/routing.py", line 77, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
raise exc
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
await app(scope, receive, sender)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/starlette/routing.py", line 72, in app
response = await func(request)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/fastapi/routing.py", line 278, in app
raw_response = await run_endpoint_function(
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/fastapi/routing.py", line 191, in run_endpoint_function
return await dependant.call(**values)
File "/data/apps/ming/src/fastchat/serve/api.py", line 52, in create_chat_completion
content = await chat_completion(request.model, payload, skip_echo_len)
File "/data/apps/ming/src/fastchat/serve/api.py", line 135, in chat_completion
async with client.stream("POST", worker_addr + "/worker_generate_stream",
File "/data/miniconda3/envs/ming-7b/lib/python3.9/contextlib.py", line 181, in __aenter__
return await self.gen.__anext__()
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_client.py", line 1617, in stream
response = await self.send(
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_client.py", line 1661, in send
response = await self._send_handling_auth(
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_client.py", line 1689, in _send_handling_auth
response = await self._send_handling_redirects(
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_client.py", line 1726, in _send_handling_redirects
response = await self._send_single_request(request)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_client.py", line 1763, in _send_single_request
response = await transport.handle_async_request(request)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_transports/default.py", line 373, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/contextlib.py", line 137, in __exit__
self.gen.throw(typ, value, traceback)
File "/data/miniconda3/envs/ming-7b/lib/python3.9/site-packages/httpx/_transports/default.py", line 86, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout
What causes this kind of timeout? Has anyone run into it before?
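For readers following the traceback: the failure is raised inside `_receive_response_headers`, so the API server had already sent its request to the worker's `/worker_generate_stream` endpoint and was still waiting for the first bytes of the response when the read timeout expired. httpx applies a 5-second timeout by default unless the client is configured otherwise; whether `api.py` overrides it is not shown in this thread. The sketch below reproduces that situation with the standard library only. `silent_worker` and `call_worker` are hypothetical names used for illustration, not code from this project.

```python
import asyncio


async def silent_worker(reader, writer):
    """Hypothetical stand-in for a worker that accepts a request but never replies."""
    await reader.read()  # wait until the client gives up and closes the connection
    writer.close()


async def call_worker(host, port, read_timeout):
    """Send a request, then wait at most read_timeout seconds for the status line."""
    reader, writer = await asyncio.open_connection(host, port)
    writer.write(b"POST /worker_generate_stream HTTP/1.1\r\nHost: worker\r\n\r\n")
    await writer.drain()
    try:
        await asyncio.wait_for(reader.readline(), timeout=read_timeout)
        return "response"
    except asyncio.TimeoutError:
        # Same condition that httpx reports as ReadTimeout: connected, but no data.
        return "read timeout"
    finally:
        writer.close()
        await writer.wait_closed()


async def main():
    # Port 0 asks the OS for any free port, so the sketch does not collide with real services.
    server = await asyncio.start_server(silent_worker, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        return await call_worker("127.0.0.1", port, read_timeout=0.2)
    finally:
        server.close()


result = asyncio.run(main())
print(result)
```

If the worker is healthy but slow, raising the client's read timeout may be enough. If the worker errors out before writing anything, the timeout is only a symptom and the worker's own log is the place to look.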

@coldy1992
Author

@BlueZeros could you please help with this?

@BlueZeros
Collaborator

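@coldy1992 Hi, sorry, I haven't tried calling the model through the API either. If you are using MING-7b or MING-1.8b, though, you can use the scripts provided by fastchat. I will try to add this feature in the next version of the code :)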

@coldy1992
Author

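> @coldy1992 Hi, sorry, I haven't tried calling the model through the API either. If you are using MING-7b or MING-1.8b, though, you can use the scripts provided by fastchat. I will try to add this feature in the next version of the code :)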

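@BlueZeros Hi, I suspect there is a bug in the fastchat code that could cause both API calls and the test_message script to fail. Could you confirm?
Branch: main
Code: line 130 of fastchat/serve/model_worker.py calls the generate_stream function defined at line 101 of /fastchat/serve/inference.py. The first five parameters of generate_stream() are required, so the fifth argument passed at line 130 of model_worker.py, context_len, is interpreted by generate_stream() as beam_size, which causes the error.
Please review.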
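The kind of mismatch described above can be illustrated with a minimal sketch. The `generate_stream` below is a hypothetical stand-in: only the facts that the fifth required parameter is `beam_size` and that `context_len` is passed in fifth position come from this thread. The remaining parameter names, their order, and the default values are assumptions. `inspect.signature(...).bind` is used to show which parameter each argument lands on without running a model.

```python
import inspect


# Hypothetical stand-in for generate_stream(); the real parameter list may differ.
def generate_stream(model, tokenizer, params, device, beam_size,
                    context_len=2048, stream_interval=2):
    return None


def bound_arguments(*args, **kwargs):
    """Return a dict showing which parameter each argument is bound to."""
    bound = inspect.signature(generate_stream).bind(*args, **kwargs)
    bound.apply_defaults()
    return dict(bound.arguments)


context_len = 4096  # assumed value, chosen to differ from the assumed default

# Positional call in the style described above: the fifth value lands in beam_size.
wrong = bound_arguments("model", "tokenizer", {"prompt": "Hello!"}, "cuda", context_len)

# Keyword call: each value is bound by name, so position no longer matters.
right = bound_arguments("model", "tokenizer", {"prompt": "Hello!"}, "cuda",
                        beam_size=1, context_len=context_len)

print(wrong["beam_size"], wrong["context_len"])   # 4096 2048
print(right["beam_size"], right["context_len"])   # 1 4096
```

Passing `context_len` and similar options by keyword at the call site is one way to make such a call robust to parameters being added or reordered; whether that is the right fix for this repository is for the maintainers to decide.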
