[Bug]: infiniflow/ragflow:v0.14.0 BAICHUAN embedding model trigger exception #3657

Open
1 task done
neo-workship opened this issue Nov 26, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@neo-workship

Is there an existing issue for the same bug?

  • I have checked the existing issues.

RAGFlow workspace code commit ID

infiniflow/ragflow:v0.14.0 image

RAGFlow image version

v0.14.0 full

Other environment information

install environment: Docker Desktop
ragflow version: v0.14.0 full

Actual behavior

Tested with three embedding models: 百川 (Baichuan) Baichuan-Text-Embedding, Ollama nomic-embed-text, and Tongyi Text-Embedding-v3.

When parsing Chinese text with the Knowledge Graph method, the problem occurs only with Baichuan-Text-Embedding. (Baichuan's interface is compatible with the OpenAI API, and the same embedding model works when I use graphrag on its own.)

Expected behavior

image

log

2024-11-26 16:40:52 2024-11-26 16:40:52,640 WARNING  17 /ragflow/.venv/lib/python3.10/site-packages/networkx/readwrite/json_graph/node_link.py:142: FutureWarning: 
2024-11-26 16:40:52 The default value will be `edges="edges" in NetworkX 3.6.
2024-11-26 16:40:52 
2024-11-26 16:40:52 To make this warning go away, explicitly set the edges kwarg, e.g.:
2024-11-26 16:40:52 
2024-11-26 16:40:52   nx.node_link_data(G, edges="links") to preserve current behavior, or
2024-11-26 16:40:52   nx.node_link_data(G, edges="edges") for forward compatibility.
2024-11-26 16:40:52   warnings.warn(
2024-11-26 16:40:52 
2024-11-26 16:40:52 2024-11-26 16:40:52,973 INFO     17 HTTP Request: POST https://api.deepseek.com/v1/chat/completions "HTTP/1.1 200 OK"

............

2024-11-26 16:41:26 openai.BadRequestError: Error code: 400 - {'error': {'code': None, 'param': None, 'type': 'invalid_request_error', 'message': 'Input batch list exceeds the limit of 16.'}}
2024-11-26 16:41:26 2024-11-26 16:41:26,051 ERROR    17 handle_task got exception for task {"id": "431446bcabd111ef999b0242ac120006", "doc_id": "4147acacabd111ef8f5c0242ac120006", "from_page": 0, "to_page": 100000000, "retry_count": 0, "kb_id": "2e0a28feabd111ef8abc0242ac120006", "parser_id": "knowledge_graph", "parser_config": {"entity_types": ["organization", "person", "location", "event", "time"], "chunk_token_num": 8192, "delimiter": "\\n!?;\u3002\uff1b\uff01\uff1f", "layout_recognize": true}, "name": "hd.txt", "type": "doc", "location": "hd.txt", "size": 1826, "tenant_id": "01a5f070abce11efbf5a0242ac120006", "language": "Chinese", "embd_id": "Baichuan-Text-Embedding@BaiChuan", "img2txt_id": "qwen-vl-max@Tongyi-Qianwen", "asr_id": "paraformer-realtime-8k-v1@Tongyi-Qianwen", "llm_id": "deepseek-chat@DeepSeek", "update_time": 1732610073371}
2024-11-26 16:41:26 Traceback (most recent call last):
2024-11-26 16:41:26   File "/ragflow/rag/svr/task_executor.py", line 448, in handle_task
2024-11-26 16:41:26     do_handle_task(task)
2024-11-26 16:41:26   File "/ragflow/rag/svr/task_executor.py", line 402, in do_handle_task
2024-11-26 16:41:26     tk_count, vector_size = embedding(cks, embd_mdl, r["parser_config"], callback)
2024-11-26 16:41:26   File "/ragflow/rag/svr/task_executor.py", line 306, in embedding
2024-11-26 16:41:26     vts, c = mdl.encode(cnts[i: i + batch_size])
2024-11-26 16:41:26   File "/ragflow/api/db/services/llm_service.py", line 209, in encode
2024-11-26 16:41:26     emd, used_tokens = self.mdl.encode(texts, batch_size)
2024-11-26 16:41:26   File "/ragflow/rag/llm/embedding_model.py", line 106, in encode
2024-11-26 16:41:26     res = self.client.embeddings.create(input=texts,
2024-11-26 16:41:26   File "/ragflow/.venv/lib/python3.10/site-packages/openai/resources/embeddings.py", line 125, in create
2024-11-26 16:41:26     return self._post(
2024-11-26 16:41:26   File "/ragflow/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1260, in post
2024-11-26 16:41:26     return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
2024-11-26 16:41:26   File "/ragflow/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 937, in request
2024-11-26 16:41:26     return self._request(
2024-11-26 16:41:26   File "/ragflow/.venv/lib/python3.10/site-packages/openai/_base_client.py", line 1041, in _request
2024-11-26 16:41:26     raise self._make_status_error_from_response(err.response) from None
2024-11-26 16:41:26 openai.BadRequestError: Error code: 400 - {'error': {'code': None, 'param': None, 'type': 'invalid_request_error', 'message': 'Input batch list exceeds the limit of 16.'}}
2024-11-26 16:41:32 2024-11-26 16:41:32,581 INFO     17 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-11-26T16:41:32.580988", "boot_at": "2024-11-26T16:10:31.078836", "pending": 0, "lag": 0, "done": 1, "failed": 1, "current": null}
2024-11-26 16:41:37 2024-11-26 16:41:37,252 INFO     18 172.18.0.6 - - [26/Nov/2024 16:41:37] "GET /v1/document/list?kb_id=2e0a28feabd111ef8abc0242ac120006&keywords=&page_size=10&page=1 HTTP/1.1" 200 -
2024-11-26 16:42:02 2024-11-26 16:42:02,612 INFO     17 task_consumer_0 reported heartbeat: {"name": "task_consumer_0", "now": "2024-11-26T16:42:02.611833", "boot_at": "2024-11-26T16:10:31.078836", "pending": 0, "lag": 0, "done": 1, "failed": 1, "current": null}

Steps to reproduce

- Select the Knowledge Graph method.
- Select an embedding model: Baichuan-Text-Embedding, Ollama nomic-embed-text, or Tongyi text-embedding-v3.
- Only Baichuan-Text-Embedding triggers the error. Its API is compatible with the OpenAI API, and it works without problems when used with graphrag alone.
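
The 400 response in the log ("Input batch list exceeds the limit of 16.") suggests that a single embeddings request carried more than 16 inputs. Below is a minimal sketch of a possible workaround: split the inputs into slices of at most 16 before each request. This is not RAGFlow's actual code. The limit of 16 is taken from the error message only, and `encode_in_batches`, `embed_fn`, and `fake_embed` are hypothetical names introduced for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: 16 comes from the error message in the log
# ("Input batch list exceeds the limit of 16."), not from provider docs.
MAX_INPUTS_PER_REQUEST = 16


def encode_in_batches(
    texts: Sequence[str],
    embed_fn: Callable[[List[str]], Tuple[List[List[float]], int]],
    max_batch: int = MAX_INPUTS_PER_REQUEST,
) -> Tuple[List[List[float]], int]:
    """Call embed_fn on slices of at most max_batch texts and merge the results.

    embed_fn is a hypothetical stand-in for one embeddings request; it
    returns (vectors, tokens_used) for the texts it was given.
    """
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    vectors: List[List[float]] = []
    total_tokens = 0
    for start in range(0, len(texts), max_batch):
        batch = list(texts[start:start + max_batch])
        batch_vectors, used = embed_fn(batch)
        vectors.extend(batch_vectors)  # keep vectors in input order
        total_tokens += used
    return vectors, total_tokens


def fake_embed(batch: List[str]) -> Tuple[List[List[float]], int]:
    """Offline stand-in that rejects oversized batches, like the API in the log."""
    if len(batch) > MAX_INPUTS_PER_REQUEST:
        raise ValueError("Input batch list exceeds the limit of 16.")
    # One-dimensional dummy vector per text; token count is faked as text length.
    return [[float(len(t))] for t in batch], sum(len(t) for t in batch)
```

With a real OpenAI-compatible client, `embed_fn` could wrap a call such as `client.embeddings.create(model=..., input=batch)`, as seen in the traceback. Whether the caller or the model wrapper should enforce the cap is a design choice left to the maintainers.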

Additional information

No response

@neo-workship neo-workship added the bug Something isn't working label Nov 26, 2024
@JinHai-CN JinHai-CN changed the title [Bug]: infiniflow/ragflow:v0.14.0 百川embedding模型触发异常 [Bug]: infiniflow/ragflow:v0.14.0 BAICHUAN embedding model trigger exception Nov 26, 2024