[Bug]:query error JSONDecodeError after import more text. No error when less text #739

Open · 2 tasks done
goodmaney opened this issue Jul 26, 2024 · 3 comments
Labels: bug (Something isn't working), triage (Default label assignment; indicates a new issue that needs review by a maintainer)

goodmaney commented Jul 26, 2024

Is there an existing issue for this?

  • I have searched the existing issues
  • I have checked #657 to validate if my issue is covered by community support

Describe the bug

Importing one book, graphrag initialization and querying both succeeded. Importing larger txt files, initialization succeeded but querying reports json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Error parsing search response json
Traceback (most recent call last):
File "/home/xx/anaconda3/envs/graph2/lib/python3.12/site-packages/graphrag/query/structured_search/global_search/search.py", line 194, in _map_response_single_batch
processed_response = self.parse_search_response(search_response)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xx/anaconda3/envs/graph2/lib/python3.12/site-packages/graphrag/query/structured_search/global_search/search.py", line 232, in parse_search_response
parsed_elements = json.loads(search_response)["points"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xx/anaconda3/envs/graph2/lib/python3.12/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xx/anaconda3/envs/graph2/lib/python3.12/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/xx/anaconda3/envs/graph2/lib/python3.12/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
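This traceback means the text handed to `json.loads` did not start with JSON at all: the model's map-stage reply was prose, empty, or JSON wrapped in a Markdown code fence, which is common when `model_supports_json: false` and the model only loosely follows the "respond in JSON" prompt instruction. As a minimal sketch (not GraphRAG's API; `parse_llm_json` is a hypothetical helper), a more tolerant parser could try the raw text, then a fenced block, then the outermost `{...}` span:

```python
import json
import re

def parse_llm_json(text: str) -> dict:
    """Best-effort extraction of a JSON object from an LLM reply.

    Handles raw JSON, JSON inside a Markdown code fence, and JSON
    embedded in surrounding prose. Raises ValueError if nothing parses.
    """
    # 1. Try the reply as-is.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # 2. Strip a Markdown code fence if one is present.
    fenced = re.search(r"```(?:json)?\s*(.*?)```", text, re.DOTALL)
    if fenced:
        try:
            return json.loads(fenced.group(1))
        except json.JSONDecodeError:
            pass

    # 3. Fall back to the outermost {...} span in the text.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        return json.loads(text[start:end + 1])

    raise ValueError("no JSON object found in model reply")
```

A patch along these lines inside `parse_search_response` (or switching to a model that honors `model_supports_json: true`) would avoid crashing on a fenced or chatty reply.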

Steps to reproduce

LLM model: glm4-chat
Embedding model: bce-embedding-base-v1

Import this book; initialization and querying succeed:
book.txt

Import these books; querying then reports JSONDecodeError:
1804.07821v1.txt
2101.03961v3.txt
thinkos.txt

Expected Behavior

No response

GraphRAG Config Used

encoding_model: cl100k_base
skip_workflows: []
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: glm4-chat-test
  model_supports_json: false
  api_base: http://127.0.0.1:9997/v1

parallelization:
  stagger: 0.3

async_mode: threaded

embeddings:
  async_mode: threaded
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: bce-embedding-basev1
    api_base: http://127.0.0.1:9998/v1

chunks:
  size: 300
  overlap: 100
  group_by_columns: [id]

input:
  type: file
  file_type: text
  base_dir: "input"
  file_encoding: utf-8
  file_pattern: ".*\.txt$"

cache:
  type: file
  base_dir: "cache"

storage:
  type: file
  base_dir: "output/${timestamp}/artifacts"

reporting:
  type: file
  base_dir: "output/${timestamp}/reports"

entity_extraction:
  prompt: "prompts/entity_extraction.txt"
  entity_types: [organization, person, geo, event]
  max_gleanings: 0

summarize_descriptions:
  prompt: "prompts/summarize_descriptions.txt"
  max_length: 500

claim_extraction:
  prompt: "prompts/claim_extraction.txt"
  description: "Any claims or facts that could be relevant to information discovery."
  max_gleanings: 0

community_report:
  prompt: "prompts/community_report.txt"
  max_length: 2000
  max_input_length: 8000

cluster_graph:
  max_cluster_size: 10

embed_graph:
  enabled: false

umap:
  enabled: false

snapshots:
  graphml: false
  raw_entities: false
  top_level_nodes: false

local_search:

global_search:

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 0.1.1
  • Operating System: WSL2 Ubuntu 22.04
  • Python Version: 3.12.4
  • Related Issues:
@goodmaney goodmaney added bug Something isn't working triage Default label assignment, indicates new issue needs reviewed by a maintainer labels Jul 26, 2024
@goodmaney goodmaney changed the title [Bug]:query error JSONDecodeError after import more text. Not error when less text [Bug]:query error JSONDecodeError after import more text. No error when less text Jul 27, 2024
BaronHsu commented Aug 3, 2024


Hi, did you solve the question? This worked for me: search prompt.
It seems you aren't using ollama, but you can still give it a try!
Hope it helps.

goodmaney (Author) commented


Thanks, but it didn't work. I changed the model to llama3.1 with temperature 0.3; it responds with some content after reporting the JSON error, but not reliably. If I initialize again, it may not respond at all and just error.
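Since the failures here are intermittent rather than deterministic (the same index sometimes answers, sometimes errors), one workaround is to retry the query when a reply fails to parse. A sketch, under the assumption that `ask_model` is a hypothetical callable wrapping whatever client actually sends the LLM request (this is not a GraphRAG API):

```python
import json

def query_with_retry(ask_model, prompt: str, attempts: int = 3) -> dict:
    """Call the model and retry when the reply is not valid JSON.

    ask_model is a hypothetical callable(prompt) -> str standing in
    for whatever client actually issues the request.
    """
    last_err = None
    for _ in range(attempts):
        reply = ask_model(prompt)
        try:
            return json.loads(reply)
        except json.JSONDecodeError as err:
            last_err = err  # malformed reply; ask again
    raise RuntimeError(f"no valid JSON after {attempts} attempts") from last_err
```

Retrying only papers over the symptom; a model that reliably emits JSON (or a tolerant parser) is the real fix.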

gudehhh666 commented

Hello!
Have you solved it now?
When I use text-embedding-3-large to generate_text_embeddings, I encounter the same errors, and I wonder how to solve it.
