[Bug]: Updating index twice will result in a ValueError #1822

@masies

Do you need to file an issue?

  • I have searched the existing issues and this bug is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate bug, not just a question. If this is a question, please use the Discussions area.

Describe the bug

Updating the index a second time (output written to update_output) throws:
ValueError: Could not find update_output/YYYYMMDD-HHMMSS/delta/communities.parquet in storage!

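This appears to be related to how the storage key is built.

key = self._keyname(key) becomes (in my case) 'output\update_output\YYYYMMDD-HHMMSS\delta\communities.parquet', with an extra "output" prefix and backslash separators, so it does not match the path in the error message.
I suspect this is left over from the logic of the previous version.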
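
To illustrate the kind of mismatch I mean, here is a minimal sketch, not GraphRAG's actual code. The `blobs` dict and both helper functions are hypothetical stand-ins; the only assumption is that blob names use forward slashes while the key is joined with Windows path rules. It does not cover the extra "output" prefix, only the separator difference.

```python
import ntpath
from pathlib import PureWindowsPath

# Hypothetical stand-in for a blob container: blob names use "/" separators.
blobs = {"update_output/YYYYMMDD-HHMMSS/delta/communities.parquet": b"..."}


def windows_style_key(*parts: str) -> str:
    """Join parts the way os.path.join does on Windows (backslashes)."""
    return ntpath.join(*parts)


def blob_style_key(key: str) -> str:
    """Convert backslash separators to the forward slashes blob names use."""
    return PureWindowsPath(key).as_posix()


raw = windows_style_key(
    "update_output", "YYYYMMDD-HHMMSS", "delta", "communities.parquet"
)
print(raw in blobs)                  # False: backslash key matches no blob name
print(blob_style_key(raw) in blobs)  # True: normalized key matches
```

If something like this is happening, normalizing the key before the lookup (or building it with POSIX-style joins) might be enough for the separator part, but I have not verified this against the real storage class.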

Steps to reproduce

  1. Initialize the GraphRAG root directory.
  2. Create the index.
  3. Add a file and update the index.
  4. Add another file and update the index again.

Expected Behavior

The second update should complete normally, with the documents added in steps 3 and 4 both present in the index.

GraphRAG Config Used

...
update_index_output:
    type: blob 
    provider: azure
    storage_account_blob_url: ${BLOB_STORAGE_URL}
    container_name: ${ROOT_DIR}
    base_dir: "update_output"
...

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 2.1.0
  • Operating System: Windows 11
  • Python Version: 3.14
  • Related Issues:

Labels: bug, triage
