Add Oracle database as all type of storage (KV/vector/graph) #257

Merged
merged 17 commits into from
Nov 12, 2024

Conversation

jin38324
Contributor

@jin38324 jin38324 commented Nov 12, 2024

This PR introduces Oracle Database support to the LightRAG project, covering key-value (KV), vector, and graph storage.

The integration builds on Oracle's multi-model query capabilities, which let users work with relational, JSON, and graph data in the same database. In particular, Oracle Database 23ai's vector support, together with its JSON and graph querying, makes LightRAG easier to integrate in enterprise environments that already rely on Oracle infrastructure.

Contributor

@jin38324 @tmuife

Key Features

  • Unified Knowledge Base Management: All documents are stored in a single table, with the WORKSPACE field distinguishing between different knowledge bases. This eliminates the need to create separate tables for each knowledge base, simplifying data management.

  • Key-Value Storage Integration: LightRAG now supports Oracle Database as its key-value storage backend. Although Oracle handles JSON data efficiently, this PR stores KV data in relational tables: full documents go in the LIGHTRAG_DOC_FULL table and chunks in the LIGHTRAG_DOC_CHUNKS table, which keeps the structure simple and lookups fast.

  • Vector Storage Integration: Leveraging Oracle 23ai’s vector capabilities, LightRAG enables high-performance vectorized text retrieval, supporting similarity searches for more precise, context-aware answers. Vectors are stored directly in table columns, without requiring separate databases or tables, enhancing ease of use and efficiency.

  • Graph Storage Integration: Oracle's advanced graph storage enables LightRAG to manage complex entity relationships, supporting richer, more contextually relevant responses through its two-tier retrieval framework. Entity nodes and edges are stored in LIGHTRAG_GRAPH_NODES and LIGHTRAG_GRAPH_EDGES tables, and graph data can be queried directly with SQL through a GRAPH view.
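To make the single-table design concrete, here is a minimal sketch of workspace-scoped chunk retrieval. It is not the code from this PR: it uses Python's built-in sqlite3 as a stand-in for Oracle so it runs anywhere, the column names (id, workspace, content, content_vector) and the helper functions are assumptions for illustration, and the vector is stored as a JSON string where Oracle 23ai would use a VECTOR column and rank rows in SQL with VECTOR_DISTANCE.

```python
import json
import math
import sqlite3

# Stand-in schema. Only the table name comes from the PR description;
# every column name here is an assumption made for this sketch.
SCHEMA = """
CREATE TABLE LIGHTRAG_DOC_CHUNKS (
    id TEXT NOT NULL,
    workspace TEXT NOT NULL,        -- distinguishes knowledge bases
    content TEXT NOT NULL,
    content_vector TEXT NOT NULL,   -- JSON array; a VECTOR column in Oracle 23ai
    PRIMARY KEY (workspace, id)
)
"""


def cosine_distance(a, b):
    """Return 1 - cosine similarity. Assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def insert_chunk(conn, workspace, chunk_id, content, vector):
    """Store one chunk and its embedding in the shared table."""
    conn.execute(
        "INSERT INTO LIGHTRAG_DOC_CHUNKS VALUES (?, ?, ?, ?)",
        (chunk_id, workspace, content, json.dumps(vector)),
    )


def top_k_chunks(conn, workspace, query_vector, k=2):
    """Return the ids of the k chunks closest to query_vector in one workspace."""
    rows = conn.execute(
        "SELECT id, content_vector FROM LIGHTRAG_DOC_CHUNKS WHERE workspace = ?",
        (workspace,),
    ).fetchall()
    # Ranking happens in Python here; a database with native vector
    # support would do this step inside the query itself.
    scored = sorted(
        (cosine_distance(query_vector, json.loads(vec)), chunk_id)
        for chunk_id, vec in rows
    )
    return [chunk_id for _, chunk_id in scored[:k]]


def build_demo():
    """Create an in-memory table holding two hypothetical knowledge bases."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    insert_chunk(conn, "kb_finance", "c1", "Quarterly revenue grew.", [1.0, 0.0])
    insert_chunk(conn, "kb_finance", "c2", "Costs were flat.", [0.6, 0.8])
    insert_chunk(conn, "kb_legal", "c1", "Contract terms.", [1.0, 0.0])
    return conn


if __name__ == "__main__":
    demo = build_demo()
    print(top_k_chunks(demo, "kb_finance", [1.0, 0.0]))  # ['c1', 'c2']
```

Because every query filters on the workspace column, two knowledge bases can reuse the same chunk id without colliding, and adding a new knowledge base needs no new table.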

Importance of This Change

Integrating Oracle Database aligns with LightRAG's objective of expanding RAG systems to handle complex data structures and interdependencies. Oracle's robust data management features ensure that LightRAG maintains high performance, flexibility, and security, even at scale.

Additional Information

Backward Compatibility: This enhancement is fully compatible with existing storage setups, allowing Oracle to be added as an optional backend without impacting current configurations.

Testing: Extensive tests have been implemented to validate Oracle’s performance and functionality as a KV, vector, and graph storage backend across diverse retrieval scenarios.

Example: Sample code demonstrating the use of Oracle Database as a storage solution is provided in the examples.

@LarFii
Collaborator

LarFii commented Nov 12, 2024

Thanks for your excellent contribution, but there are some linting errors. Please make sure to run pre-commit run --all-files before submitting to ensure all linting checks pass.

@jin38324
Contributor Author

Thanks for your excellent contribution, but there are some linting errors. Please make sure to run pre-commit run --all-files before submitting to ensure all linting checks pass.

The pre-commit issues have been fixed.

@LarFii LarFii merged commit 5a1e657 into HKUDS:main Nov 12, 2024
1 check passed
@LarFii
Collaborator

LarFii commented Nov 12, 2024

Thanks. I have merged it.

2 participants