
Commit 3a3698b: Introduced a simple demo of mixed index and text2gremlin demonstration. (#54)

* 1. Reworked the configuration approach; the overall configuration is now simple and easy to use.
2. Added two index modes to Graph RAG: a vector index, and a hybrid index with reranking.
3. Added a natural-language-to-Gremlin task and demo.

* refact: add connection check after applying configs (WIP)

TODO: some checks are still missing

* fix: use log to print some info

* feat(llm): support the qianfan platform for embedding & replace wenxin references

* 1. Fixed bugs in importing and searching.

* fix upload document

* fix the non-disambiguated uploading

* chore: exclude binary files & update params order

* fix: openai params not match

* log: record error in py-client with orange color

for debugging RESTful API problems

* 1. added triple and edge ID settings
2. added a template input textbox
3. changed the ernie LLM to use the qianfan SDK
4. added a function for filtering IDs by length
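The ID-length filtering in item 4 could look like the minimal sketch below; the `MAX_ID_LEN` limit and the `filter_by_id_length` helper are illustrative assumptions, not the project's actual code:

```python
# Hypothetical sketch: drop vertices whose IDs exceed a backend length limit.
MAX_ID_LEN = 128  # assumed limit, not taken from the project

def filter_by_id_length(vertices: list[dict], max_len: int = MAX_ID_LEN) -> list[dict]:
    """Keep only vertices whose stringified ID fits within max_len characters."""
    return [v for v in vertices if len(str(v["id"])) <= max_len]

print(filter_by_id_length([{"id": "a" * 10}, {"id": "b" * 200}]))
```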

* refact: update the prompt template

Also renamed the params so that the inner & outer ones match

TODO: Non-Schema mode will throw an Error now (we need to fix it)

* update: add a config file argument and build semantic vertex-ID querying

- Update BuildSemanticIndex class to match new file naming convention.
- Modify code_format_and_analysis.sh to use a line length of 120.
- Change logging format to use %s instead of {} for string formatting.
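The `%s`-style logging in the last item defers string interpolation until a record is actually emitted, which is the idiom pylint's `logging-format-interpolation` check recommends; a small sketch:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

# Lazy %s formatting: the message is only interpolated if this record is emitted,
# unlike f-strings or str.format(), which always build the string up front.
log.info("loaded %s vertices from %s", 10, "hugegraph")
```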

* feat: optimize index loading and add clean-up button

1. Introduce a clean-up function to remove index and content files, helping keep the file system clean after operations.
2. Introduce a default demo for building a KG.
3. By default, the KG is no longer cleaned before building it.
4. Clean up the stopwords files.
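A minimal sketch of such a clean-up helper, assuming the generated index/content files match the `*.faiss` / `*.pkl` patterns added to `.gitignore` in this commit (the function name and patterns are illustrative, not the project's actual code):

```python
import glob
import os

def clean_up(data_dir: str) -> list[str]:
    """Remove generated index/content files from data_dir; return what was deleted."""
    removed = []
    for pattern in ("*.faiss", "*.pkl"):
        for path in glob.glob(os.path.join(data_dir, pattern)):
            os.remove(path)
            removed.append(path)
    return removed
```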

* Add instructions for generating the config file to README.md

* fix: use lower case to compare str & enhance the extract prompt

* fix: add triples from extracted vertices, and fix the format of triples.

* doc: update readme

* feat: use custom_handler log color

* fix: avoid FSRF security warning

* doc: update readme

* feat: change config store method

* chore: use a flexible version for dependencies

* feat(hugegraph_llm): support property graph extraction and primary-key IDs. (#2)

* fix: token type should be int (WIP)

init graphspace support

* fix: read int parameter from .env

* feat: use .env as default config file

* fix: Add a 'copy' button to the output box

* Support local openai env (#1)

* support local openai env

* fix env

* refact: set gpt-4o-mini as the openai default type

* feat: Add a RAG option for four output types: "llm-raw", "graph-only", "vector-only", and "graph-vector"
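One way to read the four output types is as a dispatch over which retrievers feed the prompt; a toy sketch with stubbed-out retrievers (none of these function names come from the project):

```python
def graph_search(q: str) -> str:
    return f"[graph facts for {q!r}]"    # stub for a HugeGraph lookup

def vector_search(q: str) -> str:
    return f"[vector chunks for {q!r}]"  # stub for a vector-index lookup

def build_prompt(question: str, mode: str) -> str:
    """Assemble the LLM prompt depending on the requested output type."""
    context = []
    if mode in ("graph-only", "graph-vector"):
        context.append(graph_search(question))
    if mode in ("vector-only", "graph-vector"):
        context.append(vector_search(question))
    if mode == "llm-raw" or not context:
        return question                  # raw question, no retrieved context
    return "\n".join(context) + "\n" + question

print(build_prompt("who created HugeGraph?", "graph-vector"))
```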

* feat: Add vector result output in the CLI.

* feat: Add error catch and display it on the front-end web interface, fix code style (#3)

* fix: fix the triple extraction

* Adjust the formatting, and the property graph mode

* Fix code style, and add error catching with display on the front-end web interface

* revert: graphspace init for CI

---------

Co-authored-by: imbajin <jin@apache.org>
Co-authored-by: chenzihong <522023320011@smail.nju.edu.cn>
Co-authored-by: Liu Jiajun <85552719+jasinliu@users.noreply.github.com>
4 people authored Jul 23, 2024
1 parent a805bb2 commit 3a3698b
Showing 81 changed files with 4,148 additions and 1,030 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/pylint.yml
@@ -22,8 +22,7 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install pylint pytest
-pip install -r ./hugegraph-llm/requirements.txt
-pip install -r ./hugegraph-llm/llm_api/requirements.txt
+pip install -r ./hugegraph-llm/requirements.txt
pip install -r ./hugegraph-python-client/requirements.txt
- name: Analysing the code with pylint
run: |
6 changes: 6 additions & 0 deletions .gitignore
@@ -1,3 +1,9 @@
# User-specific files
**/logs/*.log*
*.faiss
*.pkl
out/production/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
1 change: 1 addition & 0 deletions .licenserc.yaml
@@ -80,6 +80,7 @@ header: # `header` section is configurations for source codes license header.
- '**/*.ipr'
- '**/META-INF/MANIFEST.MF'
- '.repository/**'
- '**/resources/**'

comment: on-failure
# on what condition license-eye will comment on the pull request, `on-failure`, `always`, `never`.
2 changes: 1 addition & 1 deletion hugegraph-llm/MANIFEST.in
@@ -15,4 +15,4 @@
# specific language governing permissions and limitations
# under the License.

-recursive-include src/hugegraph_llm/config *
+recursive-include src/hugegraph_llm/resources *
56 changes: 32 additions & 24 deletions hugegraph-llm/README.md
@@ -17,46 +17,54 @@ graph systems and large language models.

## Environment Requirements

-- python 3.8+
+- python 3.9+
- hugegraph 1.0.0+

## Preparation

-- Start the HugeGraph database, you can do it via Docker. Refer to [docker-link](https://hub.docker.com/r/hugegraph/hugegraph) & [deploy-doc](https://hugegraph.apache.org/docs/quickstart/hugegraph-server/#31-use-docker-container-convenient-for-testdev) for guidance
-- Start the gradio interactive demo, you can start with the following command, and open http://127.0.0.1:8001 after starting
+- Start the HugeGraph database, you can do it via Docker/[Binary packages](https://hugegraph.apache.org/docs/download/download/).
+  Refer to [docker-link](https://hub.docker.com/r/hugegraph/hugegraph) & [deploy-doc](https://hugegraph.apache.org/docs/quickstart/hugegraph-server/#31-use-docker-container-convenient-for-testdev) for guidance
- Clone this project
```bash
# 0. clone the hugegraph-ai project & enter the root dir
# 1. configure the environment path
PROJECT_ROOT_DIR = "/path/to/hugegraph-ai" # root directory of hugegraph-ai
export PYTHONPATH=${PROJECT_ROOT_DIR}/hugegraph-llm/src:${PROJECT_ROOT_DIR}/hugegraph-python-client/src

# 2. install the required packages/deps (better to use virtualenv(venv) to manage the environment)
cd hugegraph-llm
pip install -r requirements.txt # ensure the python/pip version is satisfied
# 2.1 set basic configs in the hugegraph-llm/config/config.ini (Optional, you can also set it in gradio)

# 3. start the gradio server, wait for some time to initialize
python3 ./src/hugegraph_llm/utils/gradio_demo.py
```
- Configure HugeGraph database connection information & LLM information in the gradio interface,
click on `Initialize configs`, the complete and initialized configuration file will be overwritten.
- offline download NLTK stopwords
git clone https://github.com/apache/incubator-hugegraph-ai.git
```
- Install [hugegraph-python-client](../hugegraph-python-client) and [hugegraph_llm](src/hugegraph_llm)
```bash
cd ./incubator-hugegraph-ai # better to use virtualenv (source venv/bin/activate)
pip install ./hugegraph-python-client
pip install -r ./hugegraph-llm/requirements.txt
```
- Enter the project directory
```bash
cd ./hugegraph-llm/src
```
- Generate the config file
```bash
-python3 ./src/hugegraph_llm/operators/common_op/nltk_helper.py
+python3 -m hugegraph_llm.config.generate
```
- Start the gradio interactive demo of **Graph RAG**, you can start with the following command, and open http://127.0.0.1:8001 after starting
```bash
python3 -m hugegraph_llm.demo.rag_web_demo
```

- Or start the gradio interactive demo of **Text2Gremlin**, you can start with the following command, and open http://127.0.0.1:8002 after starting
```bash
python3 -m hugegraph_llm.demo.gremlin_generate_web_demo
```

## Examples

### 1. Build a knowledge graph in HugeGraph through LLM

-Run example like `python3 ./hugegraph-llm/examples/build_kg_test.py`
+Run example like `python3 ./hugegraph_llm/examples/build_kg_test.py`

The `KgBuilder` class is used to construct a knowledge graph. Here is a brief usage guide:

-1. **Initialization**: The `KgBuilder` class is initialized with an instance of a language model. This can be obtained from the `LLMs` class.
+1. **Initialization**: The `KgBuilder` class is initialized with an instance of a language model.
+   This can be obtained from the `LLMs` class.

```python
-from hugegraph_llm.llms.init_llm import LLMs
+from hugegraph_llm.models.llms.init_llm import LLMs
from hugegraph_llm.operators.kg_construction_task import KgBuilder
TEXT = ""
@@ -111,7 +119,7 @@ The methods of the `KgBuilder` class can be chained together to perform a sequence
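The chaining mentioned in that context line is the usual fluent-builder pattern, where each operator returns `self`; a generic sketch (the class and method names are illustrative, not the actual `KgBuilder` API):

```python
class Builder:
    """Toy fluent builder: each step records itself and returns self."""
    def __init__(self) -> None:
        self.steps: list[str] = []

    def extract_triples(self, text: str) -> "Builder":
        self.steps.append("extract")  # a real step would call the LLM here
        return self

    def commit_to_hugegraph(self) -> "Builder":
        self.steps.append("commit")   # a real step would write to the graph
        return self

    def run(self) -> list[str]:
        return self.steps

print(Builder().extract_triples("...").commit_to_hugegraph().run())  # → ['extract', 'commit']
```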

### 2. Retrieval augmented generation (RAG) based on HugeGraph

-Run example like `python3 ./hugegraph-llm/examples/graph_rag_test.py`
+Run example like `python3 ./hugegraph_llm/examples/graph_rag_test.py`

The `GraphRAG` class is used to integrate HugeGraph with large language models to provide retrieval-augmented generation capabilities.
Here is a brief usage guide:
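Retrieval-augmented generation here boils down to: retrieve graph context for the question, splice it into the prompt, and ask the LLM. A minimal sketch with a stubbed retriever and a pluggable LLM callable (not the actual `GraphRAG` API):

```python
def retrieve_graph_context(question: str) -> list[str]:
    # Stub: a real implementation would run Gremlin/index queries against HugeGraph.
    return ["HugeGraph is a graph database.", "It supports Gremlin queries."]

def answer(question: str, llm=lambda prompt: f"LLM({len(prompt)} chars)") -> str:
    """Compose a context-grounded prompt and hand it to the LLM callable."""
    context = "\n".join(retrieve_graph_context(question))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt)

print(answer("What is HugeGraph?"))
```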
19 changes: 0 additions & 19 deletions hugegraph-llm/llm_api/README.md

This file was deleted.

163 changes: 0 additions & 163 deletions hugegraph-llm/llm_api/main.py

This file was deleted.

6 changes: 0 additions & 6 deletions hugegraph-llm/llm_api/requirements.txt

This file was deleted.

18 changes: 13 additions & 5 deletions hugegraph-llm/requirements.txt
@@ -1,5 +1,13 @@
-openai==0.28.1
-retry==0.9.2
-tiktoken==0.7.0
-nltk==3.8.1
-gradio==4.37.2
+openai~=0.28.1
+ollama~=0.2.1
+qianfan~=0.3.18
+retry~=0.9.2
+tiktoken>=0.7.0
+nltk~=3.8.1
+gradio>=4.37.2
+jieba>=0.42.1
+numpy~=1.24.4
+python-docx~=1.1.2
+langchain-text-splitters~=0.2.2
+faiss-cpu~=1.8.0
+python-dotenv>=1.0.1
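The move from `==` pins to `~=`/`>=` specifiers relaxes the constraints: `~=0.28.1` is PEP 440's compatible-release operator, allowing patch upgrades up to but excluding `0.29.0`, while `>=` only sets a floor. A simplified sketch of the `~=` rule (ignoring pre-releases and the full PEP 440 grammar):

```python
def compatible_release(spec: str, version: str) -> bool:
    """Simplified PEP 440 '~=' check: ~=X.Y.Z means >= X.Y.Z and < X.(Y+1).0."""
    base = [int(p) for p in spec.split(".")]
    ver = [int(p) for p in version.split(".")]
    ceiling = base[:-2] + [base[-2] + 1, 0]  # bump the second-to-last component
    return base <= ver < ceiling             # lists compare lexicographically

print(compatible_release("0.28.1", "0.28.5"))  # → True
print(compatible_release("0.28.1", "0.29.0"))  # → False
```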
16 changes: 16 additions & 0 deletions hugegraph-llm/src/hugegraph_llm/config/__init__.py
@@ -14,3 +14,19 @@
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.


__all__ = [
"settings",
"resource_path"
]

import os
from .config import Config


settings = Config()
settings.from_env()

package_path = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
resource_path = os.path.join(package_path, "resources")
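The `Config` class itself is not shown in this hunk; below is a minimal sketch of the env-backed pattern the `__init__.py` implies, with hypothetical field names (the commit notes elsewhere that int parameters must be read from `.env` explicitly, since environment values are strings):

```python
import os

class Config:
    """Hypothetical sketch: defaults in code, overridable from environment variables."""
    def __init__(self) -> None:
        self.graph_ip = "127.0.0.1"
        self.graph_port = 8080

    def from_env(self) -> None:
        self.graph_ip = os.environ.get("GRAPH_IP", self.graph_ip)
        # ints must be parsed explicitly -- os.environ only holds strings
        self.graph_port = int(os.environ.get("GRAPH_PORT", self.graph_port))

settings = Config()
settings.from_env()
```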
