feat: add llm wenxinyiyan & config util & spo_triple_extract #27

simon824 · 2024-01-18T10:39:05Z

add llm wenxinyiyan
add spo_triple_extract & CommitSPOToKg
add config util
code style format

spo triple extraction

input

"Meet Sarah, a 30-year-old attorney, and her roommate, James, whom she's shared a home with"
        " since 2010. James, in his professional life, works as a journalist. Additionally, Sarah"
        " is the proud owner of the website www.sarahsplace.com, while James manages his own"
        " webpage, though the specific URL is not mentioned here. These two individuals, Sarah and"
        " James, have not only forged a strong personal bond as roommates but have also carved out"
        " their distinctive digital presence through their respective webpages, showcasing their"
        " varied interests and experiences."

output
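The output for this example is not reproduced in this thread. As a purely illustrative sketch (not the PR's actual output or parser), the snippet below shows one way SPO (subject, predicate, object) triples might be parsed from an LLM response, assuming the model was asked to emit one `(subject, predicate, object)` tuple per line. The `parse_spo_triples` helper and the sample response are hypothetical.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Matches "(subject, predicate, object)"; the object may itself contain commas.
_TRIPLE_RE = re.compile(r"\(\s*([^,()]+?)\s*,\s*([^,()]+?)\s*,\s*([^()]+?)\s*\)")


def parse_spo_triples(text: str) -> List[Triple]:
    """Extract unique (subject, predicate, object) triples, preserving order."""
    seen = set()
    triples: List[Triple] = []
    for match in _TRIPLE_RE.finditer(text):
        triple = (match.group(1), match.group(2), match.group(3))
        if triple not in seen:
            seen.add(triple)
            triples.append(triple)
    return triples


# Hypothetical model response, written by hand for illustration only.
sample_response = """
(Sarah, occupation, attorney)
(Sarah, roommate, James)
(James, occupation, journalist)
(Sarah, owns, www.sarahsplace.com)
(Sarah, roommate, James)
"""

triples = parse_spo_triples(sample_response)
```

Deduplicating while preserving order keeps repeated mentions (such as the roommate relation, which the input text states twice) from producing duplicate edges.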

simon824 · 2024-01-22T11:33:37Z

@imbajin @lzyxx77 @liuxiaocs7 PTAL, thanks!

liuxiaocs7

Overall looks good to me; I left some minor comments.

Could `wenxinyiyan` be replaced with ERNIE or ERNIE-Bot?
See https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Nlks5zkzu; that name is also more international (CC: @imbajin on this point).
This PR contains many automatic formatting changes. Could we document the corresponding commands in the README or elsewhere, so that the code style stays consistent?

hugegraph-llm/examples/build_kg_test.py

hugegraph-llm/examples/graph_rag_test.py

hugegraph-llm/src/hugegraph_llm/llms/init_llm.py

hugegraph-llm/src/hugegraph_llm/operators/kg_construction_task.py

simon824 · 2024-01-23T02:57:40Z

style/code_format_and_analysis.sh

+  we need to manually fix all the warnings mentioned below before commit! "
+  export PYTHONPATH=${ROOT_DIR}/hugegraph-llm/src:${ROOT_DIR}/hugegraph-python-client/src
+  pylint --rcfile=${ROOT_DIR}/style/pylint.conf ${ROOT_DIR}/hugegraph-llm
+  #pylint --rcfile=${ROOT_DIR}/style/pylint.conf ${ROOT_DIR}/hugegraph-python-client


I will fix the hugegraph-python-client code style in the next PR.

README.md

hugegraph-llm/src/config/config.ini

hugegraph-llm/src/hugegraph_llm/llms/ernie_bot.py

hugegraph-llm/src/hugegraph_llm/operators/llm_op/info_extract.py

hugegraph-llm/src/hugegraph_llm/utils/config.py

hugegraph-llm/examples/graph_rag_test.py

hugegraph-llm/src/hugegraph_llm/operators/llm_op/info_extract.py

liuxiaocs7

+1

hugegraph-llm/examples/build_kg_test.py

hugegraph-llm/src/hugegraph_llm/operators/common_op/print_result.py

hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/commit_to_hugegraph.py

imbajin · 2024-01-24T05:16:00Z

hugegraph-llm/src/hugegraph_llm/operators/llm_op/info_extract.py

+    return """You are a data scientist working for a company that is building a graph database.
+    Your task is to extract information from data and convert it into a graph database. Provide a 
+    set of Nodes in the form [ENTITY_ID, TYPE, PROPERTIES] and a set of relationships in the form 
+    [ENTITY_ID_1, RELATIONSHIP, ENTITY_ID_2, PROPERTIES] and a set of NodesSchemas in the form [
+    ENTITY_TYPE, PRIMARY_KEY, PROPERTIES] and a set of RelationshipsSchemas in the form [
+    ENTITY_TYPE_1, RELATIONSHIP, ENTITY_TYPE_2, PROPERTIES] It is important that the ENTITY_ID_1 
+    and ENTITY_ID_2 exists as nodes with a matching ENTITY_ID. If you can't pair a relationship 
+    with a pair of nodes don't add it. When you find a node or relationship you want to add try 
+    to create a generic TYPE for it that  describes the entity you can also think of it as a label.
+
+    Here is an example The input you will be given: Data: Alice lawyer and is 25 years old and Bob 
+    is her roommate since 2001. Bob works as a journalist. Alice owns a the webpage www.alice.com 
+    and Bob owns the webpage www.bob.com. The output you need to provide: Nodes: ["Alice", "Person", 
+    {"age": 25, "occupation": "lawyer", "name": "Alice"}], ["Bob", "Person", {"occupation": 
+    "journalist", "name": "Bob"}], ["alice.com", "Webpage", {"name": "alice.com", 
+    "url": "www.alice.com"}], ["bob.com", "Webpage", {"name": "bob.com", "url": "www.bob.com"}] 
+    Relationships: [{"Person": "Alice"}, "roommate", {"Person": "Bob"}, {"start": 2021}], 
+    [{"Person": "Alice"}, "owns", {"Webpage": "alice.com"}, {}], [{"Person": "Bob"}, "owns",
+     {"Webpage": "bob.com"}, {}] NodesSchemas: ["Person", "name",  {"age": "int", 
+     "name": "text", "occupation": 
+    "text"}],  ["Webpage", "name", {"name": "text", "url": "text"}] RelationshipsSchemas :["Person", 
+    "roommate", "Person", {"start": "int"}], ["Person", "owns", "Webpage", {}]"""
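The prompt asks the model to emit only relationships whose endpoints exist as nodes. A model may not always comply, so a post-processing check can enforce the same rule. The following is a minimal sketch, not code from this PR: `endpoint_id` and `drop_dangling_relationships` are hypothetical helpers, and the data is adapted from the prompt's own example with one node deliberately left out.

```python
from typing import Any, List, Set


def endpoint_id(endpoint: Any) -> str:
    """Return the entity id from either a bare id or a {"Type": "id"} mapping."""
    if isinstance(endpoint, dict):
        # The prompt's example writes endpoints as {"Person": "Alice"}.
        return next(iter(endpoint.values()))
    return endpoint


def drop_dangling_relationships(nodes: List[list], relationships: List[list]) -> List[list]:
    """Keep only relationships whose two endpoints both match a node's ENTITY_ID."""
    node_ids: Set[str] = {node[0] for node in nodes}
    return [
        rel for rel in relationships
        if endpoint_id(rel[0]) in node_ids and endpoint_id(rel[2]) in node_ids
    ]


nodes = [
    ["Alice", "Person", {"age": 25, "occupation": "lawyer", "name": "Alice"}],
    ["Bob", "Person", {"occupation": "journalist", "name": "Bob"}],
    ["alice.com", "Webpage", {"name": "alice.com", "url": "www.alice.com"}],
]
relationships = [
    [{"Person": "Alice"}, "roommate", {"Person": "Bob"}, {}],
    [{"Person": "Alice"}, "owns", {"Webpage": "alice.com"}, {}],
    # Hypothetical dangling edge: "bob.com" is not in `nodes` above.
    [{"Person": "Bob"}, "owns", {"Webpage": "bob.com"}, {}],
]
valid = drop_dangling_relationships(nodes, relationships)
```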


Is there a better way to maintain the default prompt, so that subsequent modifications and adjustments are easier?
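One possible direction, sketched here only as an illustration: keep the prompt text in a standalone file that can be edited without touching Python code, and fall back to a built-in default when no file is given. The names `PLACEHOLDER_DEFAULT_PROMPT`, `load_prompt`, and `render_prompt`, and the `{data}` placeholder, are assumptions for this sketch and not part of the PR.

```python
from pathlib import Path
from typing import Optional

# Placeholder default for illustration; the PR's real prompt is much longer.
PLACEHOLDER_DEFAULT_PROMPT = (
    "Extract nodes and relationships from the following data.\n"
    "Data: {data}"
)


def load_prompt(path: Optional[Path] = None) -> str:
    """Read the prompt template from `path`, or fall back to the built-in default."""
    if path is not None and path.is_file():
        return path.read_text(encoding="utf-8")
    return PLACEHOLDER_DEFAULT_PROMPT


def render_prompt(data: str, path: Optional[Path] = None) -> str:
    """Fill the {data} placeholder in the template.

    str.replace is used instead of str.format because prompt templates
    often contain literal braces (for example, JSON snippets).
    """
    return load_prompt(path).replace("{data}", data)
```

With this layout, adjusting the prompt becomes a change to a text file, and the template path could be read from the same config file that holds the LLM settings.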

README.md

imbajin

Almost LGTM

Zony7

LGTM

add config util & add wenxinyiyan

1cfc205

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. enhancement New feature or request labels Jan 18, 2024

simon824 marked this pull request as draft January 18, 2024 10:40

simon824 added 2 commits January 19, 2024 09:28

fix code style

acbce73

fix code style

1491184

simon824 marked this pull request as ready for review January 19, 2024 01:58

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Jan 19, 2024

add spo triple info extraction

9d28685

dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Jan 22, 2024

add spo triple info extraction

d948d24

simon824 changed the title ~~feat: add llm wenxinyiyan & config util~~ feat: add llm wenxinyiyan & config util & spo_triple_extract Jan 22, 2024

simon824 added 2 commits January 22, 2024 19:37

add spo triple info extraction

3a6d4e6

add spo triple info extraction

ef55e50

liuxiaocs7 reviewed Jan 22, 2024

simon824 added 4 commits January 23, 2024 10:47

add code_format_and_analysis.sh

5d68938

add code_format_and_analysis.sh

021dd5b

add code_format_and_analysis.sh

3a63fd6

add code_format_and_analysis.sh

bd1facd

simon824 commented Jan 23, 2024

liuxiaocs7 reviewed Jan 23, 2024

fix code style

05d2269

liuxiaocs7 approved these changes Jan 23, 2024

javeme reviewed Jan 23, 2024

simon824 added 2 commits January 23, 2024 18:28

fix code style

be70e35

fix code style

01d4485

imbajin reviewed Jan 24, 2024

Update README.md

7855925

imbajin previously approved these changes Jan 24, 2024

Update README.md

0a29a11

imbajin dismissed their stale review via 0a29a11 January 24, 2024 05:22

imbajin approved these changes Jan 24, 2024

Zony7 approved these changes Jan 24, 2024

imbajin merged commit 143e29f into apache:main Jan 24, 2024
10 checks passed
