prompt: default prompt for KG builder #751

Merged: 1 commit, Jan 22, 2024

@wey-gu wey-gu commented Jan 19, 2024

Previously, we treated "name" as a special property of an extracted node, because it acts as the node's vertex ID. In NebulaGraph, however, an edge is identified by a composite of four fields (src, dst, type, rank), so "name" was never extracted for edges.

As a result, a "name" field in the schema of an edge type was not handled properly by some LLMs.

Also, the JSON format in the prompt was not really an example: it was half example (nodes, edges) and half JSON type schema (xxx:string, yyy:object).

That mix is not clear enough and costs the LLM extra effort to interpret.

The format is now a real example.
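
To illustrate the difference between node and edge identity described above, here is a minimal Python sketch. It is not code from this PR: `EdgeKey`, `node_id`, and `edge_key_and_props` are hypothetical names, and the dictionary keys simply follow the JSON format shown in the prompt. The default rank of 0 is an assumption made for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class EdgeKey:
    """Hypothetical composite edge identity: (src, dst, type, rank)."""
    src: str
    dst: str
    edge_type: str
    rank: int = 0  # assumed default for this sketch


def node_id(node: dict[str, Any]) -> str:
    # For a node, "name" doubles as the vertex ID.
    return node["name"]


def edge_key_and_props(edge: dict[str, Any]) -> tuple[EdgeKey, dict[str, Any]]:
    # For an edge, identity comes from the endpoints, type, and rank,
    # so "name" is just an ordinary property carried inside "props".
    key = EdgeKey(edge["src"], edge["dst"], edge["edgeType"], edge.get("rank", 0))
    return key, dict(edge.get("props", {}))
```

With this split, an edge such as `{"src": "a", "dst": "b", "edgeType": "relationship", "props": {"name": "friend of"}}` keeps its "name" as a property instead of losing it to ID handling.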

What type of PR is this?

  • bug
  • feature
  • enhancement

What problem(s) does this PR solve?

Issue(s) number: #746


To quickly verify that this works on other (local) LLMs, paste this prompt:

As a knowledge graph AI importer, your task is to extract useful data from the following text:

---
What I Worked On
February 2021
Before college the two main things I worked on, outside of school, were writing and programming. I didn't write essays. I wrote what beginning writers were supposed to write then, and probably still are: short stories. My stories were awful. They had hardly any plot, just characters with strong feelings, which I imagined made them deep.
The first programs I tried writing were on the IBM 1401 that our school district used for what was then called "data processing." This was in 9th grade, so I was 13 or 14. The school district's 1401 happened to be in the basement of our junior high school, and my friend Rich Draves and I got permission to use it. It was like a mini Bond villain's lair down there, with all these alien-looking machines — CPU, disk drives, printer, card reader — sitting up on a raised floor under bright fluorescent lights.
---

The knowledge graph has the following schema, and node name is mandatory:

---
NodeType "entity" ("name":string )
EdgeType "relationship" ("name":string )
---

Only return the JSON, without explain or comment. The results should be in the following JSON format, where `"props":{...}` stands for JSON object format properties of the node or edge:

\```json
{
  "nodes":[{ "name":string,"type":string,"props":{...} }],
  "edges":[{ "src":string,"dst":string,"edgeType":string,"props":{...} }]
}
\```

Ensure the JSON is correctly formatted. Now, extract!
JSON:
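
When trying the prompt above against a local LLM, it can help to check the reply mechanically. The following is a minimal Python sketch, not part of this PR: `parse_graph_reply` is a hypothetical helper, and the required keys are taken from the JSON format in the prompt. It assumes the model may wrap its answer in a Markdown code fence even though it was asked to return only JSON.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable
REQUIRED_NODE_KEYS = {"name", "type", "props"}
REQUIRED_EDGE_KEYS = {"src", "dst", "edgeType", "props"}


def parse_graph_reply(reply: str) -> dict:
    """Parse a model reply into {"nodes": [...], "edges": [...]} or raise ValueError."""
    # Strip an optional fenced block around the JSON payload.
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, reply, re.DOTALL)
    payload = match.group(1) if match else reply
    try:
        data = json.loads(payload)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")

    nodes = data.get("nodes", [])
    edges = data.get("edges", [])
    for node in nodes:
        if not isinstance(node, dict) or REQUIRED_NODE_KEYS - node.keys():
            raise ValueError(f"malformed node: {node!r}")
        if not node["name"]:
            # The prompt states that the node name is mandatory.
            raise ValueError(f"node without a name: {node!r}")
    for edge in edges:
        if not isinstance(edge, dict) or REQUIRED_EDGE_KEYS - edge.keys():
            raise ValueError(f"malformed edge: {edge!r}")
    return {"nodes": nodes, "edges": edges}
```

A reply that fails this check is a quick signal that the model did not follow the requested format.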

@mizy mizy left a comment

LGTM

@huaxiabuluo huaxiabuluo left a comment

LGTM

@huaxiabuluo huaxiabuluo merged commit 9c9bda4 into master Jan 22, 2024
1 check passed
@wey-gu wey-gu deleted the optmize_importer_prompt branch January 22, 2024 02:44