Search Refinement #393

Bubbletea98 · 2024-06-23T03:39:30Z

Description

Resolve #363

Added a refinement function that extracts key answers by running an LLM with the input query.

Type of change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Maintenance
New release


Checklists

To speed up the review process, please follow these checklists:

Development

The Pull Request is small and focused on one topic
Lint rules pass locally (make format && make lint)
The code changed/added as part of this pull request has been covered with tests
All tests related to the changed code pass in development (make test)
The changes generate no new warnings (or explain any new warnings and why they're ok)
Commit messages are detailed
Changed code is self-explanatory and/or I added comments
I updated the documentation (docstrings, /docs)
See the testing guidelines for help on tests, especially those involving web services.

Code review

This pull request has a descriptive title and information useful to a reviewer. There may be a screenshot or screencast attached.
I have performed a self-review of my code
Issue from task tracker has a link to this pull request

💔 Thank you for submitting a pull request!


20001LastOrder · 2024-06-25T14:25:48Z

@Eyobyb, I'm good with these changes. Please take a look at them as well.

Eyobyb · 2024-07-01T23:08:01Z

Have you encountered hallucinations with this? It adds unwanted details: rather than just rearranging the documents, it elaborates on them.

Take this example:


document = [
    "Iron Man fears Hulk more than anybody.",
    "Hulk was named the strongest Avenger on Sakaar.",
    "Natasha loves Bruce Banner.",
    "SHIELD built a contingency plan only for Hulk if he gets angry.",
]

query = "Why is Hulk the strongest Avenger?"

It returns all of them in the same order, but it elaborates on each document so that it appears to answer the question.

20001LastOrder · 2024-07-01T23:26:48Z

This is an interesting observation. In general, if the answer is relevant, I'm fine with the LLM elaborating on it. But we should probably fix the prompt so that it filters out documents that are not relevant (e.g. by returning an empty string). For now, it seems that the LLM describes why the document is not relevant instead of returning an empty string.

Also, the prompt can be configured by the user. But we should try to give a solid baseline.
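A minimal sketch of the behaviour discussed above, under stated assumptions: the prompt wording and the `drop_irrelevant` helper are hypothetical and are not taken from this PR. The prompt asks the model to return an empty string for an irrelevant document, and the helper then discards those entries.

```python
# Hypothetical baseline prompt; the wording is an assumption, not the PR's prompt.
BASELINE_PROMPT = (
    "Question: {query}\n"
    "Document: {document}\n"
    "Extract only the parts of the document that answer the question. "
    "If nothing in the document is relevant, return an empty string."
)


def drop_irrelevant(documents, refined):
    """Keep (document, refined_text) pairs whose refined text is non-empty.

    `refined[i]` is assumed to be the LLM output for `documents[i]`.
    Whitespace-only outputs are treated the same as an empty string.
    """
    if len(documents) != len(refined):
        raise ValueError("documents and refined outputs must align one-to-one")
    return [
        (doc, text.strip())
        for doc, text in zip(documents, refined)
        if text.strip()
    ]
```

Because the prompt can be configured by the user, a filter like this only works if the custom prompt keeps the "empty string means not relevant" convention.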

20001LastOrder · 2024-07-01T23:27:43Z

Also note that this feature is about extracting relevant information from a document, not about reranking documents.


Bubbletea98 · 2024-07-02T01:14:49Z

Hi @Eyobyb thank you so much for your comment. Improved prompting to reduce hallucinations and added a function to remove irrelevant answers.

amirfz · 2024-07-02T01:42:14Z

One possible implementation for the prompt is to have it classify each item as not relevant (return an empty string), partially relevant (extract only the relevant sentences as they are), or directly relevant (return the answer as is). It's likely that this semi chain of thought helps the model do a better job in the refinement. Plus, we could use the tag to filter out anything that's not relevant, for example

…
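The three-way tagging idea in this comment could be sketched as follows. This is an illustration only: the tag names, the `<tag>: <text>` response format, and the `parse_tagged` and `filter_relevant` helpers are all assumptions and do not come from the PR.

```python
import re
from enum import Enum


class Relevance(Enum):
    NOT_RELEVANT = "not_relevant"
    PARTIALLY_RELEVANT = "partially_relevant"
    DIRECTLY_RELEVANT = "directly_relevant"


# Assumed response format: "<tag>: <text>", one response per document.
_TAGGED = re.compile(
    r"^\s*(not_relevant|partially_relevant|directly_relevant)\s*:\s*(.*)$",
    re.IGNORECASE | re.DOTALL,
)


def parse_tagged(response):
    """Split one LLM response into (Relevance, text)."""
    match = _TAGGED.match(response)
    if match is None:
        # Unparseable output is treated as not relevant rather than trusted.
        return Relevance.NOT_RELEVANT, ""
    tag = Relevance(match.group(1).lower())
    # A not-relevant item always maps to an empty string, whatever the model wrote.
    text = "" if tag is Relevance.NOT_RELEVANT else match.group(2).strip()
    return tag, text


def filter_relevant(responses):
    """Return the text of every response tagged partially or directly relevant."""
    parsed = (parse_tagged(r) for r in responses)
    return [text for tag, text in parsed if tag is not Relevance.NOT_RELEVANT and text]
```

Filtering on the tag rather than on the text means that an explanation of why a document is irrelevant is dropped instead of being passed downstream.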


amirfz · 2024-07-03T13:14:41Z

20001LastOrder · 2024-07-05T20:27:48Z

@Bubbletea98 Let's go ahead with the new refinery so we can move this PR forward!


Bubbletea98 · 2024-07-11T05:05:53Z

Hi team, I updated the refinement function:

- Added a new refinement function, RefinementBySentence:
  - Tokenize the input into sentences and index each sentence.
  - Let the LLM pick the index numbers of the relevant sentences by checking the question (query) against the tokenized input answer.
- Added a new test case for RefinementBySentence that checks that the refined output comes only from the input sentences.
- Removed the {k} variable (max(# sentence)) from RefinementByQuery.

Please let me know if there are any additional conditions we need to include. :)
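A rough sketch of the index-based approach described in this comment, not the PR's actual RefinementBySentence code. The naive regex sentence splitter, the prompt wording, and the helper names `split_sentences`, `build_indexed_prompt`, and `select_sentences` are assumptions made for illustration; the LLM call itself is left out.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_indexed_prompt(query, sentences):
    """Number each sentence so the model can answer with indices only."""
    numbered = "\n".join(f"{i}: {s}" for i, s in enumerate(sentences))
    return (
        f"Question: {query}\n"
        f"Sentences:\n{numbered}\n"
        "Reply with the index numbers of the relevant sentences, "
        "separated by commas. Reply with nothing if no sentence is relevant."
    )


def select_sentences(sentences, llm_reply):
    """Map the indices in the model's reply back to the original sentences.

    Out-of-range and repeated indices are ignored, so the result can only
    contain sentences that were present in the input.
    """
    seen = set()
    picked = []
    for token in re.findall(r"\d+", llm_reply):
        i = int(token)
        if i < len(sentences) and i not in seen:
            seen.add(i)
            picked.append(sentences[i])
    return picked
```

Because the output is built by index lookup rather than from generated text, every returned string is a verbatim input sentence, which is the property the new test case checks.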

Bubbletea98 added 2 commits June 22, 2024 23:24

added refinement action with test code: Idenfity the key answers by c…

efab4fb

…hecking input query with llm model

format code

9174c55

20001LastOrder requested review from Eyobyb and 20001LastOrder June 25, 2024 14:25

updated refinement function to consider 1. not relevent document 2. r…

c0fca62

…educed hallucination

Added new refinement funtion(RefinementBySentence): The refined outpu…

51b964f

…t will only come from original input sentences. 1. tokenize and index sentences 2. let llm pick the relevant sentences' index number. Updated test case in refinement.py as well.

Eyobyb approved these changes Jul 17, 2024


amirfz merged commit acf9916 into Aggregate-Intellect:main Jul 17, 2024
1 check passed
