Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add multiple gsite feature #239

Merged

Conversation

YujingYang666777
Copy link
Contributor

  • modified SearchTool class in the tools.py file
  • added 4 unit testing
  • passed the integration testing

@20001LastOrder 20001LastOrder self-requested a review November 28, 2023 20:59
Copy link
Collaborator

@20001LastOrder 20001LastOrder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR. I think this is a great feature to have. I've left some comments.

if self.config.gsite:
gsite_list = self.config.gsite.split(", ")
gsite_list = [i for i in gsite_list if i != " " and i != "\n" and i != None]
if False in [validate_url(i) for i in gsite_list]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit duplicated as we are doing this check for every call of the search. At this point, the URL in the configuration should already be valid. I think we should put this check inside the AgentConfig. We can create a new attribute in the AgentConfig called search_domains as a list. Then, we should parse the gsite string into list of URLs in this method. Then we don't need to repeat this check here.

args={"error": f"The input URL is not valid"},
)
gsite_list = [query + " site:" + i for i in gsite_list]
if len(gsite_list) >= 5:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, this truncation can be done in the AgentConfig as well.


else:
gsite_list = [query]
top_k = int(10 / len(gsite_list))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make 10 as a parameter to the search method

gsite_list = self.config.gsite.split(", ")
gsite_list = [i for i in gsite_list if i != " " and i != "\n" and i != None]
if False in [validate_url(i) for i in gsite_list]:
return TaskAction(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we return TaskAction for error?

src/tests/unit_tests/tools/test_search_tool.py Outdated Show resolved Hide resolved
src/tests/unit_tests/tools/test_search_tool.py Outdated Show resolved Hide resolved
assert error == expected_error

def test_search_query_invalid_format():
site = "https://www.google.com,https://www.langchain.com,https://openai.com"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same above

)
assert search_result is not None
assert search_result == expected_result

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another test is to test if the results can be combined correctly. For this, we should do a Mock of the Google Search API. We can discuss this in more detail separately

src/tests/unit_tests/tools/test_search_tool.py Outdated Show resolved Hide resolved
src/sherpa_ai/config/task_config.py Show resolved Hide resolved
src/sherpa_ai/config/task_config.py Outdated Show resolved Hide resolved
src/sherpa_ai/config/task_config.py Outdated Show resolved Hide resolved
@20001LastOrder 20001LastOrder self-requested a review December 7, 2023 17:19
Copy link
Collaborator

@20001LastOrder 20001LastOrder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, merging...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants