add multiple gsite feature #239

YujingYang666777 · 2023-11-28T16:22:08Z

Modified the SearchTool class in the tools.py file
Added 4 unit tests
Passed the integration tests

20001LastOrder

Thanks for the PR. I think this is a great feature to have. I've left some comments.

20001LastOrder · 2023-11-29T17:49:45Z

src/sherpa_ai/tools.py

+        if self.config.gsite:
+            gsite_list = self.config.gsite.split(", ")
+            gsite_list = [i for i in gsite_list if i != " " and i != "\n" and i != None]
+            if False in [validate_url(i) for i in gsite_list]:


This check is redundant here because it runs on every search call. By this point, the URLs in the configuration should already be valid, so I think the check belongs in the AgentConfig. We can add a new list attribute to the AgentConfig called search_domains and parse the gsite string into a list of URLs there. Then we don't need to repeat the check here.
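
A minimal sketch of the parsing step described above, assuming a comma-separated gsite string. The helper names `is_valid_url` and `parse_search_domains` are hypothetical and are not taken from the PR; the real AgentConfig may structure this differently.

```python
from typing import List
from urllib.parse import urlparse


def is_valid_url(url: str) -> bool:
    """Return True if the string has an http(s) scheme and a host part."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def parse_search_domains(gsite: str) -> List[str]:
    """Split a comma-separated gsite string and validate each entry once.

    Hypothetical helper: intended to run when the config is built, so the
    search method can trust the resulting list on every call.
    """
    # Strip whitespace around each entry and drop empty pieces.
    domains = [part.strip() for part in gsite.split(",") if part.strip()]
    invalid = [d for d in domains if not is_valid_url(d)]
    if invalid:
        raise ValueError(f"Invalid search domain(s): {invalid}")
    return domains
```

With this shape, an invalid entry fails when the configuration is created rather than on each search call.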

20001LastOrder · 2023-11-29T17:50:16Z

src/sherpa_ai/tools.py

+                args={"error": f"The input URL is not valid"},
+            )
+            gsite_list = [query + " site:" + i for i in gsite_list]
+            if len(gsite_list) >= 5:


Similarly, this truncation can be done in the AgentConfig as well.
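
As a rough illustration of moving the truncation into the configuration: the function name `limit_search_domains` and the choice to keep the first entries are assumptions. The limit of 5 comes from the diff above, but what the PR does when the limit is reached is not shown there.

```python
import logging
from typing import List

logger = logging.getLogger(__name__)


def limit_search_domains(domains: List[str], max_domains: int = 5) -> List[str]:
    """Keep at most max_domains entries (assumption: the first ones win)."""
    if len(domains) > max_domains:
        logger.warning(
            "Only the first %d of %d search domains will be used",
            max_domains,
            len(domains),
        )
        return domains[:max_domains]
    # Return a copy so callers cannot mutate the original list by accident.
    return list(domains)
```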

20001LastOrder · 2023-11-29T17:50:53Z

src/sherpa_ai/tools.py

+
+        else:
+            gsite_list = [query]
+        top_k = int(10 / len(gsite_list))


We should make 10 a parameter of the search method.
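
One way this could look, as a sketch only: the function name `results_per_query` and the parameter name `max_results` are hypothetical, and the lower bound of 1 is an added guard that is not in the diff.

```python
def results_per_query(num_queries: int, max_results: int = 10) -> int:
    """Split a total result budget evenly across the site-restricted queries.

    max_results replaces the hard-coded 10; integer division mirrors
    int(10 / len(gsite_list)) from the diff.
    """
    if num_queries < 1:
        raise ValueError("num_queries must be at least 1")
    # Guard (assumption): never ask for fewer than one result per query.
    return max(1, max_results // num_queries)
```

The search method could then accept `max_results` as an argument and pass it through, leaving 10 as the default.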

20001LastOrder · 2023-11-29T17:51:14Z

src/sherpa_ai/tools.py

+            gsite_list = self.config.gsite.split(", ")
+            gsite_list = [i for i in gsite_list if i != " " and i != "\n" and i != None]
+            if False in [validate_url(i) for i in gsite_list]:
+                return TaskAction(


Why do we return a TaskAction for an error?

20001LastOrder · 2023-11-29T17:53:46Z

src/tests/unit_tests/tools/test_search_tool.py

+    assert error == expected_error
+
+def test_search_query_invalid_format():
+    site = "https://www.google.com,https://www.langchain.com,https://openai.com"


20001LastOrder · 2023-11-29T17:54:36Z

src/tests/unit_tests/tools/test_search_tool.py

+                )
+    assert search_result is not None
+    assert search_result == expected_result
+


Another useful test would check that the results are combined correctly. For this, we should mock the Google Search API. We can discuss this in more detail separately.
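
A possible shape for such a test, using `unittest.mock.Mock` in place of the search backend. The function `search_across_sites` is a hypothetical stand-in written only for this sketch; it is not the SearchTool API, and the newline-joined output format is an assumption.

```python
from typing import Callable, List
from unittest.mock import Mock


def search_across_sites(
    query: str, domains: List[str], search_fn: Callable[[str], str]
) -> str:
    """Hypothetical stand-in: run one site-restricted query per domain
    and join the individual results with newlines."""
    results = [search_fn(f"{query} site:{domain}") for domain in domains]
    return "\n".join(results)


def test_results_are_combined():
    # The mock returns one canned result per call, in order.
    fake_search = Mock(side_effect=["result from google", "result from openai"])
    combined = search_across_sites(
        "llm agents",
        ["https://www.google.com", "https://openai.com"],
        fake_search,
    )
    assert combined == "result from google\nresult from openai"
    assert fake_search.call_count == 2
    fake_search.assert_any_call("llm agents site:https://openai.com")
```

Because the backend is injected as a callable, the test makes no network request and only checks how the per-site results are merged.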

src/sherpa_ai/config/task_config.py

20001LastOrder

Looks good to me now, merging...

YujingYang666777 and others added 4 commits November 23, 2023 08:15

changes tools to adept multiple gsites

2475d0f

add four unit testing for search tool

4c922e1

Merge branch 'main' into multiple_gsite

1856921

add validate url and unit testing

1d56bcc

20001LastOrder self-requested a review November 28, 2023 20:59

20001LastOrder requested changes Nov 29, 2023

change config class and adjust tests

f9f09fe

YujingYang666777 requested a review from 20001LastOrder December 6, 2023 21:32

Merge branch 'main' into multiple_gsite

f98dae1

20001LastOrder requested changes Dec 7, 2023

src/tests/unit_tests/tools/test_search_tool.py (outdated, resolved)

src/sherpa_ai/config/task_config.py (resolved)

src/sherpa_ai/config/task_config.py (outdated, resolved)

src/sherpa_ai/config/task_config.py (outdated, resolved)

address comment

ad2d09c

20001LastOrder self-requested a review December 7, 2023 17:19

20001LastOrder added 2 commits December 7, 2023 12:46

Cleanup the multi g-site config

583bbd0

Combine multiple gsites with citation validation

baea6b4

20001LastOrder approved these changes Dec 7, 2023

20001LastOrder merged commit 0a87757 into Aggregate-Intellect:main Dec 7, 2023

20001LastOrder deleted the multiple_gsite branch December 13, 2023 02:55

20001LastOrder mentioned this pull request Apr 4, 2024

[multiple gsite] adjust SearchTool and google search api to search the query within multiple gsite #10 #238

Closed
