Tools: unify retrievers/functions and add file tools #164

Merged: 20 commits into main from luisa/file_tool on Jun 6, 2024

@lusmoura (Collaborator) commented Jun 3, 2024

This is the first of a series of PRs to introduce multihop to the toolkit.

  • Removes the duplicate logic for retrievers and function tools from custom chat, since all retrievers are tools for multihop.
  • Adds both read_file and search_document tools to support uploaded files and tools simultaneously.
  • The tools mentioned above required some changes to the tool calls, such as passing the model deployment (to rerank the chunks) and the user id (to query the DB for files).
  • The tools also required a change to the chat_history to add an utterance containing the available files before generating the tool calls.
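
As a rough illustration of the last bullet, the snippet below appends a system utterance listing the uploaded files before tool calls are generated. It is only a sketch: `ChatMessage`, `add_files_to_chat_history`, and the message wording are hypothetical names chosen for this example, not the toolkit's actual code.

```python
from dataclasses import dataclass


@dataclass
class ChatMessage:
    # Hypothetical stand-in for the toolkit's chat message schema.
    role: str
    message: str


def add_files_to_chat_history(chat_history, file_names):
    """Return a copy of chat_history ending with a SYSTEM note about the files."""
    if not file_names:
        # Nothing uploaded: return an unchanged copy.
        return list(chat_history)
    listing = "\n".join(f"- {name}" for name in file_names)
    note = ChatMessage(
        role="SYSTEM",
        message=f"The user uploaded the following files:\n{listing}",
    )
    # Build a new list so the caller's history is not mutated.
    return [*chat_history, note]


history = [ChatMessage(role="USER", message="Summarize my report.")]
updated = add_files_to_chat_history(history, ["report.pdf", "notes.txt"])
```

Returning a new list rather than mutating the input keeps the stored conversation free of the synthetic utterance, which may or may not match what the PR does.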

I would especially like review of the document chunking and the chat_history manipulation.
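
For reviewers who want a mental model of the chunking step, here is a minimal sketch of splitting a document into overlapping word chunks and picking the most relevant ones. The PR reranks chunks through the model deployment; the term-overlap score below is a naive stand-in for that call, and `chunk_words`, `rank_chunks`, and their default sizes are assumptions made for this example.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into overlapping chunks of at most chunk_size words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once a chunk has reached the end of the document.
        if start + chunk_size >= len(words):
            break
    return chunks


def rank_chunks(query, chunks, top_n=3):
    """Naive stand-in for reranking: order chunks by shared query terms."""
    terms = set(query.lower().split())
    scored = sorted(
        chunks,
        key=lambda chunk: len(terms & set(chunk.lower().split())),
        reverse=True,
    )
    return scored[:top_n]
```

The overlap keeps a sentence that straddles a chunk boundary visible in at least one chunk, at the cost of some duplicated text being sent to the ranker.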

AI Description

This PR introduces two new tools, ReadFileTool and SearchFileTool, which allow for reading and searching within uploaded files.

The changes include:

  • Addition of ReadFileTool and SearchFileTool classes in src/backend/tools/files.py.
  • Updates to src/backend/chat/custom/custom.py to handle the new tools and add file content to the chat history.
  • Updates to src/backend/config/tools.py to include the new tools and adjust existing tool definitions.
  • Addition of get_files_by_file_names function in src/backend/crud/file.py to retrieve files by their names.
  • Addition of file_content field in src/backend/database_models/file.py to store file content.
  • Updates to src/backend/model_deployments/base.py, src/backend/model_deployments/azure.py, src/backend/model_deployments/bedrock.py, and src/backend/model_deployments/cohere_platform.py to include chat_history in the invoke_tools method.
  • Updates to src/backend/routers/conversation.py to read file content during file upload.
  • Addition of SYSTEM role in src/backend/schemas/chat.py and src/backend/schemas/cohere_chat.py to identify system messages.
  • Addition of tool_results field in src/backend/schemas/cohere_chat.py to store results from invoking tools.
  • Addition of imports for ReadFileTool and SearchFileTool in src/backend/tools/__init__.py.
  • Update to poetry.lock to include the new dependency pypdf.
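
To make the file lookup described above concrete, the sketch below shows a file-reading tool that resolves a file by name for a given user and returns its stored content. It uses an in-memory list instead of a database session, and `StoredFile`, `ReadFileSketch`, the `filename` parameter, and the result keys are all hypothetical; only the `get_files_by_file_names` name and the `file_content` field come from the description.

```python
from dataclasses import dataclass


@dataclass
class StoredFile:
    # Hypothetical stand-in for the File database model.
    user_id: str
    file_name: str
    file_content: str


def get_files_by_file_names(files, file_names, user_id):
    """In-memory stand-in for a DB query filtered by user and file name."""
    wanted = set(file_names)
    return [f for f in files if f.user_id == user_id and f.file_name in wanted]


class ReadFileSketch:
    """Returns the stored content of one uploaded file as a tool result."""

    def __init__(self, files):
        self.files = files

    def call(self, parameters, user_id):
        # "filename" is an assumed parameter name for this example.
        file_name = parameters.get("filename", "")
        matches = get_files_by_file_names(self.files, [file_name], user_id)
        return [{"title": m.file_name, "text": m.file_content} for m in matches]


store = [StoredFile(user_id="u1", file_name="a.txt", file_content="hello")]
tool = ReadFileSketch(store)
result = tool.call({"filename": "a.txt"}, user_id="u1")
```

Filtering by user id inside the lookup means a tool call cannot read another user's upload just by guessing its name.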

@lusmoura marked this pull request as ready for review on Jun 4, 2024 11:29
@lusmoura changed the title from "[WIP] Tools: unify retrievers/functions and add file tools" to "Tools: unify retrievers/functions and add file tools" on Jun 4, 2024
@codecov-commenter commented Jun 4, 2024

Codecov Report

Attention: Patch coverage is 70.68273% with 73 lines in your changes missing coverage. Please review.

Please upload report for BASE (main@d0459af).

Files                                            Patch %   Lines
src/backend/chat/custom/custom.py                53.03%    31 Missing ⚠️
src/backend/tools/files.py                       48.00%    26 Missing ⚠️
src/backend/chat/collate.py                      84.90%    8 Missing ⚠️
src/backend/model_deployments/azure.py           20.00%    4 Missing ⚠️
src/backend/alembic/versions/f5819b10ef2a_.py    90.90%    1 Missing ⚠️
src/backend/crud/file.py                         50.00%    1 Missing ⚠️
src/backend/model_deployments/bedrock.py         50.00%    1 Missing ⚠️
src/backend/model_deployments/sagemaker.py       50.00%    1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #164   +/-   ##
=======================================
  Coverage        ?   86.76%           
=======================================
  Files           ?      128           
  Lines           ?     4140           
  Branches        ?        0           
=======================================
  Hits            ?     3592           
  Misses          ?      548           
  Partials        ?        0           


@scott-cohere (Contributor) left a comment

lgtm, just a few nits

@lusmoura merged commit 42253c9 into main on Jun 6, 2024
2 checks passed
@lusmoura deleted the luisa/file_tool branch on Jun 6, 2024 17:10