feat: begin work on RAG, refactoring, improved docs and tests #258

ErikBjare · 2024-11-15T12:43:33Z

Progress on #59

Replacement/alternative to #135

Important

Introduces a RAG tool for context-aware assistance with caching and context enhancement, including tests and documentation.

RAG Tool:
- Adds RAG tool in gptme/tools/rag.py for context-aware assistance with index, search, status commands.
- Requires gptme-rag package.
Caching:
- Implements Cache and RAGCache classes in gptme/cache.py for caching embeddings and search results.
- Adds get_cache() function for singleton cache instance.
Context Enhancement:
- Introduces RAGContextProvider in gptme/context.py for enhancing messages with relevant context using cached search results.
Configuration:
- Updates gptme/config.py to include RAG configuration options.
Documentation:
- Updates docs/tools.rst to document the RAG tool.
Testing:
- Adds tests for caching in tests/test_cache.py.
- Adds tests for context enhancement in tests/test_context.py.
- Adds tests for RAG tool functionality in tests/test_tools_rag.py.
Build Process:
- Modifies .github/workflows/build.yml to include poetry install -E server -E browser.

^{This description was created by}^{for 0e1596d. It will automatically update as commits are pushed.}

ellipsis-dev

👍 Looks good to me! Reviewed everything up to 59d0e70 in 54 seconds

More details

Looked at 977 lines of code in 12 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/tools/rag.py:15

Draft comment:
The ImportError handling is incorrect. _HAS_RAG should be set to False in the except block, not True.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is incorrect because the code is correctly setting _HAS_RAG to True when the import is successful and to False when an ImportError occurs. The comment seems to misunderstand the logic of the try-except block.
I might be missing some context about how the import should be handled, but based on the provided code, the logic seems correct. The comment might be based on a misunderstanding of the try-except pattern.
The try-except pattern is a common way to handle optional imports, and the current implementation aligns with this pattern. The comment does not provide strong evidence of an issue.
The comment is incorrect because the code correctly handles the ImportError by setting _HAS_RAG to False in the except block. The comment should be deleted.

2. gptme/tools/rag.py:25

Draft comment:
Ensure _HAS_RAG is True before using indexer to avoid runtime errors when gptme-rag is not installed.
Reason this comment was not posted:
Comment did not seem useful.

3. gptme/tools/rag.py:52

Draft comment:
Check if indexer is initialized before returning the ToolSpec to ensure the tool is only available when properly initialized.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_nIjTgxlmloWKMUAw

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ErikBjare · 2024-11-15T12:46:17Z

tests/test_cache.py

+@pytest.mark.asyncio
+async def test_rag_context_provider_with_cache():


unnecessary async

ellipsis-dev

❌ Changes requested. Incremental review on 445e49a in 39 seconds

More details

Looked at 104 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. gptme/config.py:125

Draft comment:
The change from load_config to _load_config is consistent with the refactoring and aligns with the new naming convention.
Reason this comment was not posted:
Confidence changes required: 20%
The change from load_config to _load_config in set_config_value is consistent with the refactoring done in the PR. This change ensures that the internal function _load_config is used, which aligns with the new naming convention.

2. gptme/init.py:33

Draft comment:
The change from load_config to get_config ensures the singleton pattern is respected.
Reason this comment was not posted:
Confidence changes required: 20%
The change from load_config to get_config in init.py is consistent with the refactoring done in the PR. This change ensures that the singleton pattern is respected by using get_config.

Workflow ID: wflow_ASFJtIEdCL9ofNR3

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-11-15T12:54:03Z

gptme/config.py



+@lru_cache


Consider specifying a maxsize for @lru_cache to prevent unbounded memory usage.

Suggested change

@lru_cache

@lru_cache(maxsize=128)

ellipsis-dev

👍 Looks good to me! Incremental review on 5cde0aa in 14 seconds

More details

Looked at 37 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. tests/test_config.py:6

Draft comment:
Avoid using print statements in tests. They can clutter test output. Use assertions to verify conditions instead.
Reason this comment was not posted:
Confidence changes required: 50%
The use of print statements in tests is not a best practice. It can clutter test output and is not useful for automated testing. Instead, assertions should be used to verify test conditions.

Workflow ID: wflow_Ww3jTwyPfU6D42kA

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

Skipped PR review on b1874f9 because no changed files had a supported extension. If you think this was in error, please contact us and we'll fix it right away.

codecov-commenter · 2024-11-15T14:52:10Z

Codecov Report

Attention: Patch coverage is 85.56701% with 28 lines in your changes missing coverage. Please review.

Project coverage is 73.30%. Comparing base (ffc6d0b) to head (0e1596d).

✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
gptme/tools/_rag_context.py	84.04%	15 Missing ⚠️
gptme/tools/computer.py	36.84%	12 Missing ⚠️
gptme/tools/base.py	87.50%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #258      +/-   ##
==========================================
+ Coverage   72.40%   73.30%   +0.90%     
==========================================
  Files          62       64       +2     
  Lines        4055     4211     +156     
==========================================
+ Hits         2936     3087     +151     
- Misses       1119     1124       +5

Flag	Coverage Δ
anthropic/claude-3-haiku-20240307	`72.40% <85.56%> (+0.93%)`	⬆️
openai/gpt-4o-mini	`72.26% <85.56%> (+0.89%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚨 Try these New Features:

JS Bundle Analysis - Avoid shipping oversized bundles

ellipsis-dev

👍 Looks good to me! Incremental review on f9a4f96 in 20 seconds

More details

Looked at 13 lines of code in 1 files
Skipped 1 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. pyproject.toml:40

Draft comment:
The version update for gptme-rag from ^0.1.0 to ^0.1.4 is correct and aligns with the PR description. No issues found here.
Reason this comment was not posted:
Confidence changes required: 0%
The version update for gptme-rag in pyproject.toml is correct and aligns with the PR description. No issues found here.

Workflow ID: wflow_BfhLBsemjc5MhuY0

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

❌ Changes requested. Incremental review on b927965 in 34 seconds

More details

Looked at 124 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/rag.py:99

Draft comment:
The init function is missing the instructions, examples, block_types, and execute fields in the returned ToolSpec. Ensure these fields are included for complete tool initialization.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_FjAMAm4OlNLOZtfa

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

gptme/tools/rag.py

ellipsis-dev

👍 Looks good to me! Incremental review on c6e5530 in 16 seconds

More details

Looked at 131 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/tools/rag.py:13

Draft comment:
Import statements should be at the top of the file, following PEP 8 guidelines. Consider moving the gptme_rag import to the top, outside of the try-except block.
Reason this comment was not posted:
Confidence changes required: 50%
The import statement for gptme_rag should be at the top of the file, following PEP 8 guidelines for import order.

2. gptme/tools/rag.py:20

Draft comment:
Type hint for indexer should not be a string. Use gptme_rag.Indexer | None instead of "gptme_rag.Indexer | None".
Reason this comment was not posted:
Confidence changes required: 50%
The type hint for indexer should not be a string. It should be a direct type hint.

3. gptme/tools/rag.py:81

Draft comment:
Consider handling the case where config is None to avoid potential issues. Currently, if config is None, the function will return tool without initializing indexer.
Reason this comment was not posted:
Confidence changes required: 50%
The init function should handle the case where config is None to avoid potential issues.

Workflow ID: wflow_W8HJV0gblxp2QVDk

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 622a451 in 19 seconds

More details

Looked at 137 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/tools/rag.py:85

Draft comment:
The import statement for gptme_rag is repeated here. Since it's already imported at the top, this is redundant. Consider removing this import statement.
Reason this comment was not posted:
Confidence changes required: 50%
The import statement for gptme_rag is repeated unnecessarily.

2. gptme/tools/rag.py:20

Draft comment:
The indexer variable is used globally, which can lead to issues in a multi-threaded environment. Consider using a more thread-safe approach, such as encapsulating it within a class or using thread-local storage.
Reason this comment was not posted:
Confidence changes required: 50%
The indexer variable is used globally but is not thread-safe.

3. gptme/tools/rag.py:81

Draft comment:
The init function does not handle the case where config is None. Consider adding a check to handle this scenario to avoid potential errors.
Reason this comment was not posted:
Confidence changes required: 50%
The init function should handle the case where config is None.

Workflow ID: wflow_qijydExN2iKC0R8a

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 07b55d4 in 11 seconds

More details

Looked at 14 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_bzDE3gbCWiJN7DJV

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 1494f33 in 19 seconds

More details

Looked at 109 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. scripts/Dockerfile:23

Draft comment:
Ensure that server dependencies are installed in the builder stage to avoid runtime errors. Consider adding RUN /opt/poetry/bin/poetry install -E server --without dev before building the wheel.
Reason this comment was not posted:
Comment did not seem useful.

2. scripts/Dockerfile.server:7

Draft comment:
Copying requirements-server.txt from the base image is unnecessary if the base image already includes server dependencies. Consider removing this step to simplify the Dockerfile.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is speculative because it assumes the base image might already have the server dependencies, which is not evident from the diff. The comment does not provide strong evidence that the change is unnecessary. Without clear evidence, the comment should be deleted as it does not meet the criteria for a useful comment.
I might be missing some context about the base image, but the comment does not provide any evidence or reasoning to support its claim. It is speculative and not actionable.
Even if there is missing context, the comment should provide evidence or reasoning to be considered useful. Without it, the comment remains speculative.
Delete the comment as it is speculative and does not provide strong evidence that the change is unnecessary.

Workflow ID: wflow_sXVvSe3L6zATbm0D

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on b1d5408 in 20 seconds

More details

Looked at 14 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. tests/test_tools_rag.py:10

Draft comment:
The import statement for init_rag was changed to init, but the function is still referred to as init_rag in the test cases. Consider renaming the function in the test cases to init for consistency.
Reason this comment was not posted:
Comment looked like it was already resolved.

Workflow ID: wflow_JiRzlGOwC5U1wuuc

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on f2bd6a4 in 19 seconds

More details

Looked at 218 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. gptme/tools/rag.py:89

Draft comment:
The examples string uses f-string formatting without any variables. Consider using a regular string instead.
Reason this comment was not posted:
Confidence changes required: 50%
The examples string in rag.py uses f-string formatting but does not include any variables to format. This is unnecessary and could be simplified.

2. gptme/tools/rag.py:166

Draft comment:
Ensure tool.get_doc(__doc__) correctly updates the module docstring as intended.
Reason this comment was not posted:
Confidence changes required: 30%
The __doc__ assignment at the end of the file is using tool.get_doc(__doc__), which seems to be a method to update the docstring. Ensure this is intended and correctly updates the docstring.

Workflow ID: wflow_U0AJuZQ840iSc2P8

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 31cfbc2 in 24 seconds

More details

Looked at 137 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/computer.py:2

Draft comment:
The documentation for the computer tool was removed from docs/tools.rst, but similar content is still present here. Consider removing or updating this section for consistency.
Reason this comment was not posted:
Confidence changes required: 50%
The PR description mentions the removal of documentation for the computer tool in docs/tools.rst, but the content is still present in gptme/tools/computer.py. This is inconsistent and could lead to confusion.

Workflow ID: wflow_SYCHkZjCfRz051G2

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

❌ Changes requested. Incremental review on 075c765 in 25 seconds

More details

Looked at 246 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. gptme/tools/computer.py:131

Draft comment:
Consider adding a check for out-of-bounds coordinates when the source is _ScalingSource.COMPUTER to ensure robustness.
Reason this comment was not posted:
Confidence changes required: 50%
The _scale_coordinates function in gptme/tools/computer.py checks if the coordinates are out of bounds only when the source is _ScalingSource.API. However, it should also check for out-of-bounds coordinates when the source is _ScalingSource.COMPUTER to ensure robustness.

Workflow ID: wflow_vhzgCY0Owfu7Z0mb

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-11-17T16:45:03Z

gptme/tools/computer.py

@@ -167,16 +217,16 @@ def computer(

        if shutil.which("gnome-screenshot"):
            # FIXME: incorrect call to xdotool


Address the FIXME comment regarding the incorrect call to xdotool for taking screenshots to ensure functionality works as intended.

ellipsis-dev

👍 Looks good to me! Incremental review on aa57b67 in 15 seconds

More details

Looked at 13 lines of code in 1 files
Skipped 1 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. pyproject.toml:39

Draft comment:
Ensure that the updated gptme-rag version ^0.1.5 is compatible with the rest of the codebase and does not introduce any breaking changes.
Reason this comment was not posted:
Confidence changes required: 50%
The version update for gptme-rag in pyproject.toml is correct, but it's important to ensure that the new version is compatible with the rest of the codebase.

Workflow ID: wflow_UTbZ3tupOiPJksUU

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 2af98f1 in 26 seconds

More details

Looked at 102 lines of code in 3 files
Skipped 0 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/util/__init__.py:374

Draft comment:
The get_project_dir function is duplicated. Consider removing one instance to avoid redundancy.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment claims duplication, but the provided file content does not show any duplicate get_project_dir function. Without evidence of duplication in the diff or the file content, the comment seems incorrect. The comment does not meet the criteria for a valid review comment as it lacks strong evidence.
I might be missing context from other files or parts of the codebase that are not visible here. However, the task is to focus on the provided diff and file content, and based on that, the comment seems incorrect.
The task is to review the comment based on the provided diff and file content. Since there is no evidence of duplication in the visible content, the comment should be considered incorrect.
The comment should be deleted as there is no evidence of the get_project_dir function being duplicated in the provided file content.

2. gptme/prompts.py:19

Draft comment:
The get_project_dir function is duplicated. Consider removing one instance to avoid redundancy.
Reason this comment was not posted:
Marked as duplicate.

3. gptme/tools/rag.py:66

Draft comment:
The get_project_dir function is duplicated. Consider removing one instance to avoid redundancy.
Reason this comment was not posted:
Marked as duplicate.

Workflow ID: wflow_Vcrs1pVFASX3vtl7

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

tests/test_tools_rag.py

ErikBjare · 2024-11-17T19:35:58Z

Alright, finally going to merge this.

Not super happy with how complicated it got, but I'll save the refactor for a future PR to avoid this one getting stuck with a bunch of good refactors and fixes.

ellipsis-dev

👍 Looks good to me! Incremental review on 5b41de3 in 22 seconds

More details

Looked at 375 lines of code in 5 files
Skipped 0 files when reviewing.
Skipped posting 5 drafted comments based on config settings.

1. gptme/tools/rag.py:110

Draft comment:
The rag_index function returns the number of paths indexed, not the actual number of documents. Consider updating the return message to reflect the number of documents indexed for clarity.
Reason this comment was not posted:
Confidence changes required: 50%
The rag_index function returns a message indicating the number of paths indexed, but it doesn't account for the actual number of documents indexed. This could be misleading if a path contains multiple documents. The message should reflect the number of documents indexed instead.

2. gptme/tools/rag.py:118

Draft comment:
The rag_search function truncates document content to 200 characters. Consider making this limit configurable to allow flexibility based on user needs.
Reason this comment was not posted:
Confidence changes required: 33%
The rag_search function truncates document content to 200 characters, which might not be sufficient for some use cases. It would be better to make this limit configurable.

3. gptme/tools/rag.py:125

Draft comment:
Ensure that indexer.collection.count() accurately reflects the number of documents indexed, as it might not account for all edge cases.
Reason this comment was not posted:
Confidence changes required: 33%
The rag_status function uses indexer.collection.count() to get the number of documents. Ensure that this method accurately reflects the number of documents indexed, as it might not account for all edge cases.

4. gptme/tools/rag.py:135

Draft comment:
The default collection name is set to "default". Consider using a more descriptive default name or making it configurable to avoid potential conflicts or confusion.
Reason this comment was not posted:
Confidence changes required: 33%
The init function in gptme/tools/rag.py initializes the indexer with a default collection name of "default". It might be better to use a more descriptive default name or ensure it's configurable.

5. tests/test_tools_rag.py:65

Draft comment:
The FIXME comment indicates a performance issue when indexing the default path. Consider addressing this to ensure the test runs efficiently and doesn't slow down the test suite.
Reason this comment was not posted:
Confidence changes required: 50%
The test_rag_index_function test case has a FIXME comment indicating a performance issue when indexing the default path. This should be addressed to ensure the test runs efficiently.

Workflow ID: wflow_pNnkmPgMsnbDPZeI

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 3346767 in 25 seconds

More details

Looked at 375 lines of code in 5 files
Skipped 0 files when reviewing.
Skipped posting 4 drafted comments based on config settings.

1. gptme/tools/rag.py:110

Draft comment:
The rag_index function returns "Indexed 1 paths" even when multiple paths are indexed. Consider updating the return message to reflect the actual number of paths indexed.

    return f"Indexed {len(paths)} path{'s' if len(paths) > 1 else ''}"

Reason this comment was not posted:
Confidence changes required: 50%
The rag_index function returns a message indicating the number of paths indexed, but it doesn't account for the possibility of indexing multiple paths. The message should reflect the actual number of paths indexed.

2. gptme/tools/rag.py:118

Draft comment:
The rag_search function truncates document content to 200 characters. Consider making this limit configurable to allow flexibility in the amount of content returned.

        f"### {doc.metadata['source']}\n{doc.content[:200]}..." for doc in docs

Reason this comment was not posted:
Confidence changes required: 50%
The rag_search function truncates document content to 200 characters, which might not be sufficient for some use cases. It would be better to make this limit configurable.

3. gptme/tools/rag.py:131

Draft comment:
The init function returns tool even when _HAS_RAG is False. Consider returning None or raising an exception to avoid unexpected behavior when RAG is not available.

        return None

Reason this comment was not posted:
Confidence changes required: 50%
The init function in gptme/tools/rag.py returns tool even when _HAS_RAG is False, which might lead to unexpected behavior. It should return None or raise an exception when RAG is not available.

4. tests/test_tools_rag.py:65

Draft comment:
The test_rag_index_function test case has a FIXME comment about performance issues when indexing the default path. Consider optimizing the test setup or environment to avoid slow execution.

        # Consider optimizing the test environment to avoid slow execution

Reason this comment was not posted:
Confidence changes required: 50%
The test_rag_index_function test case has a FIXME comment indicating a performance issue when indexing the default path. This should be addressed to ensure the test runs efficiently.

Workflow ID: wflow_9dsJUOWSBxKcTAIz

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

gptme/tools/rag.py

…fixed tests

ellipsis-dev

👍 Looks good to me! Incremental review on 26aeced in 30 seconds

More details

Looked at 1221 lines of code in 16 files
Skipped 1 files when reviewing.
Skipped posting 5 drafted comments based on config settings.

1. gptme/tools/rag.py:104

Draft comment:
Consider separating the number of paths and documents in the return string for clarity. For example, Indexed {len(paths)} paths with {total_docs} documents.
Reason this comment was not posted:
Confidence changes required: 50%
The rag_index function in gptme/tools/rag.py returns a string indicating the number of paths indexed, but it also includes the number of documents indexed. This could be misleading if the number of paths and documents differ significantly. It would be clearer to separate these two pieces of information.

2. gptme/tools/rag.py:112

Draft comment:
Consider slicing the document content at a word boundary or providing a more contextually complete snippet to avoid cutting off important information mid-sentence.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/rag.py, the rag_search function slices the document content to 200 characters. This could potentially cut off important information mid-sentence. It might be better to slice at a word boundary or provide a more contextually complete snippet.

3. gptme/tools/rag.py:135

Draft comment:
Consider adding validation for index_path and collection to ensure they are valid and accessible before initializing RAGManager.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/rag.py, the init function initializes the RAGManager without checking if the index_path and collection are valid or accessible. This could lead to runtime errors if the paths are incorrect or inaccessible.

4. gptme/tools/_rag_context.py:26

Draft comment:
Consider implementing a size limit for _search_cache or using a more sophisticated caching mechanism to prevent high memory usage.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/_rag_context.py, the _search_cache is a simple in-memory cache. This could lead to high memory usage if the cache grows large. Consider implementing a size limit or using a more sophisticated caching mechanism.

5. gptme/tools/_rag_context.py:77

Draft comment:
Consider adding exception handling around the initialization of Indexer and ContextAssembler to handle potential issues with configuration or dependencies.
Reason this comment was not posted:
Confidence changes required: 50%
In gptme/tools/_rag_context.py, the RAGManager class initializes the Indexer and ContextAssembler without handling potential exceptions that could arise during their initialization. This could lead to unhandled exceptions if there are issues with the configuration or dependencies.

Workflow ID: wflow_bbmVLNV5C5ez5Cpk

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 2e4cff8 in 46 seconds

More details

Looked at 223 lines of code in 3 files
Skipped 1 files when reviewing.
Skipped posting 3 drafted comments based on config settings.

1. gptme/tools/_rag_context.py:89

Draft comment:
Ensure max_tokens for ContextAssembler is configurable through the RAGManager constructor or configuration file to allow flexibility for different use cases.
Reason this comment was not posted:
Confidence changes required: 50%
The RAGManager class in gptme/tools/_rag_context.py initializes the indexer and context_assembler with configurations. However, the max_tokens parameter for context_assembler is set to a default of 2000, which might not be suitable for all use cases. It would be better to ensure this value is configurable through the RAGManager constructor or configuration file.

2. pyproject.toml:42

Draft comment:
Ensure that the optional gptme-rag dependency is handled properly in the code to avoid runtime errors when it is not installed.
Reason this comment was not posted:
Confidence changes required: 50%
In pyproject.toml, the gptme-rag dependency is marked as optional. However, the code in gptme/tools/_rag_context.py raises an ImportError if gptme-rag is not installed. This could lead to runtime errors if the optional dependency is not handled properly. Consider adding a check or fallback mechanism to handle cases where gptme-rag is not installed.

3. tests/test_tools_rag.py:28

Draft comment:
Ensure that shutil.rmtree in fixtures only deletes temporary directories to prevent accidental data loss.
Reason this comment was not posted:
Confidence changes required: 50%
In tests/test_tools_rag.py, the index_path and temp_docs fixtures use shutil.rmtree to clean up temporary directories. This is a good practice to ensure test isolation and prevent side effects. However, it's important to ensure that these directories are not accidentally deleted if they contain important data. Consider adding a check to ensure the path is indeed a temporary path before deletion.

Workflow ID: wflow_btH4zyxCx4Yqp15Y

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

❌ Changes requested. Incremental review on d403e54 in 43 seconds

More details

Looked at 224 lines of code in 3 files
Skipped 1 files when reviewing.
Skipped posting 2 drafted comments based on config settings.

1. tests/test_tools_rag.py:43

Draft comment:
shutil.rmtree is unnecessary here as tmp_path is automatically cleaned up by pytest.
Reason this comment was not posted:
Marked as duplicate.

2. tests/test_tools_rag.py:21

Draft comment:
Consider moving pytest.importorskip("gptme_rag") to the top of the file or inside test functions to ensure it executes before any test logic.
Reason this comment was not posted:
Confidence changes required: 50%
The pytest.importorskip function is used to skip tests if the gptme_rag module is not available. However, this should be placed inside the test functions or at the top of the test file to ensure it is executed before any test logic, ensuring the tests are skipped appropriately.

Workflow ID: wflow_H373CnmkkDw5PcXk

Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev · 2024-11-19T10:49:03Z

tests/test_tools_rag.py

+def index_path(tmp_path):
+    """Create a temporary index path."""
+    yield tmp_path
+    shutil.rmtree(tmp_path, ignore_errors=True)


shutil.rmtree is unnecessary here as tmp_path is automatically cleaned up by pytest.

Suggested change

shutil.rmtree(tmp_path, ignore_errors=True)

ellipsis-dev

👍 Looks good to me! Incremental review on 75787da in 28 seconds

More details

Looked at 13 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. .github/workflows/build.yml:185

Draft comment:
Ensure that the extras specified in the matrix (currently '-E all') align with the changes made in the PR (i.e., '-E server -E browser'). This ensures consistency across build configurations.
Reason this comment was not posted:
Comment did not seem useful.

Workflow ID: wflow_5QHI1wvbsKzObX9o

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on 57e49e7 in 16 seconds

More details

Looked at 45 lines of code in 2 files
Skipped 1 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. tests/test_tools_rag.py:28

Draft comment:
shutil.rmtree is unnecessary here as tmp_path is automatically cleaned up by pytest.
Reason this comment was not posted:
Confidence changes required: 10%
The use of shutil.rmtree in the fixture is unnecessary since tmp_path is automatically cleaned up by pytest. This is a minor issue but worth noting for code cleanliness.

Workflow ID: wflow_SJ9OL3vjBqHLZHOC

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev

👍 Looks good to me! Incremental review on ea0bc6d in 20 seconds

More details

Looked at 45 lines of code in 2 files
Skipped 1 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. tests/test_tools_rag.py:28

Draft comment:
The use of shutil.rmtree is unnecessary here as tmp_path is automatically cleaned up by pytest.
Reason this comment was not posted:
Confidence changes required: 50%
The use of shutil.rmtree in the fixture is unnecessary since tmp_path is automatically cleaned up by pytest.

Workflow ID: wflow_8Ms1yiXzlifXTPrD

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ErikBjare · 2024-11-19T11:13:54Z

Not super happy with this, but too many misc improvements not to merge, and the CI should finally pass now.

ellipsis-dev

👍 Looks good to me! Incremental review on 0e1596d in 25 seconds

More details

Looked at 40 lines of code in 1 files
Skipped 0 files when reviewing.
Skipped posting 1 drafted comments based on config settings.

1. tests/test_cli.py:287

Draft comment:
The order of assertions in test_chain seems incorrect. The assertion assert user1_loc < user2_loc should be assert user1_loc < save_loc to ensure the save operation occurs after the first user input. Similarly, assert patch_loc < user2_loc should be assert save_loc < user2_loc to ensure the user input follows the save operation. Please review the sequence of operations and adjust the assertions accordingly.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment suggests changes that are already implemented in the code. The assertions in the code already ensure the correct order of operations as described by the comment. Therefore, the comment is not useful as it does not suggest any new or necessary changes.
I might be misunderstanding the comment's intention or the current code logic. It's possible that the comment is pointing out a subtle issue that I missed.
The assertions in the code already ensure the correct sequence of operations, as described by the comment. The comment does not provide any new insights or necessary changes.
The comment should be deleted because it suggests changes that are already implemented in the code, making it redundant and not useful.

Workflow ID: wflow_wuOosAP1fExMWoqI

You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

ellipsis-dev bot reviewed Nov 15, 2024

View reviewed changes

ErikBjare commented Nov 15, 2024

View reviewed changes

ellipsis-dev bot reviewed Nov 15, 2024

View reviewed changes

ErikBjare force-pushed the dev/implement-rag branch from 445e49a to 5cde0aa Compare November 15, 2024 14:47

ellipsis-dev bot reviewed Nov 15, 2024

View reviewed changes

gptme/tools/rag.py Show resolved Hide resolved

ErikBjare force-pushed the dev/implement-rag branch from b927965 to c6e5530 Compare November 15, 2024 15:34

ellipsis-dev bot reviewed Nov 15, 2024

View reviewed changes

ErikBjare force-pushed the dev/implement-rag branch from c6e5530 to 622a451 Compare November 17, 2024 15:50

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

ErikBjare force-pushed the dev/implement-rag branch from 1494f33 to b1d5408 Compare November 17, 2024 16:25

ErikBjare added 7 commits November 17, 2024 17:25

feat: started working on RAG (again)

92b175a

fix: fixed bugs in rag

9961166

fix: fixed tests after refactor

e485c5a

build(deps): updated dependencies

6cf10be

build(deps): updated gptme-rag

82e4ce1

fix: made rag support optional

43452b4

test: fixed tests

b1d5408

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

docs: fixed docs

f2bd6a4

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

docs: fixed docs for computer tool

31cfbc2

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

docs: fixed docs, made some funcs private to hide them from autodocs

075c765

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

fix: add get_project_dir helper function

2af98f1

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

ErikBjare commented Nov 17, 2024

View reviewed changes

tests/test_tools_rag.py Show resolved Hide resolved

ErikBjare force-pushed the dev/implement-rag branch from df2374b to 2af98f1 Compare November 17, 2024 19:33

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

ErikBjare force-pushed the dev/implement-rag branch from 5b41de3 to 3346767 Compare November 17, 2024 20:24

fix: changed rag tool to use functions instead of execute

3346767

ellipsis-dev bot reviewed Nov 17, 2024

View reviewed changes

ErikBjare commented Nov 17, 2024

View reviewed changes

gptme/tools/rag.py Outdated Show resolved Hide resolved

ErikBjare added 2 commits November 17, 2024 21:26

Apply suggestions from code review

5e053f0

fix: more refactoring of rag, made typechecking of imports stricter, …

26aeced

…fixed tests

ellipsis-dev bot reviewed Nov 18, 2024

View reviewed changes

ellipsis-dev bot reviewed Nov 19, 2024

View reviewed changes

fix: fixed rag tool tests

d403e54

ErikBjare force-pushed the dev/implement-rag branch from 2e4cff8 to d403e54 Compare November 19, 2024 10:48

ellipsis-dev bot reviewed Nov 19, 2024

View reviewed changes

ci: fixed typechecking

75787da

ellipsis-dev bot reviewed Nov 19, 2024

View reviewed changes

fix: hopefully finally fixed rag tests now

ea0bc6d

ErikBjare force-pushed the dev/implement-rag branch from 57e49e7 to ea0bc6d Compare November 19, 2024 11:10

ellipsis-dev bot reviewed Nov 19, 2024

View reviewed changes

test: made test_chain more reliable for gpt-4o-mini

0e1596d

ellipsis-dev bot reviewed Nov 19, 2024

View reviewed changes

ErikBjare changed the title ~~feat: started working on RAG (again)~~ feat: begin work on RAG, refactoring, improved docs and tests Nov 19, 2024

ErikBjare merged commit 950c397 into master Nov 19, 2024
7 checks passed

ErikBjare deleted the dev/implement-rag branch November 19, 2024 11:30

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: begin work on RAG, refactoring, improved docs and tests #258

feat: begin work on RAG, refactoring, improved docs and tests #258

ErikBjare commented Nov 15, 2024 •

edited by ellipsis-dev bot

Loading

ellipsis-dev bot left a comment

ErikBjare Nov 15, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot Nov 15, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

codecov-commenter commented Nov 15, 2024 •

edited

Loading

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot Nov 17, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ErikBjare commented Nov 17, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot Nov 19, 2024

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ellipsis-dev bot left a comment

ErikBjare commented Nov 19, 2024

ellipsis-dev bot left a comment

		@pytest.mark.asyncio
		async def test_rag_context_provider_with_cache():

		@@ -167,16 +217,16 @@ def computer(

		if shutil.which("gnome-screenshot"):
		# FIXME: incorrect call to xdotool

feat: begin work on RAG, refactoring, improved docs and tests #258

feat: begin work on RAG, refactoring, improved docs and tests #258

Conversation

ErikBjare commented Nov 15, 2024 • edited by ellipsis-dev bot Loading

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ErikBjare Nov 15, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Nov 15, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

codecov-commenter commented Nov 15, 2024 • edited Loading

Codecov Report

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Nov 17, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ErikBjare commented Nov 17, 2024

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot Nov 19, 2024

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ErikBjare commented Nov 19, 2024

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

ErikBjare commented Nov 15, 2024 •

edited by ellipsis-dev bot

Loading

codecov-commenter commented Nov 15, 2024 •

edited

Loading