fix: tokenizer bug fix in codeanalysis #988

Open
wants to merge 2 commits into base: master

shreysingla11 (Collaborator) commented Dec 11, 2024

Important

Update transformers version and remove unused dependencies to fix tokenizer bug in code analysis.

  • Dependencies:
    • Update transformers version to >=4.46.3,<4.47 in setup.py to fix tokenizer bug.
    • Remove tree_sitter, sentence_transformers, and tree_sitter_languages from tox.ini dependencies.
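
As a quick sanity check on what the new pin admits, the range can be sketched with a small stdlib-only helper. The names `parse_release` and `satisfies_pin` are hypothetical, and the comparison is a simplification of PEP 440 that ignores pre-release suffixes; it is not how pip resolves versions.

```python
import re


def parse_release(text):
    """Turn a release string such as '4.46.3' into the tuple (4, 46, 3)."""
    # Only the leading dotted-integer part is used; suffixes like 'rc1' are ignored.
    match = re.match(r"\d+(?:\.\d+)*", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def satisfies_pin(version, lower="4.46.3", upper="4.47"):
    """Return True when lower <= version < upper (the range from setup.py)."""
    parts = [parse_release(v) for v in (version, lower, upper)]
    # Pad with zeros so that '4.47' and '4.47.0' compare as equal.
    width = max(len(p) for p in parts)
    candidate, low, high = (p + (0,) * (width - len(p)) for p in parts)
    return low <= candidate < high


if __name__ == "__main__":
    for candidate in ("4.46.2", "4.46.3", "4.47.0"):
        print(candidate, satisfies_pin(candidate))
```

Under these assumptions the range only admits 4.46.x patch releases from 4.46.3 onward.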

This description was created by Ellipsis for 7011538. It will automatically update as commits are pushed.

vercel bot commented Dec 11, 2024

The latest updates on your projects.

Name | Status | Preview | Comments | Updated (UTC)
composio | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Dec 11, 2024 1:38pm

ellipsis-dev bot (Contributor) left a comment

👍 Looks good to me! Reviewed everything up to b35bf8a in 21 seconds

More details
  • Looked at 35 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comment based on config settings.
1. python/tox.ini:98
  • Draft comment:
    The version constraint for 'transformers' is updated in setup.py but not reflected here. Ensure consistency across files.
  • Reason this comment was not posted:
    Comment did not seem useful.
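
One way to check the consistency concern raised in the draft comment is to compare version specifiers between the two files. The sketch below is only illustrative: the helper names are hypothetical, the `[testenv]` section name is an assumption about the tox.ini layout, and the parsing handles plain `name<specifier>` lines only.

```python
import configparser
import re


def requirement_map(lines):
    """Map each package name to its version specifier ('' when unpinned)."""
    result = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"([A-Za-z0-9_.\-]+)\s*(.*)", line)
        if match:
            # Normalize names so 'typing_extensions' and 'typing-extensions' agree.
            name = match.group(1).lower().replace("_", "-")
            result[name] = match.group(2).replace(" ", "")
    return result


def tox_deps(tox_text, section="testenv"):
    """Read the multi-line 'deps' value of one tox.ini section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(tox_text)
    return parser.get(section, "deps", fallback="").splitlines()


def mismatched_pins(setup_requires, tox_text, section="testenv"):
    """Return packages listed in both places whose specifiers differ."""
    setup_map = requirement_map(setup_requires)
    tox_map = requirement_map(tox_deps(tox_text, section))
    return {
        name: (setup_map[name], tox_map[name])
        for name in setup_map.keys() & tox_map.keys()
        if setup_map[name] != tox_map[name]
    }


if __name__ == "__main__":
    # Hypothetical inputs; the real tox.ini contents are not shown in this PR page.
    example_tox = "[testenv]\ndeps =\n    codecov==2.1.13\n    transformers\n"
    print(mismatched_pins(["transformers>=4.46.3,<4.47", "ruff"], example_tox))
```

A package that appears in only one of the two files is not reported, since that may be intentional.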

Workflow ID: wflow_DCUeUkVPmvH4fOoV



shreysingla11 changed the title from "tokenizer bug fix in codeanalysis" to "fix: tokenizer bug fix in codeanalysis" on Dec 11, 2024
@@ -91,16 +91,11 @@ deps =
codecov==2.1.13
pytest-codecov==0.5.1
typing_extensions>=4.10.0
Collaborator Author

The removal of these dependencies (tree_sitter, deeplake, jedi, sentence_transformers, tree_sitter_languages) from tox.ini is concerning as they are listed as required dependencies in the CodeAnalysisTool class. Please ensure these dependencies are properly managed elsewhere or document why they are being removed.
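
A minimal sketch of how one might confirm that the packages named above are still available in a test environment after the tox.ini change. The helper name `missing_modules` is hypothetical, and the sketch assumes each package's import name matches the name listed in the comment.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names that cannot be located in the current environment."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialized modules.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Names taken from the review comment; import names are assumed to match.
    required = [
        "tree_sitter",
        "deeplake",
        "jedi",
        "sentence_transformers",
        "tree_sitter_languages",
    ]
    print("missing:", missing_modules(required))
```

Running this inside the tox environment would show whether the dependencies still arrive through another route, such as the package's own install requirements.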

@@ -77,7 +77,7 @@ def scan_for_package_data(
"networkx",
"ruff",
"flake8",
-"transformers",
+"transformers>=4.46.3,<4.47",
Collaborator Author

Pinning the transformers version is good practice. Consider adding a comment explaining why this specific range (>=4.46.3,<4.47) is required, especially since it must stay compatible with the tokenizers>=0.20,<0.21 constraint used in CodeAnalysisTool.
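
The suggestion could look roughly like the excerpt below. This is a hypothetical fragment, not the actual contents of setup.py: the list name `CODE_ANALYSIS_REQUIRES` is invented for illustration, and the comment wording only restates what the PR description and this review say.

```python
# Hypothetical excerpt of a dependency list with the rationale recorded inline.
CODE_ANALYSIS_REQUIRES = [
    "networkx",
    "ruff",
    "flake8",
    # Restricted to 4.46.x (from 4.46.3) to fix the tokenizer bug in code analysis.
    # The range is meant to stay compatible with the tokenizers>=0.20,<0.21
    # constraint that the review says CodeAnalysisTool uses.
    "transformers>=4.46.3,<4.47",
]
```

Keeping the reason next to the pin makes it easier to decide later whether the upper bound can be relaxed.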

shreysingla11 (Collaborator, Author)

Code Review Summary

Changes Overview

  • Pinned transformers version to >=4.46.3,<4.47
  • Removed several dependencies from tox.ini

Concerns

  1. Dependency Management: The removal from tox.ini of core dependencies that CodeAnalysisTool requires needs either a justification or an alternative management strategy
  2. Version Compatibility: Pinning the transformers version is good, but the PR should document why this specific version range was chosen

Suggestions

  1. Document the rationale for dependency changes
  2. Add comments explaining version constraints
  3. Verify that all CodeAnalysisTool functionality works without the removed tox.ini dependencies

Rating: ⚠️ Needs Attention

The changes seem to address a specific issue but could potentially create problems if the removed dependencies aren't properly managed elsewhere. Please address the comments before merging.

ellipsis-dev bot (Contributor) left a comment

👍 Looks good to me! Incremental review on 7011538 in 8 seconds

More details
  • Looked at 13 lines of code in 1 file
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comment based on config settings.
1. python/tox.ini:121
  • Draft comment:
    The removal of composio/ from the pytest command is not mentioned in the PR description. Ensure this change is intentional, as it may lead to untested code.
  • Reason this comment was not posted:
    Comment did not seem useful.

Workflow ID: wflow_llP2eLLJP5X6ZCNM


