@Ol1ver0413 (Collaborator) commented Dec 7, 2025

Description

This PR adds a metadata-based toolkit selection mechanism inspired by Anthropic’s Skills.
Instead of loading every full tool schema up front, the agent first receives lightweight toolkit metadata, selects the toolkits it needs, and then dynamically loads only the tool definitions for those toolkits. #3445
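
A minimal sketch of the two-stage flow described above may help reviewers picture it. Everything here is an assumption for illustration: `ToolkitMeta`, `stage1_prompt`, `load_selected`, and the sample registry are hypothetical names, not the code in this PR, and the model's stage 1 answer is replaced by a hard-coded list.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolkitMeta:
    """Lightweight toolkit description shown to the model in stage 1 (hypothetical)."""
    name: str
    description: str
    # Deferred loader: the full tool schemas are only built when this is called.
    load_schemas: Callable[[], List[dict]] = field(repr=False)


def stage1_prompt(task: str, registry: Dict[str, ToolkitMeta]) -> str:
    """Build the stage 1 input: the task plus one metadata line per toolkit."""
    lines = [f"- {m.name}: {m.description}" for m in registry.values()]
    return f"Task: {task}\nAvailable toolkits:\n" + "\n".join(lines)


def load_selected(selected: List[str], registry: Dict[str, ToolkitMeta]) -> List[dict]:
    """Stage 2: load full schemas only for the toolkits that were selected."""
    schemas: List[dict] = []
    for name in selected:
        if name in registry:  # skip names that are not registered
            schemas.extend(registry[name].load_schemas())
    return schemas


# Hypothetical registry with two toolkits; schemas are reduced to tool names.
registry = {
    "arxiv": ToolkitMeta(
        "arxiv",
        "Search and download papers",
        lambda: [{"name": "search_papers"}, {"name": "download_papers"}],
    ),
    "math": ToolkitMeta("math", "Basic arithmetic", lambda: [{"name": "add"}]),
}

prompt = stage1_prompt("Briefly introduce a paper", registry)
selected = ["arxiv"]  # stand-in for the model's stage 1 selection
tools = load_selected(selected, registry)
```

In this sketch the stage 1 prompt carries only names and one-line descriptions, and the `math` toolkit's schemas are never built because it was not selected.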

Benefits

  • Significantly reduces token usage
  • More accurate tool selection
  • Works well for both small and large toolkits
  • Simple and extensible architecture

Token Calculation

Total Tokens = (Stage 1 Input + Stage 1 Output) + (Stage 2 Input + Stage 2 Output)

Where:

  • Stage 1 input consists of the system message, the user message, and the toolkit metadata
  • Stage 2 loads the full schema only for the selected toolkits
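
The formula above can be written out as a short calculation. This is only a sketch: `StageUsage` and `total_tokens` are hypothetical helpers, and the token counts are placeholder values chosen for the arithmetic, not measurements from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageUsage:
    """Token usage for one stage (hypothetical helper)."""
    input_tokens: int
    output_tokens: int

    @property
    def total(self) -> int:
        return self.input_tokens + self.output_tokens


def total_tokens(stage1: StageUsage, stage2: StageUsage) -> int:
    """Total = (Stage 1 Input + Stage 1 Output) + (Stage 2 Input + Stage 2 Output)."""
    return stage1.total + stage2.total


# Placeholder numbers for illustration only.
stage1 = StageUsage(input_tokens=400, output_tokens=20)   # system + user + metadata
stage2 = StageUsage(input_tokens=900, output_tokens=150)  # selected schemas + answer
total = total_tokens(stage1, stage2)  # (400 + 20) + (900 + 150) = 1470
```

A fair comparison against the fully preloaded baseline would count that baseline's single-stage input and output in the same way.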

Initial testing shows clear token savings and improved execution precision.

Checklist

Go over all the following points, and put an x in all the boxes that apply.

  • I have read the CONTRIBUTION guide (required)
  • I have linked this PR to an issue using the Development section on the right sidebar or by adding Fixes #issue-number in the PR description (required)
  • I have checked if any dependencies need to be added or updated in pyproject.toml and uv lock
  • I have updated the tests accordingly (required for a bug fix or a new feature)
  • I have updated the documentation if needed:
  • I have added examples if this is a new feature

If you are unsure about any of these, don't hesitate to ask. We are here to help!

coderabbitai bot (Contributor) commented Dec 7, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

github-actions bot added the Review Required label (PR needs to be reviewed) on Dec 7, 2025
@Ol1ver0413 (Collaborator, Author) commented Dec 7, 2025

Hey @fengju0213! I built a demo to evaluate token consumption when applying the skill-based concept from Anthropic’s Skills repository.
I observed notable token savings and improved toolkit-selection accuracy. However, agents with all toolkits preloaded tend to produce shorter, lower-token answers and, as a result, indirectly favor toolkits that themselves emit fewer tokens. For example, in this demo (where the task is only a brief introduction), the skill-based agent sometimes reads and interprets the full paper content, while the fully loaded agent typically uses only the download tool to fetch basic metadata.

@fengju0213 (Collaborator) commented

Very interesting! I’ll ask community members to test it together so we can improve it iteratively.

@fengju0213 added this to the Sprint 44 milestone on Dec 8, 2025