Use Port of Context (pctx) for code mode #6765
@eliasposen the core code looks good, but it looks like there are a bunch of extra commits in the fork for some reason! If we can remove them, I can run the workflows and we can iterate on fixing the tests that have an issue.
Signed-off-by: Elias Posen <elias@posen.ch>
Signed-off-by: Elias Posen <elias@posen.ch>
@alexhancock I was able to hard reset to the most recent main and cherry-pick my commits.
alexhancock
left a comment
The new impl in the extension looks mostly good to me as a net-deletion and complexity reduction - asked a couple Qs though because I want to preserve ToolGraph
"execute_code".to_string(),
"list_functions".to_string(),
indoc! {r#"
Batch multiple MCP tool calls into ONE execution. This is the primary purpose of this tool.
I'm curious if it still does a good job batching them without these kinds of instructions?
I'm afraid I don't have any benchmarking on this yet, but I have seen batching happen during testing. There is still a batching instruction in the InitializeResult & moim of the extension, so I was just syncing these tool instructions with what currently exists in pctx.
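For readers unfamiliar with what "batching" means here, a minimal sketch of the kind of script a model could submit in a single code-mode execution follows. The `Calendar` and `Weather` objects and their functions are hypothetical stand-ins defined locally so the snippet runs on its own; they are not the bindings pctx generates.

```typescript
// Hypothetical stand-ins for generated MCP tool bindings (assumptions, not pctx's API).
type Forecast = { city: string; tempC: number };
type CalendarEvent = { title: string; city: string };

const Calendar = {
  // Pretend MCP tool: returns two canned events.
  async listEvents(_args: { day: string }): Promise<CalendarEvent[]> {
    return [
      { title: "Standup", city: "Berlin" },
      { title: "Review", city: "Lisbon" },
    ];
  },
};

const Weather = {
  // Pretend MCP tool: returns a canned forecast per city.
  async getForecast(args: { city: string }): Promise<Forecast> {
    return { city: args.city, tempC: args.city === "Berlin" ? 4 : 18 };
  },
};

// One execution: list events, fan out the dependent forecast calls in parallel,
// and hand back only the combined summary instead of each intermediate result.
async function run(): Promise<string[]> {
  const events = await Calendar.listEvents({ day: "2025-01-01" });
  const forecasts = await Promise.all(
    events.map((e) => Weather.getForecast({ city: e.city })),
  );
  return events.map((e, i) => `${e.title} in ${e.city}: ${forecasts[i].tempC} C`);
}
```

Without batching, each of the three tool calls above would be a separate round trip through the model; here they happen inside one script and only the final strings come back.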
…ts with markdown Signed-off-by: Elias Posen <elias@posen.ch>
alexhancock
left a comment
Code LGTM now. Will try it out now!
alexhancock
left a comment
This LGTM
I like the diff in the code_execution_extension
303 insertions(+), 1094 deletions(-)
@michaelneale may also want to have a go
taking a look as this is right up my alley!
crates/goose/src/agents/extension.rs
Outdated
  PlatformExtensionDef {
      name: code_execution_extension::EXTENSION_NAME,
-     description: "Execute JavaScript code in a sandboxed environment",
+     description: "Execute TypeScript code in a sandboxed environment",
We can probably improve this description to explain what it is really for, something like "Uses a sandbox to work with extensions".
ok this is fantastic, feels faster and better (no data, just feel!) so I really think this is the path forward @alexhancock, and yeah, much less code to maintain. Some thoughts:
Otherwise, if we can get this into a branch, updated and passing, I think we should go with this. Saves a whole lot of fiddly code in exchange for a library. Really great stuff @eliasposen
Thank you @alexhancock & @michaelneale for your reviews! In response to the points above:
@eliasposen is there any plan to allow for other runtimes with Port of Context? It seems like a clear improvement for goose to adopt pctx's MCP-to-TS bridge, but the switch from boa to deno seems to increase the goose binary size by quite a lot (113 MB -> 190 MB on my system). Probably not a deal-breaker, but something to consider.
@jamadeo We are considering it. These were our main considerations when choosing deno:
Signed-off-by: Elias Posen <elias@posen.ch>
ok this is looking good. Apologies for the confusing report, but you can probably trust me that I am running various tests with open models of various sizes (and of course frontier ones), and pctx seems to consistently outperform (my suite is here: https://github.com/michaelneale/open-model-gym, but ignore that for now). Legend: "release" is pctx, "default" is boa.
pctx is more correct and/or faster, so this is a big yes for me! Let's go!
I will fix the ACP thing. The fixtures were supposed to give a good error on mismatch, not panic. This is a bug I can sort out here.
My 2 cents. I did a few rounds of testing with vllm/NVIDIA-Nemotron-3-Nano-30B-A3B-BF16 to see how this works with smaller models.
A. code_execution with this PR (pctx)
It keeps calling
B. Existing boa implementation with the improvement PR #6497
@eliasposen I pushed a copy of this branch to https://github.com/block/goose/compare/pctx. It includes the test expectation fix and some polish to the fixtures, which had hidden the underlying mismatch. Code mode is a little different. Those fixes are in commit 9002a2e0bc17214311d9238b225fa2c181fb47aa. So, you should do this:
git fetch https://github.com/block/goose.git pctx
git cherry-pick 9002a2e0bc17214311d9238b225fa2c181fb47aa
Then rebase on the latest block/goose main after you do the above. Hope this helps!
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Signed-off-by: Elias Posen <elias@posen.ch>
Signed-off-by: Elias Posen <elias@posen.ch> Signed-off-by: Adrian Cole <adrian@tetrate.io> Co-authored-by: Adrian Cole <adrian@tetrate.io> Signed-off-by: Harrison <hcstebbins@gmail.com>
Signed-off-by: Elias Posen <elias@posen.ch> Signed-off-by: Adrian Cole <adrian@tetrate.io> Co-authored-by: Adrian Cole <adrian@tetrate.io>

Summary
This change replaces boa with pctx for CodeMode. pctx uses a custom deno runtime for type-checking and code execution. It also comes with type generation, MCP registration, and Rust callbacks out of the box.
Some thoughts for further development:
outputSchema for many MCPs. pctx is currently exploring ways of mutating & caching a tool's output schema as it is used so the generated CodeMode interface can get better over time.
Type of Change
AI Assistance
Testing
test_providers.sh
Related Issues
Relates to #ISSUE_ID
Discussion: Discord
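As an illustration of the output-schema idea raised under "thoughts for further development" in the summary, one possible reading is sketched below: record the field types seen in each tool result and merge them into a per-tool cache that could later feed a generated interface. This is only a guess at the general shape. The class name `OutputSchemaCache`, its methods, and the `get_issue` tool name are hypothetical and do not come from pctx.

```typescript
// Hypothetical sketch: none of these names come from pctx.
type FieldTypes = Record<string, Set<string>>;

class OutputSchemaCache {
  private schemas = new Map<string, FieldTypes>();

  // Record the field types seen in one tool result and merge them into the cache.
  observe(tool: string, result: Record<string, unknown>): void {
    const schema: FieldTypes = this.schemas.get(tool) ?? {};
    for (const [key, value] of Object.entries(result)) {
      const kind =
        value === null ? "null" : Array.isArray(value) ? "unknown[]" : typeof value;
      if (!schema[key]) schema[key] = new Set<string>();
      schema[key].add(kind);
    }
    this.schemas.set(tool, schema);
  }

  // Render the accumulated observations as a rough, TypeScript-like field summary.
  render(tool: string): string {
    const schema: FieldTypes = this.schemas.get(tool) ?? {};
    return Object.keys(schema)
      .sort()
      .map((key) => `${key}: ${Array.from(schema[key]).sort().join(" | ")};`)
      .join("\n");
  }
}

const cache = new OutputSchemaCache();
cache.observe("get_issue", { id: 1, title: "first" });
cache.observe("get_issue", { id: 2, title: null, labels: [] });
const summary = cache.render("get_issue");
```

Each new observation can only widen the summary (for example, `title` becomes `null | string` after the second call), which is one way a generated interface could improve as a tool is used.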