Conversation
|
@jh-block and @blackgirlbytes, I’ve created a fresh PR. Thanks! |
|
Thanks! The implementation looks good. I have a couple of questions:
|
|
Thanks for your feedback.
For question 1: Currently, cache points are added at the message level (after all content blocks), regardless of whether the message contains a ToolRequest or a ToolResponse.
For question 2: This means you can only cache 3 messages in addition to the system message. A better approach might be to cache the earliest messages instead of the most recent ones, since early messages tend to remain stable, while recent messages change on every turn and frequently invalidate the cache. Tell me what you think. |
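To make the trade-off in question 2 concrete, here is a minimal sketch of how the cache point budget could be spent. It is not the code in this PR: `cache_point_indices`, its parameters, and `MAX_CACHE_POINTS` are hypothetical names, and the only facts taken from the thread are the limit of four cache points and the "system message plus 3 messages" split.

```rust
/// Cache point limit on AWS Bedrock, as stated in the PR summary.
const MAX_CACHE_POINTS: usize = 4;

/// Hypothetical helper: returns the indices of the messages that would get a
/// cache point appended after their last content block.
///
/// * `cache_system_prompt` reserves one cache point for the system message.
/// * `prefer_earliest` selects the earliest messages (the alternative proposed
///   above) instead of the most recent ones.
fn cache_point_indices(
    message_count: usize,
    cache_system_prompt: bool,
    prefer_earliest: bool,
) -> Vec<usize> {
    // One slot is used up if the system message is cached.
    let budget = MAX_CACHE_POINTS - usize::from(cache_system_prompt);
    // Never ask for more cache points than there are messages.
    let take = budget.min(message_count);
    if prefer_earliest {
        (0..take).collect()
    } else {
        ((message_count - take)..message_count).collect()
    }
}
```

With ten messages and the system message cached, the "most recent" variant marks messages 7, 8 and 9, while the "earliest" variant marks 0, 1 and 2. The latter keeps the same indices on the next turn, which is the stability argument made above.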
ba59c47 to
b0a9321
|
@jh-block and @blackgirlbytes, I’ve applied the requested changes |
b0a9321 to
c2561e2
|
Hi @michaelneale, when you have a moment, could you kindly review this PR? Thank you. |
c2561e2 to
c787ead
|
Hi @jh-block , |
|
Thanks for the changes. I have a couple comments:
|
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
Signed-off-by: fbalicchia <fbalicchia@cuebiq.com>
c787ead to
47dc354
8db3c7c to
f7e8ca5
|
Hey @fbalicchia, let me know when this is ready for review. I have a few small fixes, but I can just push them up if that's OK with you so we can get this merged sooner. |
|
@jh-block go ahead thanks |
The method is on &self and can access self.model.model_name directly. All call sites were passing &self.model.model_name anyway.
Keep only the 'why' rationale for cache strategy, remove restated code.
…ith_caching The caller already filters to visible messages, so the inner filtering was redundant.
Merge test_to_bedrock_message_with_caching, _multiple_content, and _preserves_order into a single test. Remove test_message_conversion_with_cache_points from bedrock.rs as it duplicated format-level tests. Fix fast_model field name.
…oviders * 'main' of github.com:block/goose: New navigation settings layout options and styling (#6645) refactor: MCP-compliant theme tokens and CSS class rename (#7275) Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434) refactor: change open recipe in new window to pass recipe id (#7392) fix: handle truncated tool calls that break conversation alternation (#7424) streamline some github actions (#7430) Enable bedrock prompt cache (#6710) fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429) Display working dir (#7419) dev: add cmake to hermitized env (#7399) refactor: remove allows_unlisted_models flag, always allow custom model entry (#7255) feat: expose context window utilization to agent via MOIM (#7418) Small model naming (#7394) chore(deps): bump ajv in /documentation (#7416) doc: groq models (#7404) Client settings (#7381) Fix settings tabs getting cut off in narrow windows (#7379) # Conflicts: # ui/desktop/src/components/settings/dictation/DictationSettings.tsx
…xt-edit * origin/main: (35 commits) docs: generate manpages (#7443) Blog/goose v1 25 0 release (#7433) fix: detect truncated LLM responses in apps extension (#7354) fix: removed unnecessary version for goose acp macro dependency (#7428) add flag to hide select voice providers (#7406) New navigation settings layout options and styling (#6645) refactor: MCP-compliant theme tokens and CSS class rename (#7275) Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434) refactor: change open recipe in new window to pass recipe id (#7392) fix: handle truncated tool calls that break conversation alternation (#7424) streamline some github actions (#7430) Enable bedrock prompt cache (#6710) fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429) Display working dir (#7419) dev: add cmake to hermitized env (#7399) refactor: remove allows_unlisted_models flag, always allow custom model entry (#7255) feat: expose context window utilization to agent via MOIM (#7418) Small model naming (#7394) chore(deps): bump ajv in /documentation (#7416) doc: groq models (#7404) ...
* main: Simplified custom model flow with canonical models (#6934) feat: simplify the text editor to be more like pi (#7426) docs: add YouTube short embed to Neighborhood extension tutorial (#7456) fix: flake.nix build failure and deprecation warning (#7408) feat(claude-code): add permission prompt routing for approve mode (#7420) docs: generate manpages (#7443) Blog/goose v1 25 0 release (#7433) fix: detect truncated LLM responses in apps extension (#7354) fix: removed unnecessary version for goose acp macro dependency (#7428) add flag to hide select voice providers (#7406) New navigation settings layout options and styling (#6645) refactor: MCP-compliant theme tokens and CSS class rename (#7275) Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434) refactor: change open recipe in new window to pass recipe id (#7392) fix: handle truncated tool calls that break conversation alternation (#7424) streamline some github actions (#7430) Enable bedrock prompt cache (#6710) fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429) Display working dir (#7419)
* main: (171 commits) fix: TLDR CLI tab in Neighborhood MCP docs (#7461) fix(summon): restore skill supporting files and directory path in load output (#7457) Simplified custom model flow with canonical models (#6934) feat: simplify the text editor to be more like pi (#7426) docs: add YouTube short embed to Neighborhood extension tutorial (#7456) fix: flake.nix build failure and deprecation warning (#7408) feat(claude-code): add permission prompt routing for approve mode (#7420) docs: generate manpages (#7443) Blog/goose v1 25 0 release (#7433) fix: detect truncated LLM responses in apps extension (#7354) fix: removed unnecessary version for goose acp macro dependency (#7428) add flag to hide select voice providers (#7406) New navigation settings layout options and styling (#6645) refactor: MCP-compliant theme tokens and CSS class rename (#7275) Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434) refactor: change open recipe in new window to pass recipe id (#7392) fix: handle truncated tool calls that break conversation alternation (#7424) streamline some github actions (#7430) Enable bedrock prompt cache (#6710) fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429) ...
* origin/main: (49 commits) add flag to hide select voice providers (#7406) New navigation settings layout options and styling (#6645) refactor: MCP-compliant theme tokens and CSS class rename (#7275) Redirect llama.cpp logs through tracing to avoid polluting CLI stdout/stderr (#7434) refactor: change open recipe in new window to pass recipe id (#7392) fix: handle truncated tool calls that break conversation alternation (#7424) streamline some github actions (#7430) Enable bedrock prompt cache (#6710) fix: use BEGIN IMMEDIATE to prevent SQLite deadlocks (#7429) Display working dir (#7419) dev: add cmake to hermitized env (#7399) refactor: remove allows_unlisted_models flag, always allow custom model entry (#7255) feat: expose context window utilization to agent via MOIM (#7418) Small model naming (#7394) chore(deps): bump ajv in /documentation (#7416) doc: groq models (#7404) Client settings (#7381) Fix settings tabs getting cut off in narrow windows (#7379) docs: voice dictation updates (#7396) [docs] Add Excalidraw MCP App Tutorial (#7401) ... # Conflicts: # ui/desktop/src/components/McpApps/McpAppRenderer.tsx
Summary
Implemented prompt caching for Anthropic Claude models on AWS Bedrock to reduce costs.
Introduced a cache point placement strategy that complies with AWS Bedrock’s limit of four cache points.
Added the BEDROCK_ENABLE_CACHING configuration parameter.
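For readers who want to see how a flag like BEDROCK_ENABLE_CACHING might be interpreted, the sketch below reads it from the process environment. This is an assumption for illustration only: the PR may resolve the parameter through the project's own configuration layer, and the accepted values, the default, and the helper names `parse_flag` and `bedrock_caching_enabled` are hypothetical.

```rust
use std::env;

/// Hypothetical parser: which strings count as "on" or "off" is an assumption,
/// not something stated in the PR. Unrecognized or missing values fall back
/// to `default`.
fn parse_flag(raw: Option<&str>, default: bool) -> bool {
    let normalized = raw.map(|s| s.trim().to_ascii_lowercase());
    match normalized.as_deref() {
        Some("true") | Some("1") => true,
        Some("false") | Some("0") => false,
        _ => default,
    }
}

/// Reads BEDROCK_ENABLE_CACHING from the environment.
/// The default is left to the caller because the PR text does not state one.
fn bedrock_caching_enabled(default: bool) -> bool {
    parse_flag(env::var("BEDROCK_ENABLE_CACHING").ok().as_deref(), default)
}
```

Keeping the parsing separate from the environment lookup makes the flag handling testable without mutating process-wide state.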
Testing