@bogwi bogwi commented Nov 2, 2025

Implement graceful fallback from server-side to client-side threading when preconditions aren't met, ensuring conversations continue successfully.

  • Add response ID chain validation to detect broken threading chains
  • Enhance CallModelWithOpts with automatic fallback logic
  • Add structured logging for threading decisions and fallbacks
  • Implement comprehensive test coverage for all fallback scenarios
  • Handle edge cases and update documentation

This branch implements the final step of the plan in contextwindow/plans/2025-09-17-resume-context-plan.md: the threading fallback feature.

Currently, CallModelWithOpts in contextwindow.go attempts server-side threading when enabled, but:

  1. Returns an error if server-side threading fails (line 466)
  2. Returns an error if the model doesn't support threading (line 469)
  3. Does not gracefully fall back to client-side threading when the response_id chain is broken
  4. Does not validate response_id chain integrity before attempting server-side threading

This prevents seamless context resumption when:

  • A context was created with server-side threading but later resumed without a valid response_id
  • The response_id chain was broken (e.g., context exported/imported, records manually edited)
  • The model's threading call fails but client-side threading would work

Goals, per the 2025-09-17-resume-context-plan.md document

  1. Graceful Fallback: Automatically fall back to client-side threading when server-side threading cannot be used
  2. Chain Validation: Validate response_id chain integrity before attempting server-side threading
  3. Error Recovery: Handle threading failures gracefully without breaking the conversation
  4. Observability: Add structured logging for threading decisions and fallbacks
  5. Backward Compatibility: Maintain existing behavior for valid threading scenarios

Current Implementation Analysis

ContextWindow.CallModelWithOpts (contextwindow.go:422-509)

Current Flow:

1. Get context info (includes UseServerSideThreading flag and LastResponseID)
2. Load live records
3. If UseServerSideThreading:
   a. Check if model supports threading
   b. Call model with threading
   c. If error → return error (NO FALLBACK)
4. Else: Use client-side threading

Issues:

  • No validation of LastResponseID before use
  • No fallback on threading errors
  • Error at line 469 if model doesn't support threading (should fall back)
  • Error at line 466 if threading call fails (should fall back)

Model-Specific Validation (responses_model.go:338-381)

Existing: canUseServerSideThreading() method validates:

  • Presence of tool calls (disables threading)
  • Missing response IDs in model responses
  • Mixed response ID states

Gap: This validation happens inside the model implementation, not at the ContextWindow level.
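To make the gap concrete, here is a minimal sketch of what a ContextWindow-level chain check could look like. The `Record` and `RecordKind` types and the `validateChain` function are hypothetical stand-ins written for this illustration; the library's real record type and its ValidateResponseIDChain helper may be shaped differently. The sketch covers the three conditions listed above, plus a comparison of the stored LastResponseID against the newest response in the history.

```go
package main

import (
	"errors"
	"fmt"
)

// RecordKind and Record are hypothetical stand-ins for the library's record types.
type RecordKind int

const (
	Prompt RecordKind = iota
	ModelResponse
	ToolCall
)

type Record struct {
	Kind       RecordKind
	ResponseID string // empty when the record carries no server-side response ID
}

// validateChain reports why a record history cannot be used for server-side
// threading. A nil result means the chain looks intact.
func validateChain(recs []Record, lastResponseID string) error {
	if lastResponseID == "" {
		return errors.New("no last_response_id available")
	}
	latest := ""
	for _, r := range recs {
		switch r.Kind {
		case ToolCall:
			return errors.New("tool calls present")
		case ModelResponse:
			// Rejecting any response without an ID covers both the
			// "missing" and the "mixed" response ID states.
			if r.ResponseID == "" {
				return errors.New("model response without response_id")
			}
			latest = r.ResponseID
		}
	}
	if latest != lastResponseID {
		return fmt.Errorf("last_response_id mismatch: stored %q, latest record %q", lastResponseID, latest)
	}
	return nil
}

func main() {
	recs := []Record{{Kind: Prompt}, {Kind: ModelResponse, ResponseID: "resp_1"}}
	fmt.Println(validateChain(recs, "resp_1")) // intact chain
	fmt.Println(validateChain(recs, "resp_0")) // stored ID does not match the history
}
```

Returning an error rather than a bare bool keeps the reason available for the caller to log or to surface in a fallback message.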

Solution Design

Architecture Overview

CallModelWithOpts
  ├─> validateThreadingPreconditions()  [NEW]
  │     ├─> Check if model supports threading
  │     ├─> Validate response_id chain integrity
  │     └─> Return: (canUseServerSide, reason)
  │
  ├─> Try server-side threading (if validated)
  │     └─> On error → log + fall back
  │
  └─> Use client-side threading (fallback or default)

The implementation removes only one block from contextwindow.go, inside the CallModelWithOpts function.

Explanation of the change

Removed the error-returning block from origin/main to implement graceful fallback.

Original behavior (origin/main)

if contextInfo.UseServerSideThreading {
    if threadingModel, ok := cw.model.(ServerSideThreadingCapable); ok {
        // ... attempt threading ...
        if err != nil {
            return "", fmt.Errorf("call model with threading: %w", err)  // ERROR - stops execution
        }
    } else {
        return "", fmt.Errorf("model does not support server-side threading")  // ERROR - stops execution
    }
} else {
    // client-side threading
}

New behavior (current implementation)

Replaced with:

// 1. Validate preconditions FIRST (new)
attemptServerSide, reason := cw.shouldAttemptServerSideThreading(contextInfo, recs)

if attemptServerSide {
    // 2. Attempt server-side threading
    // ... threading code ...
    
    if err != nil {
        // 3. FALLBACK instead of error
        // Log fallback and continue to client-side threading
        loggedFallback = true
        attemptServerSide = false
    }
}

// 4. Use client-side threading (either as fallback or default)
if !attemptServerSide {
    // Client-side threading always works
}

Improvements:

  1. Precondition validation via shouldAttemptServerSideThreading() before attempting server-side threading
  2. Graceful fallback: on threading failure, fall back to client-side instead of returning an error
  3. Model compatibility: if the model doesn't support threading, use client-side instead of erroring
  4. Observability: logging for threading decisions and fallbacks
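The fallback flow can also be shown in isolation from the library. The following is a sketch only: `callFn`, `callWithFallback`, and `errSimulated` are hypothetical names invented for this example and are not part of contextwindow.go. It assumes the precondition check has already produced the `attempt` flag.

```go
package main

import (
	"errors"
	"fmt"
)

// callFn stands in for one way of calling the model (hypothetical).
type callFn func(prompt string) (string, error)

// errSimulated is a made-up error used to imitate a failed threading call.
var errSimulated = errors.New("simulated server-side failure")

// callWithFallback tries serverSide only when attempt is true and a
// server-side implementation exists. Any server-side error falls through to
// clientSide instead of being returned. It also reports which path answered.
func callWithFallback(attempt bool, serverSide, clientSide callFn, prompt string) (reply, path string, err error) {
	if attempt && serverSide != nil {
		reply, err = serverSide(prompt)
		if err == nil {
			return reply, "server-side", nil
		}
		// A server-side failure is not fatal: continue with client-side.
	}
	reply, err = clientSide(prompt)
	if err != nil {
		return "", "client-side", fmt.Errorf("call model: %w", err)
	}
	return reply, "client-side", nil
}

func main() {
	failing := func(string) (string, error) { return "", errSimulated }
	echo := func(p string) (string, error) { return "echo: " + p, nil }

	reply, path, err := callWithFallback(true, failing, echo, "hello")
	fmt.Println(reply, path, err)
}
```

Passing a nil `serverSide` models a model type that does not support threading, so the "unsupported" and "failed" cases take the same client-side path.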

Why this change was necessary

The original code blocked seamless context resumption when:

  • The response_id chain was broken
  • Server-side threading failed (API errors, network issues, etc.)
  • The model didn't support threading

The new implementation ensures conversations continue successfully using client-side threading when server-side threading isn't available, which was the goal of the threading fallback feature.


The go test results are green, though you will see a lot of INFO messages. This is expected.

INFO message structure

These INFO messages come from the logThreadingDecision function. They record threading decisions during CallModel execution.

Message format

INFO threading decision attempt_server_side=<bool> reason="<reason>" context=<context_name>

Field meanings

  1. attempt_server_side: Whether server-side threading was attempted
    • true: Attempted (preconditions met)
    • false: Using client-side threading (fallback or default)
  2. reason: Why this decision was made
  3. context: The context name where the decision occurred

Common reasons

  • "server-side threading not enabled for context" — Threading disabled for this context
  • "no last_response_id available (first call or chain broken)" — First call or chain broken, using client-side
  • "preconditions met" — Server-side threading attempted
  • "server-side threading failed: <error>" — Server-side attempt failed, falling back
  • "response_id chain invalid: <reason>" — Chain validation failed (e.g., tool calls present, LastResponseID mismatch)

Example from test output

INFO threading decision attempt_server_side=false reason="no last_response_id available (first call or chain broken)" context=test-empty-threading

Meaning: Using client-side threading because there's no LastResponseID (first call or chain broken) in context test-empty-threading.

These logs help trace threading decisions, especially when fallback occurs, and can be used for debugging and monitoring in production.
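For readers who want to reproduce a line with these keys, here is a small sketch using the standard log/slog package. The body of `logThreadingDecision` below is an assumption about what such a function might look like, not a copy of the PR's code, and `renderDecision` is a hypothetical helper added only to capture the output. Note that slog's TextHandler writes `level=` and `msg=` prefixes, so the line differs slightly from the default-logger format shown above.

```go
package main

import (
	"bytes"
	"fmt"
	"log/slog"
)

// logThreadingDecision emits one record with the three keys described above.
// This is a sketch; the PR's actual function may differ.
func logThreadingDecision(l *slog.Logger, attempt bool, reason, contextName string) {
	l.Info("threading decision",
		slog.Bool("attempt_server_side", attempt),
		slog.String("reason", reason),
		slog.String("context", contextName))
}

// renderDecision returns the line a text handler writes for one decision,
// with the timestamp dropped so the output is stable.
func renderDecision(attempt bool, reason, contextName string) string {
	var buf bytes.Buffer
	h := slog.NewTextHandler(&buf, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if a.Key == slog.TimeKey && len(groups) == 0 {
				return slog.Attr{} // an empty Attr removes the field
			}
			return a
		},
	})
	logThreadingDecision(slog.New(h), attempt, reason, contextName)
	return buf.String()
}

func main() {
	fmt.Print(renderDecision(false,
		"no last_response_id available (first call or chain broken)",
		"test-empty-threading"))
}
```

Using typed attributes (`slog.Bool`, `slog.String`) keeps the keys consistent across call sites, which matters if the lines are later parsed for monitoring.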


There might also be some future enhancements if you are interested, like:

  1. Automatic Chain Repair: Detect and fix broken chains automatically
  2. Metrics: Track fallback rates for monitoring
  3. Configuration: Allow users to disable fallback (strict mode)
  4. Advanced Validation: Validate response_id format/validity with API
  5. Caching: Cache validation results for performance

That's it. It may look like a lot of changes, yet the new tests add only ~1000 LOC, and everything is documented.

The project seems interesting to tackle. I hope you'll like the PR.


tqbf commented Nov 11, 2025

This is neat. Lemme read it. My big concern here is that I think the CallWithOpts stuff is garbage and I kind of want to refactor it, but if fallback works cleanly maybe that's a lot easier (since threading is the only option I care about.)

Thank you for this!


tqbf commented Nov 11, 2025

I may snip the logging stuff; if we're going to slog from a client lib, probably want to do that through a non-default logger, and that starts to get tedious.

don't log, pass context not ID if we're just going to immediately look it up
from the ID anyway, lose obvious comments.

tested with local agent, seems to work peachy

bogwi commented Nov 12, 2025

Nice! Now I know a bit more about your intentions, which clears up the decision tree considerably.

About logging:

Goal: Replace the global default slog logger with an injected, non-default logger.

I can:

  • Add a WithLogger(*slog.Logger) option to NewContextWindow (functional options) and a SetLogger(*slog.Logger) method.
  • Default to a no-op logger so the library doesn’t emit logs unless configured.
  • Route existing slog.Info calls through cw.logger.
  • Add a tiny test to ensure there are no logs by default and that logs appear when configured.
  • Update the docs on threading/fallback behavior: when fallback triggers, how to enable logging, and which keys are emitted.
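As a rough illustration of the proposal above, the sketch below wires a logger through a functional option and defaults to a logger that discards everything. The `ContextWindow` struct, `Option` type, and the zero-argument `NewContextWindow` here are simplified assumptions for this example; the real constructor takes other parameters. The no-op default is built from a TextHandler over io.Discard, which works on any Go version that ships log/slog.

```go
package main

import (
	"io"
	"log/slog"
	"os"
)

// ContextWindow is reduced to the one field this sketch needs.
type ContextWindow struct {
	logger *slog.Logger
}

// Option configures a ContextWindow at construction time.
type Option func(*ContextWindow)

// WithLogger injects a caller-supplied logger. A nil logger is ignored so the
// no-op default stays in place.
func WithLogger(l *slog.Logger) Option {
	return func(cw *ContextWindow) {
		if l != nil {
			cw.logger = l
		}
	}
}

// NewContextWindow defaults to a logger that discards all records, so the
// library stays silent unless the caller opts in.
func NewContextWindow(opts ...Option) *ContextWindow {
	cw := &ContextWindow{
		logger: slog.New(slog.NewTextHandler(io.Discard, nil)),
	}
	for _, opt := range opts {
		opt(cw)
	}
	return cw
}

func main() {
	quiet := NewContextWindow()
	quiet.logger.Info("threading decision") // discarded

	loud := NewContextWindow(WithLogger(slog.New(slog.NewTextHandler(os.Stderr, nil))))
	loud.logger.Info("threading decision", "attempt_server_side", false)
}
```

Because the default logger is never nil, call sites can use `cw.logger` unconditionally, and a SetLogger method would only need to apply the same nil guard.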

About "CallWithOpts is garbage":

Goal: Extract and simplify first, before any API change.

I can:

  • Move threading decision logic into a focused, exported helper (already close: shouldAttemptServerSideThreading, ValidateResponseIDChain).
  • Add brief docstrings for the decision function: inputs, outputs, invariants.
  • Result: same behavior, cleaner separation, easier to reason about.

Short sequence:

PR-1: Non-default logger injection (no behavior change). Add WithLogger, route logging through cw.logger. Plus a small test.
PR-2: Docs: threading + logging usage. One Example test.
PR-3: Extract/refine threading decision helpers; keep API intact. Tiny pure-function tests for decision logic.
RFC Issue: “Simplify Call API (replace CallWithOpts/CallWithThreading* with functional options).”
PR-4: Implement API simplification per RFC; keep shims for compatibility; update tests.
PR-5: CI ?


This directly addresses both points you've mentioned. What do you think about it?


tqbf commented Nov 12, 2025

Nah, just merge the PR I made on your PR. :)


bogwi commented Nov 12, 2025

Merged. Appreciate the cleanup.
