-
Notifications
You must be signed in to change notification settings - Fork 5
feat: implement threading fallback with automatic client-side fallback #9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implement graceful fallback from server-side to client-side threading when preconditions aren't met, ensuring conversations continue successfully. - Add response ID chain validation to detect broken threading chains - Enhance CallModelWithOpts with automatic fallback logic - Add structured logging for threading decisions and fallbacks - Implement comprehensive test coverage for all fallback scenarios - Handle edge cases and update documentation
|
This is neat. Lemme read it. My big concern here is that I think the CallWithOpts stuff is garbage and I kind of want to refactor it, but if fallback works cleanly maybe that's a lot easier (since threading is the only option I care about.) Thank you for this! |
|
I may snip the logging stuff; if we're going to slog from a client lib, probably want to do that through a non-default logger, and that starts to get tedious. |
|
Nice! Now I know a bit more about your intentions, which clears the decision tree greatly. About logging:Goal: Replace global default slog with injected, non-default logger. I can:
About "CallWithOpts is garbage":Goal: Extract and simplify first, before any API change. I can:
Short sequence:PR-1: Non-default logger injection (no behavior change). Add WithLogger, route logging through cw.logger. Plus a small test. This directly addresses both points you've mentioned. What do you think about it? |
|
Nah, just merge the PR I made on your PR. :) |
Cleanups for PR superfly#9
|
Merged. Appreciate the cleanup. |
Implement graceful fallback from server-side to client-side threading when preconditions aren't met, ensuring conversations continue successfully.
The proposed branch is essentially the final step in the
contextwindow/plans/2025-09-17-resume-context-plan.mdprocess, which implements a threading fallback feature.Currently,
CallModelWithOptsincontextwindow.goattempts server-side threading when enabled, but:This prevents seamless context resumption when:
Goals as per
2025-09-17-resume-context-plan.mddocumentCurrent Implementation Analysis
ContextWindow.CallModelWithOpts (contextwindow.go:422-509)
Current Flow:
Issues:
Model-Specific Validation (responses_model.go:338-381)
Existing:
canUseServerSideThreading()method validates:Gap: This validation happens inside the model implementation, not at the ContextWindow level.
Solution Design
Architecture Overview
Implementation removes only one block from
contextwindow.goinsideCallModelWithOptsfunction.Explanation of the change
Removed the error-returning block from
origin/mainto implement graceful fallback.Original behavior (origin/main)
New behavior (current implementation)
Replaced with:
Improvements:
shouldAttemptServerSideThreading()before attempting server-side threadingWhy this change was necessary
The original code blocked seamless context resumption when:
The new implementation ensures conversations continue successfully using client-side threading when server-side threading isn't available, which was the goal of the threading fallback feature.
The
go testresults are green, though you will see a lot of INFO messages. This is expected.INFO message structure
These INFO messages come from the
logThreadingDecisionfunction. They record threading decisions duringCallModelexecution.Message format
Field meanings
attempt_server_side: Whether server-side threading was attemptedtrue: Attempted (preconditions met)false: Using client-side threading (fallback or default)reason: Why this decision was madecontext: The context name where the decision occurredCommon reasons
"server-side threading not enabled for context"— Threading disabled for this context"no last_response_id available (first call or chain broken)"— First call or chain broken, using client-side"preconditions met"— Server-side threading attempted"server-side threading failed: <error>"— Server-side attempt failed, falling back"response_id chain invalid: <reason>"— Chain validation failed (e.g., tool calls present, LastResponseID mismatch)Example from test output
Meaning: Using client-side threading because there's no
LastResponseID(first call or chain broken) in contexttest-empty-threading.These logs help trace threading decisions, especially when fallback occurs, and can be used for debugging and monitoring in production.
There might also be some future enhancements if you are interested, like:
That's it. That seems like a lot of changes. Yet, the new tests add only ~1000 LOC, and everything is properly added and documented.
The project seems interesting to tackle. I hope you'll like the PR.