-
Notifications
You must be signed in to change notification settings - Fork 2.7k
fix: improve checkpoint service initialization handling #6860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@mrubens @daniel-lxs Please review this PR to make up for the shortcomings of the last one that removed the checkpoint save event during initialization, causing the problem that was originally fixed to reappear We'll also need to consider adding a UI for the checkpoint wait timeout. The default is currently 15 seconds, but working on a codebase the size of RooCode, initialization take over 10 seconds. If a user is working on a codebase significantly larger than RooCode and prefers to call tools to modify code in the first request, they'll be stuck waiting for 15 seconds without any notification, waiting for a checkpoint that will never arrive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution! I've reviewed the changes and found some critical issues that need attention to fully resolve the checkpoint initialization problem.
src/core/checkpoints/index.ts
Outdated
| try { | ||
| await checkGitInstallation(cline, service, log, provider) | ||
| cline.checkpointService = service | ||
| return service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intentional? The cline.checkpointServiceInitializing flag is never set to false in the success path here. It's only reset within checkGitInstallation (line 126 when the 'initialize' event fires) or in error cases.
This could cause issues if the 'initialize' event doesn't fire for some reason - subsequent calls would wait indefinitely at lines 63-69.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just wait for the initialize event
src/core/checkpoints/index.ts
Outdated
| // Clean up on failure | ||
| cline.checkpointServiceInitializing = false | ||
| cline.enableCheckpoints = false | ||
| cline.checkpointService = undefined |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good addition! Setting the service to undefined on failure ensures a clean state. However, this cleanup happens after the service was already assigned at line 75. Could there be a race condition where another call accesses the service between lines 75-86?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does exist, but there is only one other competitor at the same time, and they are waiting at pWaitFor
|
The purpose of assigning an uninitialized service is to ensure that pWait returns a result, even though it may be undefined (because initialization failed and the checkpoint function has been turned off), or it may not be initialized after the timeout, and the checkpoint function needs to be turned off. The initial failure occurred because the Task called getCheckpointService(asnyc), which then called checkGitInstallation(asnyc), effectively being doubly asynchronous. However, if the tool triggered a file edit at this point, it would synchronously call checkpointSave. This caused the initialization to not complete and cline.checkpointService not yet set when checkpointSave was called, forcing a second call to initialize. The relevant code is as follows const service = RepoPerTaskCheckpointService.create(options)
cline.checkpointServiceInitializing = true
// Check if Git is installed before initializing the service
// Note: This is intentionally fire-and-forget to match the original IIFE pattern
// The service is returned immediately while Git check happens asynchronously
checkGitInstallation(cline, service, log, provider)
return serviceHowever, due to the asynchronous operation, service.isInitialized is false, ultimately triggering the following logic, causing the checkpoint feature to be disabled for the current task. if (!service.isInitialized) {
const provider = cline.providerRef.deref()
provider?.log("[checkpointSave] checkpoints didn't initialize in time, disabling checkpoints for this task")
cline.enableCheckpoints = false
return
}That is to say, the reason why the earliest checkpoint failed is that the checkpointSave method is wrong. When calling it alone, you must ensure that the initialization has been completed; otherwise, it will inevitably receive an incompletely initialized service. But the reason is different now. Originally, I would call checkpointSave once after initialization, which would block and make the function work properly. However, considering the possibility of two checkpoints being close together, we removed it. This is where the new problem begins. Now, if checkpointSave is called without initialization, cline.checkpointService will be undefined, so it will be initialized again. However, there's already an initialization task, which will cause a git command error, ultimately triggering the following code. src\core\checkpoints\index.ts:92-95 // Clean up on failure
cline.checkpointServiceInitializing = false
cline.enableCheckpoints = false
throw errThis is the complete analysis of both errors. The solution to the current problem is as I demonstrated in my PR. Initially, set cline.checkpointService and block by checking if checkpointServiceInitializing is true. The logic for the first call to getCheckpointService is clear. Set checkpointServiceInitializing to true and set checkpointService. At this point, isInitialized is false. Subsequent calls have several scenarios.
|
|
@daniel-lxs Of course, we can discuss not to hand over the checkpoint closing function to other functions, so that all checkpoint exceptions and processing are completed only in getCheckpointService |
|
@daniel-lxs Please see the latest commit, I have simplified the process |
daniel-lxs
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @NaccOll
Related GitHub Issue
Closes: #6359
Description
I found that the previous modification was still not perfect. There was still a problem that the checkpoint would be closed if it was triggered twice in a very short period of time.
By adjusting the service setting timing, the normal operation of the checkpoint is ensured
Important
Improves checkpoint service initialization in
index.tsto prevent premature closure when triggered twice quickly.getCheckpointServiceinindex.tsto prevent premature closure of checkpoint service when triggered twice quickly.pWaitForto wait forcheckpointServiceinitialization ifcheckpointServiceInitializingis true.checkpointServiceonly after successful initialization.checkpointServicetoundefinedand disables checkpoints.This description was created by
for 87c7456. You can customize this summary. It will automatically update as commits are pushed.