Conversation

@NaccOll
Contributor

@NaccOll NaccOll commented Aug 8, 2025

Related GitHub Issue

Closes: #6359

Description

The previous fix was still incomplete: if a checkpoint was triggered twice within a very short interval, the checkpoint feature would be turned off.
This PR adjusts when the checkpoint service is set so that checkpoints keep working in that case.


Important

Improves checkpoint service initialization in index.ts to prevent premature closure when triggered twice quickly.

  • Behavior:
    • Adjusts getCheckpointService in index.ts to prevent premature closure of checkpoint service when triggered twice quickly.
    • Uses pWaitFor to wait for checkpointService initialization if checkpointServiceInitializing is true.
    • Sets checkpointService only after successful initialization.
  • Error Handling:
    • On initialization failure, sets checkpointService to undefined and disables checkpoints.
    • Logs errors during initialization and Git checks.
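
A minimal sketch of the behavior summarized above, not the code from this PR. TaskState, Service, waitFor, and getService are hypothetical names, and waitFor is a hand-rolled stand-in for the p-wait-for package so that the example has no dependencies.

```typescript
// Hypothetical stand-ins; none of these types come from the PR itself.
interface Service {
    isInitialized: boolean
}

interface TaskState {
    enableCheckpoints: boolean
    checkpointService?: Service
    checkpointServiceInitializing: boolean
}

// Minimal polling helper, standing in for the p-wait-for package.
async function waitFor(cond: () => boolean, intervalMs: number, timeoutMs: number): Promise<void> {
    const deadline = Date.now() + timeoutMs
    while (!cond()) {
        if (Date.now() >= deadline) throw new Error("timed out")
        await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
    }
}

async function getService(
    task: TaskState,
    init: () => Promise<Service>, // assumed to do the Git check and repo setup
    timeoutMs = 15_000,
): Promise<Service | undefined> {
    if (!task.enableCheckpoints) return undefined
    if (task.checkpointService) return task.checkpointService

    // A second caller arriving mid-initialization waits instead of starting another init.
    if (task.checkpointServiceInitializing) {
        try {
            await waitFor(() => !task.checkpointServiceInitializing, 10, timeoutMs)
        } catch {
            return undefined // gave up waiting; the caller decides what to do next
        }
        return task.checkpointService // undefined if initialization failed
    }

    task.checkpointServiceInitializing = true
    try {
        const service = await init()
        task.checkpointService = service // published only after init succeeded
        return service
    } catch {
        // Clean up on failure so later calls see a consistent state.
        task.enableCheckpoints = false
        task.checkpointService = undefined
        return undefined
    } finally {
        task.checkpointServiceInitializing = false
    }
}
```

Resetting the flag in a finally block is a choice made for this sketch; it guarantees that a waiting caller is released on both the success and the failure path.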

This description was created by Ellipsis for 87c7456. You can customize this summary. It will automatically update as commits are pushed.

@NaccOll NaccOll requested review from cte, jr and mrubens as code owners August 8, 2025 19:18
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. bug Something isn't working labels Aug 8, 2025
@NaccOll
Contributor Author

NaccOll commented Aug 8, 2025

@mrubens @daniel-lxs Please review this PR. It makes up for a shortcoming of the previous one, which removed the checkpoint save that ran during initialization and thereby caused the originally fixed problem to reappear.

We should also consider adding UI for the checkpoint wait timeout. The default is currently 15 seconds, but on a codebase the size of RooCode, initialization takes over 10 seconds.

If a user is working on a codebase significantly larger than RooCode and prefers to call tools to modify code in the first request, they'll be stuck waiting for 15 seconds without any notification, waiting for a checkpoint that will never arrive.
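
One way the notification idea could look, as a rough sketch only. waitWithNotice, its options, and the WaitResult type are hypothetical and do not exist in the codebase; the point is just that a pending initialization can be raced against a timeout while a notice fires earlier.

```typescript
// Hypothetical result type for this sketch.
type WaitResult<T> = { status: "ready"; value: T } | { status: "timeout" }

async function waitWithNotice<T>(
    pending: Promise<T>,
    opts: { timeoutMs: number; noticeAfterMs: number; onNotice: () => void },
): Promise<WaitResult<T>> {
    // Tell the user that the wait is taking a while, before giving up entirely.
    const noticeTimer = setTimeout(opts.onNotice, opts.noticeAfterMs)
    let timeoutTimer: ReturnType<typeof setTimeout> | undefined
    const timeout = new Promise<WaitResult<T>>((resolve) => {
        timeoutTimer = setTimeout(() => resolve({ status: "timeout" }), opts.timeoutMs)
    })
    try {
        // Whichever settles first wins: the initialization or the timeout.
        return await Promise.race([
            pending.then((value): WaitResult<T> => ({ status: "ready", value })),
            timeout,
        ])
    } finally {
        clearTimeout(noticeTimer)
        clearTimeout(timeoutTimer)
    }
}
```

With a configurable timeoutMs, a user on a very large repository could raise the limit instead of silently losing checkpoints for the task.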

Contributor

@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the changes and found some critical issues that need attention to fully resolve the checkpoint initialization problem.

try {
    await checkGitInstallation(cline, service, log, provider)
    cline.checkpointService = service
    return service
Contributor

Is this intentional? The cline.checkpointServiceInitializing flag is never set to false in the success path here. It's only reset within checkGitInstallation (line 126 when the 'initialize' event fires) or in error cases.

This could cause issues if the 'initialize' event doesn't fire for some reason - subsequent calls would wait indefinitely at lines 63-69.

Contributor Author

Just wait for the initialize event

// Clean up on failure
cline.checkpointServiceInitializing = false
cline.enableCheckpoints = false
cline.checkpointService = undefined
Contributor

Good addition! Setting the service to undefined on failure ensures a clean state. However, this cleanup happens after the service was already assigned at line 75. Could there be a race condition where another call accesses the service between lines 75-86?

Contributor Author

The race does exist, but at most one other caller can be competing at that moment, and it is blocked in pWaitFor.

@hannesrudolph hannesrudolph added the Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. label Aug 8, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Prelim Review] in Roo Code Roadmap Aug 12, 2025
@hannesrudolph hannesrudolph added PR - Needs Preliminary Review and removed Issue/PR - Triage New issue. Needs quick review to confirm validity and assign labels. labels Aug 12, 2025
@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Changes Requested] in Roo Code Roadmap Aug 12, 2025
@NaccOll
Contributor Author

NaccOll commented Aug 12, 2025

@daniel-lxs

The purpose of assigning an uninitialized service is to ensure that pWaitFor returns a result. That result may be undefined (because initialization failed and the checkpoint feature was turned off), or the service may still be uninitialized after the timeout, in which case the checkpoint feature needs to be turned off.

The initial failure occurred because the Task called getCheckpointService (async), which then called checkGitInstallation (async), so initialization was effectively doubly asynchronous. However, if a tool triggered a file edit at this point, it would call checkpointSave synchronously. checkpointSave therefore ran before initialization had completed and before cline.checkpointService was set, forcing a second initialization call. The relevant code is as follows:
src\core\checkpoints\index.ts:71-80

const service = RepoPerTaskCheckpointService.create(options)

cline.checkpointServiceInitializing = true

// Check if Git is installed before initializing the service
// Note: This is intentionally fire-and-forget to match the original IIFE pattern
// The service is returned immediately while Git check happens asynchronously
checkGitInstallation(cline, service, log, provider)

return service

However, because the Git check runs asynchronously, service.isInitialized is still false at that point, which ultimately triggers the following logic and disables the checkpoint feature for the current task.
src\core\checkpoints\index.ts:170-175

if (!service.isInitialized) {
    const provider = cline.providerRef.deref()
    provider?.log("[checkpointSave] checkpoints didn't initialize in time, disabling checkpoints for this task")
    cline.enableCheckpoints = false
    return
}

In other words, the earliest checkpoint failure came from how checkpointSave was called. When it is called on its own, initialization must already be complete; otherwise it will inevitably receive a service that is not fully initialized.

The cause is different now. Originally, I called checkpointSave once after initialization, which blocked and let the feature work properly. However, because two checkpoints could end up very close together, we removed that call. This is where the new problem begins.

Now, if checkpointSave is called before initialization has finished, cline.checkpointService is undefined, so initialization starts again. An initialization task is already running, however, so a git command fails, ultimately triggering the following code.

src\core\checkpoints\index.ts:92-95

// Clean up on failure
cline.checkpointServiceInitializing = false
cline.enableCheckpoints = false
throw err

This is the complete analysis of both errors.
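
The second failure mode can be reproduced in miniature. This is an illustrative sketch, not code from the repository: MiniTask, naiveGetService, and demo are hypothetical names, and the sleep call stands in for the Git setup work.

```typescript
// Hypothetical miniature of the task state; not the real Task class.
interface MiniTask {
    service?: { isInitialized: boolean }
    initCount: number
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Naive lookup: the service is published only after the async init finishes,
// and nothing marks an initialization as being in flight.
async function naiveGetService(task: MiniTask, initMs: number) {
    if (task.service) return task.service
    task.initCount++ // a second caller arriving during the await also gets here
    await sleep(initMs) // stands in for the Git setup work
    task.service = { isInitialized: true }
    return task.service
}

// Two overlapping callers, e.g. task start plus an early file edit.
async function demo(): Promise<number> {
    const task: MiniTask = { initCount: 0 }
    await Promise.all([naiveGetService(task, 20), naiveGetService(task, 20)])
    return task.initCount // 2: initialization ran twice
}
```

In the real service the two initializations operate on the same repository, which is where the git command error described above comes from.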

The solution to the current problem is the one in this PR: set cline.checkpointService up front, and block while checkpointServiceInitializing is true.

The logic for the first call to getCheckpointService is straightforward:

Set checkpointServiceInitializing to true and set checkpointService. At this point, isInitialized is false.

Subsequent calls fall into one of several scenarios.

  1. Initialization has completed, isInitialized = true, and the checkpoint is saved.
  2. Initialization has not yet completed (isInitialized = false): enter pWaitFor and wait up to 15 seconds. If initialization finishes within that time, isInitialized becomes true, and the checkpoint is saved.
  3. Initialization has not yet completed (isInitialized = false): enter pWaitFor and wait up to 15 seconds. If the wait times out, isInitialized is still false, and the task disables checkpointing.
  4. Initialization fails, triggering the exception handling logic, and enableCheckpoints = false.
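
The four scenarios can be sketched as a single wait-then-save routine. This is a simplified illustration under assumed names (Svc, SaveTask, saveWhenReady, and the polling loop are all hypothetical); the real code uses pWaitFor rather than a manual loop.

```typescript
// Hypothetical stand-ins for the service and task state.
interface Svc {
    isInitialized: boolean
}
interface SaveTask {
    enableCheckpoints: boolean
    checkpointService?: Svc
}

type Outcome = "saved" | "disabled"

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

async function saveWhenReady(
    task: SaveTask,
    save: () => void,
    timeoutMs = 15_000,
    pollMs = 50,
): Promise<Outcome> {
    const service = task.checkpointService
    if (!task.enableCheckpoints || !service) return "disabled"

    const deadline = Date.now() + timeoutMs
    while (!service.isInitialized) {
        // Scenario 4: initialization failed elsewhere and turned checkpoints off.
        if (!task.enableCheckpoints) return "disabled"
        // Scenario 3: still not initialized at the deadline, so disable for this task.
        if (Date.now() >= deadline) {
            task.enableCheckpoints = false
            return "disabled"
        }
        await sleep(pollMs)
    }

    // Scenarios 1 and 2: initialized already, or became initialized while waiting.
    save()
    return "saved"
}
```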

@NaccOll
Contributor Author

NaccOll commented Aug 12, 2025

@daniel-lxs Of course, we could also discuss not delegating the disabling of checkpoints to other functions, so that all checkpoint errors are handled only in getCheckpointService.

@NaccOll
Contributor Author

NaccOll commented Aug 12, 2025

@daniel-lxs Please see the latest commit, I have simplified the process

@daniel-lxs daniel-lxs moved this from PR [Changes Requested] to PR [Needs Prelim Review] in Roo Code Roadmap Aug 12, 2025
Member

@daniel-lxs daniel-lxs left a comment

Thank you @NaccOll

@daniel-lxs daniel-lxs moved this from PR [Needs Prelim Review] to PR [Needs Review] in Roo Code Roadmap Aug 12, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Aug 12, 2025
@mrubens mrubens merged commit 2730ef9 into RooCodeInc:main Aug 12, 2025
13 checks passed
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Aug 12, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 12, 2025

Labels

bug Something isn't working lgtm This PR has been approved by a maintainer PR - Changes Requested size:M This PR changes 30-99 lines, ignoring generated files.

Projects

Archived in project

4 participants