Improve checkpoint reliability and cleanup of temp files #1367

nicktrn · 2024-09-27T14:33:46Z

Coordinator changes:

- replace execa with tinyexec
- robust abstractions for calling binaries
- better cleanup of checkpoint artifacts
- delete other temporary files on an interval

Summary by CodeRabbit

New Features
- Introduced TempFileCleaner for managing temporary file deletions.
- Added utility functions for handling environment variables.
Improvements
- Simplified ChaosMonkey functionality by removing unnecessary parameters.
- Enhanced Checkpointer class with improved error handling and modular design.
- Established structured command execution and container management with new classes.
Bug Fixes
- Streamlined error handling in various classes for better resilience.
Chores
- Updated dependencies by removing execa and adding tinyexec.

changeset-bot · 2024-09-27T14:33:50Z

⚠️ No Changeset found

Latest commit: 57c1f5f

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


coderabbitai · 2024-09-27T14:36:48Z

Walkthrough

The pull request introduces significant changes across multiple files in the apps/coordinator module. Key modifications include the removal of the execa dependency and its replacement with tinyexec, updates to the ChaosMonkey class to simplify command execution, and enhancements to the Checkpointer class for improved error handling and modularization. Additionally, new utility functions for environment variable management are introduced in util.ts, while a new TempFileCleaner class is added to manage temporary file deletions.
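The `Exec` class itself is not shown in this thread, so the following is only a minimal sketch of what a "never throw" wrapper around a binary call could look like. It uses Node's built-in `node:child_process` instead of tinyexec to stay dependency-free, and the names `runBinary` and `RunResult` are hypothetical, not taken from the PR.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape; not the type used in apps/coordinator/src/exec.ts.
interface RunResult {
  ok: boolean;
  exitCode: number | null;
  stdout: string;
  stderr: string;
}

// Hypothetical helper: runs a binary, trims arguments and output,
// and reports failures through the result instead of throwing.
function runBinary(command: string, args: string[] = []): RunResult {
  // Trim each argument so stray whitespace or trailing newlines never reach the binary.
  const cleanArgs = args.map((arg) => arg.trim());

  const result = spawnSync(command, cleanArgs, { encoding: "utf8" });

  // spawnSync reports spawn failures (for example a missing binary) via `error`.
  if (result.error) {
    return { ok: false, exitCode: null, stdout: "", stderr: result.error.message };
  }

  return {
    ok: result.status === 0,
    exitCode: result.status,
    stdout: result.stdout.trim(),
    stderr: result.stderr.trim(),
  };
}
```

A caller would branch on `ok` rather than wrapping every call in `try`/`catch`, for example `const { ok, stdout } = runBinary("crictl", ["ps"]);`.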

Changes

| File Path | Change Summary |
| --- | --- |
| `apps/coordinator/package.json` | Removed dependency `execa` version `^8.0.1`; added dependency `tinyexec` version `^0.3.0`. |
| `apps/coordinator/src/chaosMonkey.ts` | Updated `call` method in `ChaosMonkey` class by removing optional parameter `$`, simplifying command execution logic, and streamlining error handling. |
| `apps/coordinator/src/checkpointer.ts` | Removed `CheckpointAbortError` class; refactored error handling; added conditional `TempFileCleaner` initialization; updated `init` method for login checks; modularized checkpointing into `#createDockerCheckpoint` method. |
| `apps/coordinator/src/cleaner.ts` | Introduced `TempFileCleaner` class for managing temporary file deletions, including methods for starting/stopping the cleaning process and logging operations. |
| `apps/coordinator/src/exec.ts` | Added `Exec`, `Buildah`, and `Crictl` classes for command execution and container management, with various methods for handling container operations and logging. |
| `apps/coordinator/src/index.ts` | Removed utility functions `boolFromEnv` and `numFromEnv`, replacing them with imports from `./util`. |
| `apps/coordinator/src/util.ts` | Introduced utility functions `boolFromEnv` and `numFromEnv` for safely retrieving boolean and numeric values from environment variables. |
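To make the interval-based cleanup concrete, here is a rough sketch of a cleaner that deletes stale files on a timer. It is not the `TempFileCleaner` from this PR (the review snippets suggest that class shells out to `find`); the class name `IntervalFileCleaner`, its constructor parameters, and the use of `node:fs` are all assumptions made for illustration.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical sketch, not the PR's TempFileCleaner.
class IntervalFileCleaner {
  private readonly dir: string;
  private readonly maxAgeMs: number;
  private readonly intervalMs: number;
  private timer: ReturnType<typeof setInterval> | undefined;

  constructor(dir: string, maxAgeMs: number, intervalMs: number) {
    this.dir = dir;
    this.maxAgeMs = maxAgeMs;
    this.intervalMs = intervalMs;
  }

  // Deletes regular files directly under `dir` whose mtime is older than maxAgeMs.
  // Returns the paths that were deleted.
  cleanOnce(now: number = Date.now()): string[] {
    const deleted: string[] = [];
    for (const entry of fs.readdirSync(this.dir, { withFileTypes: true })) {
      if (!entry.isFile()) continue;
      const filePath = path.join(this.dir, entry.name);
      try {
        if (now - fs.statSync(filePath).mtimeMs > this.maxAgeMs) {
          fs.rmSync(filePath, { force: true });
          deleted.push(filePath);
        }
      } catch {
        // The file may vanish between readdir and stat; skip it instead of failing the pass.
      }
    }
    return deleted;
  }

  // Starts the periodic cleanup; calling start() twice does not create a second timer.
  start(): void {
    if (this.timer !== undefined) return;
    this.timer = setInterval(() => {
      try {
        this.cleanOnce();
      } catch {
        // For example the directory does not exist yet; try again on the next tick.
      }
    }, this.intervalMs);
  }

  // Stops the periodic cleanup so the timer cannot leak.
  stop(): void {
    if (this.timer !== undefined) clearInterval(this.timer);
    this.timer = undefined;
  }
}
```

Keeping `cleanOnce` separate from the timer makes a single pass easy to test, and `stop()` gives the owner a way to shut the cleaner down cleanly.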

Possibly related PRs

- Prevent abort signals from causing uncaught exceptions #1320: The changes in this PR involve error handling within the Checkpointer class, which is relevant to the modifications made in the main PR regarding the ChaosMonkey class's command execution logic, as both involve error management and handling of asynchronous operations.
- Prevent crashes on expected checkpoint cancellations #1324: This PR enhances error handling in the TaskCoordinator class, which may relate to the overall error handling improvements seen in the main PR, particularly in the context of managing dependencies and execution flow.

🐇 In the code's cozy burrow,
Dependencies shed, new paths to follow.
ChaosMonkey hops with less to stress,
Cleaners tidy up, leaving no mess.
With utilities bright, the code shines anew,
A rabbit's cheer for changes so true! 🌟


pkg-pr-new · 2024-09-27T14:39:24Z

pnpm add https://pkg.pr.new/triggerdotdev/trigger.dev/@trigger.dev/build@1367

pnpm add https://pkg.pr.new/triggerdotdev/trigger.dev/@trigger.dev/core@1367

pnpm add https://pkg.pr.new/triggerdotdev/trigger.dev@1367

pnpm add https://pkg.pr.new/triggerdotdev/trigger.dev/@trigger.dev/sdk@1367

commit: 57c1f5f

coderabbitai

Actionable comments posted: 16

🧹 Outside diff range and nitpick comments (8)

apps/coordinator/src/util.ts (1)

1-19: Overall assessment of util.ts

The introduction of boolFromEnv and numFromEnv utility functions is a positive step towards more consistent and safer handling of environment variables. These functions provide a good foundation for converting environment variables to boolean and numeric values, respectively.

However, as noted in the individual function reviews, there's room for improvement in terms of robustness and versatility. Implementing the suggested enhancements will make these utilities more reliable and flexible, handling a wider range of input scenarios and providing better error handling.

To maximize the benefits of these new utilities:

- Consider implementing the suggested improvements for both functions.
- Use the verification script provided to identify areas in the codebase where these functions can replace direct process.env access.
- Document these utilities and encourage their use across the project for consistent environment variable handling.

As the project grows, consider expanding this utility file with other commonly used functions for environment variable handling or configuration management. This can help maintain a centralized and consistent approach to dealing with environment-dependent values throughout the application.
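The implementations in `util.ts` are not included in this thread. As a rough illustration of the more defensive behaviour the review asks for, the helpers could be sketched as below. The signatures, the accepted spellings of true/false, and the fallback rules are assumptions, not the PR's actual code.

```typescript
// Hypothetical sketch; the real boolFromEnv in apps/coordinator/src/util.ts may differ.
function boolFromEnv(name: string, defaultValue: boolean): boolean {
  const raw = process.env[name]?.trim().toLowerCase();
  if (raw === undefined || raw === "") return defaultValue;
  if (raw === "true" || raw === "1") return true;
  if (raw === "false" || raw === "0") return false;
  // Unrecognised values fall back to the default instead of silently becoming false.
  return defaultValue;
}

// Hypothetical sketch; the real numFromEnv may differ.
function numFromEnv(name: string, defaultValue?: number): number | undefined {
  const raw = process.env[name]?.trim();
  // Number("") is 0, so treat empty strings as "not set".
  if (raw === undefined || raw === "") return defaultValue;
  const parsed = Number(raw);
  // Reject NaN and Infinity so a typo never turns into a bogus numeric setting.
  return Number.isFinite(parsed) ? parsed : defaultValue;
}
```

For example, `numFromEnv("SOME_INTERVAL_MS", 60000)` would return `60000` when the (hypothetical) variable is unset or set to a non-numeric value.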

apps/coordinator/package.json (1)

22-23: Update project documentation to reflect dependency changes.

Consider updating the project's README or documentation to reflect the replacement of execa with tinyexec. This will help other developers understand the change and its implications for the project.

apps/coordinator/src/chaosMonkey.ts (1)

60-60: LGTM: Simplified implementation improves code clarity.

The changes in the call method implementation are well-aligned with the PR objectives:

- Direct use of the timeout function simplifies the code and removes the dependency on execa.
- Consistent use of ChaosMonkey.Error improves error handling uniformity.

These changes contribute to a more straightforward and maintainable implementation.

Consider adding a brief comment explaining the purpose of the timeout function call, as it might not be immediately clear to new developers why a delay is being introduced. For example:
// Introduce artificial delay to simulate network latency or processing time
await timeout(this.delayInSeconds * 1000);
Also applies to: 68-68

apps/coordinator/src/index.ts (4)

Line range hint 93-101: Consider improving error handling in #returnValidatedExtraHeaders

The current implementation throws a generic Error when an extra header is undefined. Consider using a more specific error type or providing more context in the error message. This would make it easier to handle and debug issues related to header validation.

Here's a suggested improvement:
 #returnValidatedExtraHeaders(headers: Record<string, string>) {
   for (const [key, value] of Object.entries(headers)) {
     if (value === undefined) {
-      throw new Error(`Extra header is undefined: ${key}`);
+      throw new HeaderValidationError(`Extra header '${key}' is undefined`);
     }
   }

   return headers;
 }

+class HeaderValidationError extends Error {
+  constructor(message: string) {
+    super(message);
+    this.name = 'HeaderValidationError';
+  }
+}
This change introduces a custom error type, making it easier to catch and handle specific header validation errors throughout the application.

Line range hint 359-397: Improve error handling in READY_FOR_LAZY_ATTEMPT handler

The current implementation catches all errors and only logs them before crashing the run. This approach might hide important errors or make debugging more difficult. Consider differentiating between expected and unexpected errors, and handle them accordingly.

Here's a suggested improvement:
 socket.on("READY_FOR_LAZY_ATTEMPT", async (message) => {
   logger.log("[READY_FOR_LAZY_ATTEMPT]", message);

   try {
     const lazyAttempt = await this.#platformSocket?.sendWithAck("READY_FOR_LAZY_ATTEMPT", {
       ...message,
       envId: socket.data.envId,
     });

     if (!lazyAttempt) {
       throw new Error("No lazy attempt ack");
     }

     if (!lazyAttempt.success) {
       throw new Error("Failed to get lazy attempt payload");
     }

     await chaosMonkey.call();

     socket.emit("EXECUTE_TASK_RUN_LAZY_ATTEMPT", {
       version: "v1",
       lazyPayload: lazyAttempt.lazyPayload,
     });
   } catch (error) {
     if (error instanceof ChaosMonkey.Error) {
       logger.error("ChaosMonkey error, won't crash run", { runId: socket.data.runId });
       return;
     }

-    logger.error("Error", { error });
-
-    await crashRun({
-      name: "ReadyForLazyAttemptError",
-      message:
-        error instanceof Error ? `Unexpected error: ${error.message}` : "Unexpected error",
-    });
+    if (error instanceof Error) {
+      logger.error("ReadyForLazyAttemptError", { error: error.message, stack: error.stack });
+      await crashRun({
+        name: "ReadyForLazyAttemptError",
+        message: error.message,
+        stack: error.stack,
+      });
+    } else {
+      logger.error("Unknown ReadyForLazyAttemptError", { error });
+      await crashRun({
+        name: "UnknownReadyForLazyAttemptError",
+        message: "An unknown error occurred",
+      });
+    }

     return;
   }
 });
This change provides more detailed error logging and differentiates between Error instances and other types of errors, which can help in debugging and error tracking.

Line range hint 1008-1022: Enhance logging in #cancelCheckpoint method

The #cancelCheckpoint method currently logs only the final result. Consider adding more detailed logging to track the method's execution flow, which could be helpful for debugging and monitoring.

Here's a suggested improvement:
 #cancelCheckpoint(runId: string): boolean {
+  logger.debug("Attempting to cancel checkpoint", { runId });
   const checkpointWait = this.#checkpointableTasks.get(runId);

   if (checkpointWait) {
+    logger.debug("Found checkpoint wait, rejecting", { runId });
     // Stop waiting for task to reach checkpointable state
     checkpointWait.reject(new CheckpointCancelError());
+  } else {
+    logger.debug("No checkpoint wait found", { runId });
   }

   // Cancel checkpointing procedure
   const checkpointCanceled = this.#checkpointer.cancelCheckpoint(runId);

-  logger.log("cancelCheckpoint()", { runId, checkpointCanceled });
+  logger.log("Checkpoint cancellation result", { runId, checkpointCanceled });

   return checkpointCanceled;
 }
This change adds more detailed logging throughout the method, which can help track the execution flow and make debugging easier.

Line range hint 1039-1043: Add authentication to /checkpoint endpoint

The /checkpoint endpoint currently lacks authentication, which could pose a security risk. Consider adding an authentication mechanism to ensure that only authorized clients can trigger checkpoints.

Here's a suggested improvement:
 case "/checkpoint": {
   const body = await getTextBody(req);
+  const authToken = req.headers['authorization'];
+  
+  if (!authToken || !this.#isValidAuthToken(authToken)) {
+    return reply.empty(401);
+  }
+  
   // await this.#checkpointer.checkpointAndPush(body);
   return reply.text(`sent restore request: ${body}`);
 }

+ #isValidAuthToken(token: string): boolean {
+   // Implement your authentication logic here
+   // For example, compare against a stored secret or validate with an auth service
+   return token === process.env.CHECKPOINT_SECRET;
+ }
This change adds a basic authentication check using an Authorization header. Make sure to implement proper token validation in the #isValidAuthToken method and securely manage the CHECKPOINT_SECRET.
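One caveat on the suggested check: comparing secrets with `===` is not constant-time. A minimal sketch of a timing-safe variant, assuming Node's built-in `node:crypto` module, is shown below. The function name mirrors the suggestion above, but the two-argument signature is hypothetical.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper; `secret` would come from something like process.env.CHECKPOINT_SECRET.
function isValidAuthToken(token: string | undefined, secret: string | undefined): boolean {
  // Reject when either side is missing so an unset secret can never match.
  if (!token || !secret) return false;

  // Hash both values first: timingSafeEqual requires buffers of equal length.
  const tokenDigest = createHash("sha256").update(token).digest();
  const secretDigest = createHash("sha256").update(secret).digest();

  return timingSafeEqual(tokenDigest, secretDigest);
}
```

Hashing before comparing also avoids leaking the secret's length through an early length check.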
apps/coordinator/src/cleaner.ts (1)

99-101: Rename the x getter for better readability

The getter x returns a bound function for executing commands, but the single-letter name x is not descriptive.

Consider renaming it to improve code clarity:
- private get x() {
+ private get execCommand() {
    return this.exec.x.bind(this.exec);
  }
Update all references from this.x to this.execCommand:
- const du = this.x("find", [...baseArgs, ...duArgs]);
+ const du = this.execCommand("find", [...baseArgs, ...duArgs]);

- const rm = this.x("find", [...baseArgs, ...rmArgs]);
+ const rm = this.execCommand("find", [...baseArgs, ...rmArgs]);
This change enhances readability and makes the purpose of the method clearer.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between f154fcc and 9437c7e.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (7)

apps/coordinator/package.json (1 hunks)
apps/coordinator/src/chaosMonkey.ts (1 hunks)
apps/coordinator/src/checkpointer.ts (12 hunks)
apps/coordinator/src/cleaner.ts (1 hunks)
apps/coordinator/src/exec.ts (1 hunks)
apps/coordinator/src/index.ts (1 hunks)
apps/coordinator/src/util.ts (1 hunks)

🧰 Additional context used

🪛 Biome

apps/coordinator/src/checkpointer.ts

[error] 583-583: Unsafe usage of 'return'.

'return' in 'finally' overwrites the control flow statements inside 'try' and 'catch'.

(lint/correctness/noUnsafeFinally)
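To illustrate what this lint rule guards against, here is a small standalone sketch. It is not code from `checkpointer.ts`, and the function names are made up for the example.

```typescript
// A `return` inside `finally` replaces whatever `try` was about to do,
// including a pending exception.
function unsafeFinally(work: () => string): string {
  try {
    return work();
  } finally {
    // This return discards both the value and any error produced by work().
    return "cleanup done";
  }
}

// Safer shape: run cleanup in `finally`, but let the result or error flow through.
function safeFinally(work: () => string, onCleanup: () => void): string {
  try {
    return work();
  } finally {
    onCleanup();
  }
}
```

With a `work` function that throws, `unsafeFinally` returns `"cleanup done"` and the caller never sees the error, whereas `safeFinally` still runs the cleanup and rethrows.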

apps/coordinator/src/exec.ts

[error] 276-276: Using this in a static context can be confusing.

this refers to the class.
Unsafe fix: Use the class name instead.

(lint/complexity/noThisInStatic)
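As a brief illustration of why this rule exists, consider the sketch below. The class `Tool` and its members are hypothetical and unrelated to the classes in `exec.ts`.

```typescript
// Hypothetical class used only to illustrate the lint rule.
class Tool {
  static binaryName = "example-binary";

  // `this` is the class only while the method is called as Tool.viaThis().
  static viaThis(): string {
    return this.binaryName;
  }

  // Referring to the class by name keeps working even when the method is detached.
  static viaClassName(): string {
    return Tool.binaryName;
  }
}

// Detaching a static method (for example to pass it as a callback) loses `this`.
const detachedViaThis = Tool.viaThis;
const detachedViaClassName = Tool.viaClassName;
```

Calling `detachedViaClassName()` still returns the binary name, while `detachedViaThis()` throws a `TypeError` because `this` is `undefined` inside class code.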

🔇 Additional comments (7)

apps/coordinator/package.json (1)
22-23: LGTM! Dependency changes look good.

The replacement of execa with tinyexec aligns with the PR objectives to streamline binary execution. The socket.io dependency remains unchanged, maintaining existing network functionality.

To ensure a smooth transition from execa to tinyexec, please run the following script to verify the usage:

This script will help identify any remaining execa imports, confirm tinyexec usage, and highlight areas where execa-like functionality might still be present.

✅ Verification successful

Re-running verification with corrected commands.

The previous script encountered errors due to the unrecognized file type typescript. Please run the following corrected script to verify the replacement of execa with tinyexec:

Verified: execa has been successfully replaced with tinyexec. No remaining execa imports found in the codebase. All execution functionalities are appropriately handled using tinyexec.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Description: Verify the replacement of execa with tinyexec

# Test 1: Check for any remaining imports of execa
echo "Checking for remaining execa imports:"
rg --type typescript 'import.*execa' apps/coordinator/src

# Test 2: Verify tinyexec usage
echo "Verifying tinyexec usage:"
rg --type typescript 'import.*tinyexec' apps/coordinator/src

# Test 3: Look for potential areas where execa might have been used
echo "Checking for potential areas where execa might have been used:"
rg --type typescript '(spawn|exec|execFile|execaCommand)' apps/coordinator/src
Length of output: 589

Script:
#!/bin/bash
# Description: Verify the replacement of execa with tinyexec

# Test 1: Check for any remaining imports of execa
echo "Checking for remaining execa imports:"
rg --type ts 'import.*execa' apps/coordinator/src

# Test 2: Verify tinyexec usage
echo "Verifying tinyexec usage:"
rg --type ts 'import.*tinyexec' apps/coordinator/src

# Test 3: Look for potential areas where execa might have been used
echo "Checking for potential areas where execa might have been used:"
rg --type ts '(spawn|exec|execFile|execaCommand)' apps/coordinator/src
Length of output: 3288

apps/coordinator/src/chaosMonkey.ts (2)

Line range hint 39-44: LGTM: Method signature simplification aligns with PR objectives.

The removal of the $?: Execa$<string> parameter from the call method signature is a positive change. It simplifies the method interface and aligns with the PR objective of replacing the execa library. This change contributes to reducing dependencies and streamlining the code.

Line range hint 1-92: Overall, these changes improve code quality and align with PR objectives.

The modifications to the ChaosMonkey class successfully simplify the implementation by removing the execa dependency and streamlining the call method. These changes contribute to:

- Reduced complexity in the codebase.
- Improved maintainability by removing external dependencies.
- Consistent error handling.

The changes align well with the PR objectives of improving reliability and streamlining execution processes.

apps/coordinator/src/index.ts (2)

17-17: LGTM: Improved code organization

The removal of boolFromEnv and numFromEnv functions and their import from ./util is a good refactoring. This change improves code organization and maintainability by centralizing utility functions in a separate file.

Line range hint 1-1068: Overall code quality is good, with some areas for improvement

The TaskCoordinator class is well-structured and implements complex logic for managing tasks, checkpoints, and communication between workers and the platform. The code demonstrates good use of async/await, proper error handling, and extensive logging.

Main areas for improvement:

- Error handling: Consider using more specific error types and providing more context in error messages.
- Logging: Enhance logging in some methods to provide more detailed information for debugging.
- Security: Add authentication to sensitive endpoints, particularly the /checkpoint endpoint.

These improvements will enhance the maintainability, debuggability, and security of the codebase.

apps/coordinator/src/cleaner.ts (1)

88-90: ⚠️ Potential issue

Ensure safety when executing rm -rf commands

Executing rm -rf commands with paths constructed from input can be dangerous if the inputs are not properly validated. There is a risk of deleting unintended files if startingPoint is not securely validated.

Consider adding validation to ensure that startingPoint is a safe and expected directory path. You might restrict it to a specific base directory or perform checks to prevent path traversal vulnerabilities.

Run the following script to verify the usage of startingPoint and ensure it's derived from trusted sources:

This script searches for all usages of TempFileCleaner and displays the paths being cleaned. Review the results to ensure they are safe.
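One way to express the suggested validation is a containment check against an allowed base directory. The sketch below is only an illustration: the function name `isSafeCleanupRoot` and the idea of a single allowed base are assumptions, and it does not resolve symlinks.

```typescript
import * as path from "node:path";

// Hypothetical guard: true only if `startingPoint` resolves to a location
// strictly inside `allowedBase`.
function isSafeCleanupRoot(startingPoint: string, allowedBase: string): boolean {
  const base = path.resolve(allowedBase);
  const target = path.resolve(startingPoint);
  const relative = path.relative(base, target);

  // An empty relative path means the target is the base itself, which we refuse to clean.
  if (relative === "") return false;
  // A leading ".." segment means the target escapes the base directory.
  if (relative === ".." || relative.startsWith(".." + path.sep)) return false;
  // An absolute result means the target is on a different root or drive.
  if (path.isAbsolute(relative)) return false;

  return true;
}
```

A cleaner could call this once at construction time and refuse to start if the check fails, so a misconfigured path never reaches `rm -rf`.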

apps/coordinator/src/checkpointer.ts (1)

306-306: Confirm that delay.milliseconds is the correct property

Ensure that the delay object provided by ExponentialBackoff has a milliseconds property. Using an incorrect property may result in unexpected delays or runtime errors.

Run this script to check if the delay object has a milliseconds property:

apps/coordinator/src/util.ts

apps/coordinator/src/cleaner.ts

apps/coordinator/src/checkpointer.ts

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 9437c7e and 57c1f5f.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (2)

apps/coordinator/src/checkpointer.ts (13 hunks)
apps/coordinator/src/cleaner.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

apps/coordinator/src/cleaner.ts

🧰 Additional context used

📓 Learnings (1)

apps/coordinator/src/checkpointer.ts (1)

Learnt from: nicktrn
PR: triggerdotdev/trigger.dev#1367
File: apps/coordinator/src/checkpointer.ts:588-631
Timestamp: 2024-09-30T11:35:37.554Z
Learning: Errors in the `#createDockerCheckpoint` method are handled in the caller, so additional error logging within this method is unnecessary.

🪛 Biome

apps/coordinator/src/checkpointer.ts

[error] 571-571: Unsafe usage of 'return'.

'return' in 'finally' overwrites the control flow statements inside 'try' and 'catch'.

(lint/correctness/noUnsafeFinally)

apps/coordinator/src/checkpointer.ts


nicktrn added 17 commits September 26, 2024 16:39

- improve cleanup reliability (2694e9a)
- Merge branch 'main' into checkpoint-cleanup (9fb9c6f)
- improve logging (dcd1743)
- bye-bye execa (25276f0)
- fix for trailing newlines (fd95693)
- prettier errors (adb3d44)
- trim args and log output by default (0cc518f)
- fix archive cleanup (e9415f5)
- prevent potential memleak (43430b3)
- more cleanup debug logs (34e4a64)
- ignore abort during cleanup (3967f96)
- rename checkpoint dir env var and move to helper (3f6c52f)
- add global never throw override (69a5e5b)
- add tmp cleaner (e124d3d)
- Merge remote-tracking branch 'origin/main' into checkpoint-cleanup (2221283)
- also clean up checkpoint dir by default (6ecf4ac)
- Merge remote-tracking branch 'origin/main' into checkpoint-cleanup (9437c7e)

coderabbitai bot reviewed Sep 27, 2024


nicktrn added 3 commits September 30, 2024 13:09

- split by any whitespace, not just tabs (6b75ba8)
- only create tmp cleaner if paths to clean (c900253)
- Merge remote-tracking branch 'origin/main' into checkpoint-cleanup (57c1f5f)

coderabbitai bot reviewed Sep 30, 2024

apps/coordinator/src/checkpointer.ts (resolved)

apps/coordinator/src/checkpointer.ts (resolved)

nicktrn merged commit 8da495a into main Sep 30, 2024
9 checks passed

nicktrn deleted the checkpoint-cleanup branch September 30, 2024 12:42

coderabbitai bot mentioned this pull request Oct 8, 2024

Improve coordinator logs and extend structured logger #1389

Merged

coderabbitai bot mentioned this pull request Oct 18, 2024

Fix several restore and resume bugs #1418

Merged

This was referenced Feb 14, 2025

Add support for manual checkpoints #1709

Merged

Add support for deferred checkpoints #1721

Merged
