@yileicn
Collaborator

@yileicn yileicn commented Nov 18, 2025

PR Type

Enhancement, Bug fix
Fixes job crashes caused by Redis timeouts.


Description

  • Add exception handling to prevent service crashes

  • Implement try-catch block with logging and recovery

  • Increase delay to 3 seconds on error for resilience

  • Improve code structure with better error management


Diagram Walkthrough

flowchart LR
  A["CrontabWatcher Loop"] --> B["Try Block"]
  B --> C["Lock & Run Checker"]
  C --> D{Lock Acquired?}
  D -->|Yes| E["Execute Task"]
  D -->|No| F["Delay 1s"]
  E --> G["Delay 1s"]
  C --> H["Catch Exception"]
  H --> I["Log Warning"]
  I --> J["Delay 3s"]
  F --> A
  G --> A
  J --> A

File Walkthrough

Relevant files
Error handling
CrontabWatcher.cs
Add exception handling and error recovery                               

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabWatcher.cs

  • Wrapped main loop logic in try-catch block for exception handling
  • Added exception logging with descriptive warning message
  • Increased error recovery delay from 1 second to 3 seconds
  • Improved code readability with better indentation and structure
+15/-8   

@qodo-merge-pro

qodo-merge-pro bot commented Nov 18, 2025

PR Compliance Guide 🔍

(Compliance updated until commit c6b476c)

Below is a summary of compliance checks for this PR:

Security Compliance
🟢
No security concerns identified
No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Insufficient Auditing: The added error handling only logs a warning message without structured context (e.g.,
user/job identifiers), making it unclear if critical cron actions are fully audited.

Referred Code
try
{
    var isLocked = await locker.LockAsync(DIST_KEY, async () =>
    {
        await RunCronChecker(scope.ServiceProvider);
        await Task.Delay(1000, stoppingToken);
    });
    if (!isLocked)
    {
        await Task.Delay(1000, stoppingToken);
    }
}
catch (Exception ex)
{
    _logger.LogWarning($"Crontab Watcher background service error: {ex.Message}");
    await Task.Delay(3000, stoppingToken);
}

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Generic Warning: Exceptions are caught and only the message is logged without actionable context (e.g.,
lock key, retry count, exception type) which may hinder production debugging.

Referred Code
catch (Exception ex)
{
    _logger.LogWarning($"Crontab Watcher background service error: {ex.Message}");
    await Task.Delay(3000, stoppingToken);
}

Compliance status legend
🟢 - Fully Compliant
🟡 - Partially Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

Previous compliance checks

Compliance check up to commit c6b476c
Security Compliance
🟢
No security concerns identified
No security vulnerabilities detected by AI analysis. Human verification advised for critical code.
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status:
Action logging: The new error-handling path only logs a warning on exceptions and does not add audit-style
logs for critical actions like lock acquisition attempts, task execution outcomes, or lock
contention results, making full auditability unclear.

Referred Code
try
{
    var isLocked = await locker.LockAsync(DIST_KEY, async () =>
    {
        await RunCronChecker(scope.ServiceProvider);
        await Task.Delay(1000, stoppingToken);
    });
    if (!isLocked)
    {
        await Task.Delay(1000, stoppingToken);
    }
}
catch (Exception ex)
{
    _logger.LogWarning($"Crontab Watcher background service error: {ex.Message}");
    await Task.Delay(3000, stoppingToken);
}

Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Generic warning: The catch block logs only ex.Message without context (e.g., which operation failed, lock
key, retries) and does not differentiate cancellation exceptions, which may hinder
actionable diagnostics.

Referred Code
catch (Exception ex)
{
    _logger.LogWarning($"Crontab Watcher background service error: {ex.Message}");
    await Task.Delay(3000, stoppingToken);
}

@qodo-merge-pro

PR Code Suggestions ✨

Explore these optional code suggestions:

Category / Suggestion / Impact

Category: Possible issue
Prevent memory leaks from captive dependencies

To prevent memory leaks from captive dependencies, move the using (var scope =
...) block inside the while loop. This ensures a new scope is created for each
iteration and its services are properly disposed.

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabWatcher.cs [28-54]

-using (var scope = _services.CreateScope())
+while (!stoppingToken.IsCancellationRequested)
 {
-    var locker = scope.ServiceProvider.GetRequiredService<IDistributedLocker>();
+    try
+    {
+        using var scope = _services.CreateScope();
+        var locker = scope.ServiceProvider.GetRequiredService<IDistributedLocker>();
+        var isLocked = await locker.LockAsync(DIST_KEY, async () =>
+        {
+            await RunCronChecker(scope.ServiceProvider);
+            await Task.Delay(1000, stoppingToken);
+        });
 
-    while (!stoppingToken.IsCancellationRequested)
-    {
-        try
+        if (!isLocked)
         {
-            var isLocked = await locker.LockAsync(DIST_KEY, async () =>
-            {
-                await RunCronChecker(scope.ServiceProvider);
-                await Task.Delay(1000, stoppingToken);
-            });
-            if (!isLocked)
-            {
-                await Task.Delay(1000, stoppingToken);
-            }
-        }
-        catch (Exception ex)
-        {
-            _logger.LogWarning($"Crontab Watcher background service error: {ex.Message}");
-            await Task.Delay(3000, stoppingToken);
+            await Task.Delay(1000, stoppingToken);
         }
     }
-
-    _logger.LogWarning("Crontab Watcher background service is stopped.");
+    catch (Exception ex)
+    {
+        _logger.LogWarning(ex, "Crontab Watcher background service error");
+        await Task.Delay(3000, stoppingToken);
+    }
 }
 
+_logger.LogWarning("Crontab Watcher background service is stopped.");
+

[To ensure code accuracy, apply this suggestion manually]

Suggestion importance[1-10]: 9

Why: This suggestion identifies a critical "captive dependency" issue that can lead to memory leaks in a long-running background service. Moving the IServiceScope creation inside the loop is the correct fix and is crucial for application stability.

Impact: High

Category: General
Log the full exception object

Improve exception logging by passing the full exception object to
_logger.LogWarning instead of just the message string to include the stack trace
and other details.

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabWatcher.cs [48]

-_logger.LogWarning($"Crontab Watcher background service error: {ex.Message}");
+_logger.LogWarning(ex, "Crontab Watcher background service error");
Suggestion importance[1-10]: 7

Why: The suggestion correctly points out that logging only ex.Message loses important debugging information and proposes using the standard ILogger overload to log the full exception, which is a significant improvement for observability.
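
To make the difference concrete, here is a minimal sketch that uses only the base class library. It does not use `ILogger`; `ex.ToString()` stands in for the detail a logging provider typically has access to when it receives the exception object. The class name `ExceptionDetailSketch`, the method `CaptureFailure`, and both exception messages are hypothetical and exist only for illustration.

```csharp
using System;

public static class ExceptionDetailSketch
{
    // Hypothetical failure: a timeout wrapped by a higher-level error,
    // similar in shape to a Redis timeout surfacing through a lock call.
    public static Exception CaptureFailure()
    {
        try
        {
            try
            {
                throw new TimeoutException("Simulated Redis timeout");
            }
            catch (TimeoutException inner)
            {
                // Wrapping hides the root cause from the outer Message.
                throw new InvalidOperationException("Lock acquisition failed", inner);
            }
        }
        catch (Exception ex)
        {
            return ex;
        }
    }
}
```

Printing `CaptureFailure().Message` yields only the outer text, while `CaptureFailure().ToString()` also carries both exception type names, the inner message, and the stack trace. That extra detail is what is lost when only `ex.Message` is interpolated into the log line.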

Impact: Medium

Category: Learned best practice
Honor cancellation without leaking tokens

Avoid passing the loop's cancellation token into inner delays; instead await the
work and delay with WaitAsync(stoppingToken) and catch
OperationCanceledException to allow proper cleanup.

src/Infrastructure/BotSharp.Core.Crontab/Services/CrontabWatcher.cs [36-44]

-var isLocked = await locker.LockAsync(DIST_KEY, async () =>
+try
 {
-    await RunCronChecker(scope.ServiceProvider);
-    await Task.Delay(1000, stoppingToken);
-});
-if (!isLocked)
+    var isLocked = await locker.LockAsync(DIST_KEY, async () =>
+    {
+        await RunCronChecker(scope.ServiceProvider);
+        await Task.Delay(1000);
+    });
+    if (!isLocked)
+    {
+        await Task.Delay(1000).WaitAsync(stoppingToken);
+    }
+    else
+    {
+        await Task.Delay(0).WaitAsync(stoppingToken); // noop yield honoring cancellation
+    }
+}
+catch (OperationCanceledException)
 {
-    await Task.Delay(1000, stoppingToken);
+    throw;
 }
Suggestion importance[1-10]: 6

Why: Relevant best practice - Ensure cooperative cancellation and cleanup by not passing cancellation tokens into Task.Run work scheduled inside loops; await with WaitAsync(cancellationToken) and catch OperationCanceledException so finally/cleanup can run.
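
One way to separate shutdown from genuine failures, sketched here without any BotSharp types, is an exception filter on `OperationCanceledException` that checks the stopping token. This is an illustration under stated assumptions, not the merged code: `WatcherLoopSketch`, `RunAsync`, `work`, and `errorDelay` are hypothetical names, `work` stands in for the lock-and-check step, and `Console.WriteLine` stands in for the logger.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WatcherLoopSketch
{
    // Runs `work` repeatedly until cancellation; returns the number of
    // iterations that failed for reasons other than cancellation.
    public static async Task<int> RunAsync(
        Func<CancellationToken, Task> work,
        TimeSpan errorDelay,
        CancellationToken stoppingToken)
    {
        var failures = 0;
        while (!stoppingToken.IsCancellationRequested)
        {
            try
            {
                await work(stoppingToken);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                // Shutdown was requested: exit quietly rather than log a warning.
                break;
            }
            catch (Exception ex)
            {
                failures++;
                Console.WriteLine($"Iteration failed: {ex.GetType().Name}: {ex.Message}");
                try
                {
                    // Back off before retrying; the delay itself may be cancelled.
                    await Task.Delay(errorDelay, stoppingToken);
                }
                catch (OperationCanceledException)
                {
                    break;
                }
            }
        }
        return failures;
    }
}
```

With this shape, a cancelled delay during shutdown is neither counted nor logged as an error, and a cancellation that arrives during the back-off delay ends the loop instead of escaping from the `catch` block.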

Impact: Low

@adenchen123
Contributor

Reviewed

@yileicn yileicn merged commit 1716ec0 into SciSharp:master Nov 18, 2025
4 checks passed

2 participants