Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(core): Handle SIGTERM for graceful shutdowns #16882

Open
wants to merge 27 commits into
base: main
Choose a base branch
from

Conversation

AndesKrrrrrrrrrrr
Copy link
Member

@AndesKrrrrrrrrrrr AndesKrrrrrrrrrrr commented Nov 14, 2024

Stop forceful shutdowns in-cluster.

See https://expressjs.com/en/advanced/healthcheck-graceful-shutdown.html

Nest.js documentation has a section explicitly about this issue:

  • For these events, if you're not calling app.close() explicitly, you must opt-in to make them work with system signals such as SIGTERM. See Application shutdown below.

Summary by CodeRabbit

  • New Features
    • Introduced a graceful exit handling mechanism for the server, improving robustness during shutdown scenarios.
    • Added a utility function for managing server shutdown processes in response to termination signals.
    • Implemented logging for shutdown events to enhance visibility during server exits.
  • Bug Fixes
    • Improved error handling during server closure to ensure proper logging and exit status.
  • Refactor
    • Updated function signatures in route handlers to clarify unused parameters, improving code readability.

Copy link
Contributor

coderabbitai bot commented Nov 14, 2024

Walkthrough

The pull request introduces a new function, setupShutdownHooks, to manage graceful server termination by listening for specific signals (SIGHUP, SIGINT, SIGTERM). This function is imported and utilized in various server files to enhance shutdown handling. Additionally, several route handler and middleware function signatures are updated to indicate unused parameters by prefixing them with an underscore. These changes improve the server's shutdown capabilities and clarify code intent regarding unused parameters.

Changes

File Path Change Summary
libs/infra-express-server/src/lib/infra-express-server.ts - Added import for setupShutdownHooks.
- Updated function signatures to use _req and _next for unused parameters in route handlers and middleware.
- Called setupShutdownHooks(server) in runServer.
libs/infra-next-server/src/lib/bootstrap.ts - Added import for setupShutdownHooks.
- Captured server instance in startServer and passed it to setupShutdownHooks.
- Removed setupExitHook function.
libs/infra-nest-server/src/lib/bootstrap.ts - Added app.enableShutdownHooks() in createApp for graceful shutdown handling.
libs/node/utils/.eslintrc.json - Added new ESLint configuration file extending base settings with ignore patterns.
libs/node/utils/README.md - Documented setupShutdownHooks function and its parameters.
libs/node/utils/jest.config.ts - Added Jest configuration for the node-utils library.
libs/node/utils/project.json - Introduced project configuration for the node-utils library with targets for linting and testing.
libs/node/utils/src/index.ts - Re-exported all entities from ./lib/setupShutdownHooks.
libs/node/utils/src/lib/setupShutdownHooks.spec.ts - Added test suite for setupShutdownHooks function using Jest.
libs/node/utils/src/lib/setupShutdownHooks.ts - Introduced setupShutdownHooks function to manage server shutdown processes.
libs/node/utils/tsconfig.json - Added new TypeScript configuration file extending from base configuration.
libs/node/utils/tsconfig.lib.json - Added new TypeScript configuration file for library compilation settings.
libs/node/utils/tsconfig.spec.json - Added new TypeScript configuration file for testing specifications.
tsconfig.base.json - Added path mapping for @island.is/node-utils to libs/node/utils/src/index.ts.

Possibly related PRs

Suggested reviewers

  • lodmfjord
  • thorkellmani
  • baering

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between fdd8a2f and 4974ecb.

📒 Files selected for processing (2)
  • libs/node/utils/src/lib/setupShutdownHooks.spec.ts (1 hunks)
  • libs/node/utils/src/lib/setupShutdownHooks.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • libs/node/utils/src/lib/setupShutdownHooks.spec.ts
  • libs/node/utils/src/lib/setupShutdownHooks.ts

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Nov 14, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 35.68%. Comparing base (8140945) to head (993bc52).

Additional details and impacted files

Impacted file tree graph

@@           Coverage Diff           @@
##             main   #16882   +/-   ##
=======================================
  Coverage   35.68%   35.68%           
=======================================
  Files        6941     6941           
  Lines      147233   147234    +1     
  Branches    41863    41863           
=======================================
+ Hits        52540    52542    +2     
+ Misses      94693    94692    -1     
Flag Coverage Δ
air-discount-scheme-backend 47.88% <0.00%> (-0.02%) ⬇️
air-discount-scheme-web 0.00% <ø> (ø)
api 3.34% <ø> (ø)
api-domains-air-discount-scheme 37.12% <ø> (ø)
api-domains-auth-admin 48.49% <ø> (ø)
api-domains-communications 39.53% <ø> (ø)
application-api-files 62.14% <ø> (ø)
application-core 70.91% <ø> (ø)
application-system-api 38.73% <0.00%> (+0.01%) ⬆️
application-template-api-modules 27.79% <ø> (ø)
application-templates-accident-notification 28.97% <ø> (ø)
application-templates-car-recycling 3.12% <ø> (ø)
application-templates-criminal-record 25.86% <ø> (ø)
application-templates-driving-license 18.18% <ø> (ø)
application-templates-estate 13.75% <ø> (ø)
application-templates-example-payment 24.80% <ø> (ø)
application-templates-financial-aid 15.80% <ø> (ø)
application-templates-general-petition 23.09% <ø> (ø)
application-templates-inheritance-report 6.57% <ø> (ø)
application-templates-marriage-conditions 15.12% <ø> (ø)
application-templates-mortgage-certificate 43.27% <ø> (ø)
application-templates-parental-leave 30.00% <ø> (+0.07%) ⬆️
application-types 6.51% <ø> (ø)
application-ui-components 1.27% <ø> (ø)
application-ui-shell 21.44% <ø> (ø)
auth-admin-web 2.43% <ø> (ø)
clients-charge-fjs-v2 24.11% <ø> (ø)
contentful-apps 4.74% <ø> (ø)
financial-aid-backend 51.29% <ø> (ø)
icelandic-names-registry-backend 54.30% <0.00%> (-0.05%) ⬇️
infra-nest-server 48.43% <100.00%> (+0.06%) ⬆️
judicial-system-api 20.36% <ø> (ø)
judicial-system-backend 55.65% <ø> (ø)
judicial-system-web 28.00% <ø> (ø)
license-api 42.64% <0.00%> (+0.04%) ⬆️
nest-problem 45.77% <0.00%> (-0.05%) ⬇️
nest-swagger 51.71% <ø> (ø)
services-auth-admin-api 52.56% <0.00%> (-0.01%) ⬇️
services-auth-delegation-api 58.55% <0.00%> (+0.07%) ⬆️
services-auth-ids-api 52.28% <0.00%> (+<0.01%) ⬆️
services-auth-public-api 49.24% <0.00%> (-0.01%) ⬇️
services-sessions 65.49% <0.00%> (-0.03%) ⬇️
services-university-gateway 49.34% <0.00%> (-0.02%) ⬇️
services-user-notification 46.69% <0.00%> (-0.01%) ⬇️
services-user-profile 56.80% <0.00%> (-0.02%) ⬇️
skilavottord-ws 24.14% <ø> (ø)
web 2.44% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
libs/infra-nest-server/src/lib/bootstrap.ts 76.05% <100.00%> (+0.34%) ⬆️

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8140945...993bc52. Read the comment docs.

@datadog-island-is
Copy link

datadog-island-is bot commented Nov 14, 2024

Datadog Report

All test runs 018c57f 🔗

51 Total Test Services: 0 Failed, 44 Passed
🔻 Test Sessions change in coverage: 6 decreased, 2 increased, 155 no change

Test Services
This report shows up to 10 services
Service Name Failed Known Flaky New Flaky Passed Skipped Total Time Code Coverage Change Test Service View
air-discount-scheme-backend 0 0 0 63 0 25.52s 1 no change Link
air-discount-scheme-web 0 0 0 2 0 6.97s 1 no change Link
api 0 0 0 4 0 2.46s 1 no change Link
api-domains-air-discount-scheme 0 0 0 6 0 16.21s 1 no change Link
api-domains-auth-admin 0 0 0 18 0 10.19s 1 no change Link
api-domains-communications 0 0 0 5 0 29.48s 1 no change Link
application-api-files 0 0 0 2 0 4.34s 1 no change Link
application-core 0 0 0 97 0 15.28s 1 decreased (-0.58%) Link
application-system-api 0 0 0 40 0 2m 18.8s 1 no change Link
application-template-api-modules 0 0 0 118 0 2m 4.07s 1 no change Link

🔻 Code Coverage Decreases vs Default Branch (6)

This report shows up to 5 code coverage decreases.

  • application-core - jest 77.97% (-0.58%) - Details
  • nest-problem - jest 50.35% (-0.06%) - Details
  • icelandic-names-registry-backend - jest 59.73% (-0.05%) - Details
  • services-sessions - jest 68.96% (-0.04%) - Details
  • services-user-profile - jest 49.04% (-0.03%) - Details

@AndesKrrrrrrrrrrr AndesKrrrrrrrrrrr changed the title fix(infra): Handle SIGTERM for graceful shutdowns fix(core): Handle SIGTERM for graceful shutdowns Nov 14, 2024
@AndesKrrrrrrrrrrr AndesKrrrrrrrrrrr added the automerge Merge this PR as soon as all checks pass label Nov 28, 2024
@AndesKrrrrrrrrrrr AndesKrrrrrrrrrrr marked this pull request as ready for review November 28, 2024 09:27
@AndesKrrrrrrrrrrr AndesKrrrrrrrrrrr requested a review from a team as a code owner November 28, 2024 09:27
@AndesKrrrrrrrrrrr AndesKrrrrrrrrrrr added the deploy-feature Deploys features to dev label Nov 28, 2024
Copy link
Contributor

github-actions bot commented Nov 28, 2024

Affected services are: air-discount-scheme-api,air-discount-scheme-backend,api,application-system-api,download-service,financial-aid-api,financial-aid-backend,financial-aid-open-api,icelandic-names-registry-backend,judicial-system-api,judicial-system-digital-mailbox-api,judicial-system-backend,judicial-system-xrd-api,license-api,reference-backend,regulations-admin-backend,services-auth-admin-api,services-auth-ids-api,services-auth-delegation-api,services-auth-personal-representative,services-auth-personal-representative-public,services-auth-public-api,services-bff,services-documents,services-endorsements-api,services-form-system,services-search-indexer,services-sessions,services-university-gateway,services-user-notification,services-user-profile,skilavottord-ws,air-discount-scheme-web,auth-admin-web,consultation-portal,contentful-apps,financial-aid-web-osk,financial-aid-web-veita,judicial-system-web,skilavottord-web,web,application-system-form,island-ui-storybook,portals-admin,service-portal,system-e2e,
Feature deployment of your services will begin shortly. Your feature will be accessible here:
https://fixnode-sigterm-api-catalogue.dev01.devland.is/api
https://fixnode-sigterm-api.dev01.devland.is/download
https://fixnode-sigterm-application-callback-xrd.internal.dev01.devland.is/application-payment
https://fixnode-sigterm-application-callback-xrd.internal.dev01.devland.is/applications
https://fixnode-sigterm-application-payment-callback-xrd.internal.dev01.devland.is/application-payment
https://fixnode-sigterm-application-payment-callback-xrd.internal.dev01.devland.is/applications
https://fixnode-sigterm-beta.dev01.devland.is/
https://fixnode-sigterm-beta.dev01.devland.is/api
https://fixnode-sigterm-beta.dev01.devland.is/app/skilavottord/
https://fixnode-sigterm-beta.dev01.devland.is/app/skilavottord/api/graphql
https://fixnode-sigterm-beta.dev01.devland.is/minarsidur
https://fixnode-sigterm-beta.dev01.devland.is/samradsgatt
https://fixnode-sigterm-beta.dev01.devland.is/stjornbord
https://fixnode-sigterm-beta.dev01.devland.is/stjornbord/bff
https://fixnode-sigterm-beta.dev01.devland.is/umsoknir
https://fixnode-sigterm-license-api-xrd.internal.dev01.devland.is/
https://fixnode-sigterm-loftbru-cf.dev01.devland.is/
https://fixnode-sigterm-loftbru-cf.dev01.devland.is/api/graphql
https://fixnode-sigterm-loftbru-cf.dev01.devland.is/api/public
https://fixnode-sigterm-loftbru-cf.dev01.devland.is/api/swagger
https://fixnode-sigterm-loftbru.dev01.devland.is/
https://fixnode-sigterm-loftbru.dev01.devland.is/api/graphql
https://fixnode-sigterm-loftbru.dev01.devland.is/api/public
https://fixnode-sigterm-loftbru.dev01.devland.is/api/swagger
https://fixnode-sigterm-service-portal-api.internal.dev01.devland.is/
https://fixnode-sigterm-services-university-gateway.internal.dev01.devland.is/api
https://fixnode-sigterm-sessions-api.internal.dev01.devland.is/
https://fixnode-sigterm-ui.dev01.devland.is/

Deployed services: application-system-api,application-system-form,service-portal,service-portal-api,portals-admin,consultation-portal,api,web,skilavottord-web,skilavottord-ws,services-documents,island-ui-storybook,download-service,icelandic-names-registry-backend,endorsement-system-api,air-discount-scheme-web,air-discount-scheme-backend,air-discount-scheme-api,regulations-admin-backend,application-system-api-worker,license-api,services-sessions,services-sessions-worker,services-sessions-cleanup,services-university-gateway,services-university-gateway-worker,services-bff-portals-admin.
Excluded services: search-indexer-service,user-notification,user-notification-worker,user-notification-cleanup-worker,contentful-apps

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (3)
libs/infra-next-server/src/lib/bootstrap.ts (2)

Line range hint 108-116: Enhance error handling type safety and developer experience

While the error logging is comprehensive, there's room for improvement in type safety and developer experience.

Consider these enhancements:

-    } catch (err: any) {
+    } catch (err: unknown) {
+      const error = err instanceof Error ? err : new Error(String(err))
       logger.error('Error in Next.js request handler!', {
-        message: err.message,
-        stack: err.stack,
-        err,
+        message: error.message,
+        stack: error.stack,
+        error,
       })
-      res.status(500).send('Internal Server Error')
+      res.status(500).send(
+        dev ? `Internal Server Error: ${error.message}` : 'Internal Server Error'
+      )
     }

Line range hint 1-144: Consider improving module architecture for better reusability

As this module is in the libs directory, it should prioritize reusability and separation of concerns.

Consider these architectural improvements:

  1. Extract server lifecycle management into a separate class:
class NextServerLifecycle {
  private server: Express;
  private nextApp: NextApp;

  async start(options: BootstrapOptions): Promise<void> {
    // initialization logic
  }

  async shutdown(): Promise<void> {
    // graceful shutdown logic
  }
}
  1. Return the lifecycle instance from bootstrap:
export const bootstrap = async (options: BootstrapOptions) => {
  const lifecycle = new NextServerLifecycle();
  await lifecycle.start(options);
  return lifecycle;
}

This would:

  • Make the server state manageable
  • Improve testability
  • Allow proper cleanup in consuming applications
libs/infra-express-server/src/lib/infra-express-server.ts (1)

84-89: Consider using a custom error type instead of any

While the eslint disable comment acknowledges the any usage, consider creating a custom error type for better type safety and documentation.

Example implementation:

interface AppError extends Error {
  status?: number;
  // Add other common error properties
}

function errorHandler(
  err: AppError,
  _req: Request,
  res: Response,
  _next: NextFunction,
) {
  // ... existing implementation
}
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5994adc and 75bdd29.

📒 Files selected for processing (2)
  • libs/infra-express-server/src/lib/infra-express-server.ts (5 hunks)
  • libs/infra-next-server/src/lib/bootstrap.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
libs/infra-express-server/src/lib/infra-express-server.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
libs/infra-next-server/src/lib/bootstrap.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
🔇 Additional comments (2)
libs/infra-express-server/src/lib/infra-express-server.ts (2)

61-65: LGTM! Parameter naming improved for clarity

The renaming of unused parameters to _req follows TypeScript conventions and improves code readability.


105-105: LGTM! Verify server shutdown behavior

The placement of setupExitHook after server setup is correct. However, let's verify the shutdown behavior across the codebase.

✅ Verification successful

Let me gather more information about signal handlers with a corrected regex pattern.


Let me search for the setupExitHook implementation to verify the shutdown behavior.


Let me search for the setupExitHook implementation in the specific file and its imports.


LGTM! Server shutdown behavior is properly implemented

The setupExitHook implementation correctly handles server shutdown by:

  • Setting up handlers for SIGHUP, SIGINT, and SIGTERM signals to gracefully exit
  • Including special handling for test environments via NX_INVOKED_BY_RUNNER check
  • Using proper logging to track shutdown events
  • Calling process.exit(0) to ensure clean termination
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for other signal handlers or shutdown logic that might conflict

# Search for signal handling
rg -A 5 "process\.on\((\'|\")SIG"

# Search for server shutdown logic
ast-grep --pattern 'server.close($$$)'

Length of output: 574


Script:

#!/bin/bash
# Search for signal handling with corrected regex
rg -A 5 "process\.on\(['\"]SIGTERM|process\.on\(['\"]SIGINT"

# Search for setupExitHook implementation
ast-grep --pattern 'function setupExitHook($$$)'

# Also check for any shutdown or cleanup functions
rg -A 5 "shutdown|cleanup|graceful.*stop"

Length of output: 153750


Script:

#!/bin/bash
# Search for setupExitHook implementation
ast-grep --pattern 'function setupExitHook($$$)'

# Also search for any signal handlers or cleanup code
rg -A 5 'process\.on\((["'\''])(SIGTERM|SIGINT)'

Length of output: 274


Script:

#!/bin/bash
# Check the file content and its imports
cat libs/infra-express-server/src/lib/infra-express-server.ts

# Search for any files importing setupExitHook
rg "setupExitHook"

Length of output: 3331

libs/infra-next-server/src/lib/bootstrap.ts Outdated Show resolved Hide resolved
libs/infra-express-server/src/lib/infra-express-server.ts Outdated Show resolved Hide resolved
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (3)
libs/infra-next-server/src/lib/bootstrap.ts (2)

Line range hint 32-43: Restructure server initialization for proper shutdown handling

The startServer function creates the HTTP server but doesn't return it, making it impossible for the shutdown hook to properly close the server.

Consider restructuring the code to maintain the server reference:

-const startServer = (app: Express, port = 4200) => {
+const startServer = (app: Express, port = 4200): Promise<http.Server> => {
   const nextPort = parseInt(process.env.PORT || '') || port
   const metricsPort = nextPort + 1
-  app.listen(nextPort, () => {
-    logger.info(
-      `Next custom server listening at http://localhost:${nextPort}`,
-      {
-        context: 'Bootstrap',
-      },
-    )
-  })
-  startMetricServer(metricsPort)
+  return new Promise((resolve) => {
+    const server = app.listen(nextPort, () => {
+      logger.info(
+        `Next custom server listening at http://localhost:${nextPort}`,
+        {
+          context: 'Bootstrap',
+        },
+      )
+      startMetricServer(metricsPort)
+      resolve(server)
+    })
+  })
 }

-const setupExitHook = () => {
+const setupExitHook = (server: http.Server) => {
   // ... signal handling code ...
 }

 export const bootstrap = async (options: BootstrapOptions) => {
   // ...
-  setupExitHook()
   const expressApp = createExpressApp()
   // ... other setup code ...
-  startServer(expressApp, options.port)
+  const server = await startServer(expressApp, options.port)
+  setupExitHook(server)
   // ...
 }

Also applies to: 78-82


Line range hint 95-108: Add request timeout handling for graceful shutdowns

Long-running requests could prevent graceful shutdown from completing. Consider adding timeout handling to ensure requests don't hang indefinitely.

Add timeout middleware:

+const REQUEST_TIMEOUT = 5000 // 5 seconds

+expressApp.use((req, res, next) => {
+  res.setTimeout(REQUEST_TIMEOUT, () => {
+    logger.warning('Request timeout exceeded', {
+      url: req.url,
+      method: req.method,
+      timeout: REQUEST_TIMEOUT,
+    })
+    res.status(504).send('Gateway Timeout')
+  })
+  next()
+})

 expressApp.use(async (req, res) => {
   // ... existing code ...
 })
libs/infra-express-server/src/lib/infra-express-server.ts (1)

61-61: Add error handling for version endpoint

While the parameter naming changes look good, the /version endpoint should handle cases where REVISION environment variable is undefined.

Consider this improvement:

 app.get('/version', function versionOfCode(_req, res) {
-  res.json({ version: process.env.REVISION })
+  res.json({ version: process.env.REVISION || 'unknown' })
 }

Also applies to: 65-65

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5994adc and 75bdd29.

📒 Files selected for processing (2)
  • libs/infra-express-server/src/lib/infra-express-server.ts (5 hunks)
  • libs/infra-next-server/src/lib/bootstrap.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
libs/infra-express-server/src/lib/infra-express-server.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
libs/infra-next-server/src/lib/bootstrap.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
🔇 Additional comments (1)
libs/infra-express-server/src/lib/infra-express-server.ts (1)

Line range hint 76-88: Security middleware requires immediate attention

The current middleware allows all requests to pass through without any authentication, which could pose a security risk in production.

Let's verify if there are any authenticated routes that might be exposed:

Consider implementing proper authentication before deploying to production. Here's a basic structure:

app.use((req, res, next) => {
  const token = req.headers.authorization?.split(' ')[1]
  if (!token) {
    return res.status(401).json({ error: 'Authentication required' })
  }
  // Verify token and proceed
  next()
})

libs/infra-next-server/src/lib/bootstrap.ts Outdated Show resolved Hide resolved
libs/infra-express-server/src/lib/infra-express-server.ts Outdated Show resolved Hide resolved
libs/infra-express-server/src/lib/infra-express-server.ts Outdated Show resolved Hide resolved
Copy link
Member

@eirikurn eirikurn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was checking the other day and I'm not sure if any of our servers are using infra-express-server. It might have been a blueprint which went unused.

I also think this SIGTERM handling is worse then the current SIGKILL fallback. There's nothing graceful about process.exit(0). Afaik it will just abruptly exit the process and close all open connections, resulting in socket issues.

Currently at least there's a chance that current requests finish and the load balancer stops sending new requests to the pod before its eventual SIGKILL. 🤔

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (11)
libs/node/utils/src/lib/setupShutdownHooks.ts (3)

8-14: Consider adding SIGQUIT signal handler

While the current signal handling covers the most common scenarios, consider adding SIGQUIT for complete signal coverage, especially in Unix-like environments.

-  const terminationEvents = ['SIGHUP', 'SIGINT', 'SIGTERM']
+  const terminationEvents = ['SIGHUP', 'SIGINT', 'SIGTERM', 'SIGQUIT']

37-39: Consider cleanup of signal handlers

While the current implementation works, it's good practice to clean up signal handlers when the server closes to prevent memory leaks in long-running applications that might restart the server.

+  const cleanupListeners = () => {
+    for (const signal of terminationEvents) {
+      process.removeListener(signal, () => shutdown(signal))
+    }
+  }
+
   for (const signal of terminationEvents) {
     process.on(signal, () => shutdown(signal))
   }
+
+  server.on('close', cleanupListeners)

1-40: Add comprehensive documentation and testing guidelines

As this is a critical utility function in the shared library, consider:

  1. Adding JSDoc documentation with examples
  2. Creating integration tests with different signal scenarios
  3. Documenting recommended usage patterns in the README

This will help other developers properly integrate the shutdown hooks in their services.

libs/node/utils/README.md (3)

9-10: Improve parameter documentation clarity

The onShutdown parameter description should be restructured for better clarity.

-- `onShutdown` (Function): Optional callback function that will be executed before the server shuts down. Can be used to perform and wait for other cleanup tasks before the server shuts down.
++ `onShutdown` (Function): Optional callback function that will be executed before the server shuts down. It can be used to perform and wait for other cleanup tasks.
🧰 Tools
🪛 LanguageTool

[style] ~10-~10: To form a complete sentence, be sure to include a subject.
Context: ... executed before the server shuts down. Can be used to perform and wait for other c...

(MISSING_IT_THERE)


16-20: Add error handling details to behavior documentation

The behavior section should include information about error handling scenarios and potential error types.

Consider adding the following details:

  • Types of errors that might occur during shutdown
  • How errors are propagated
  • Whether the onShutdown callback errors are caught and handled

22-24: Enhance testing documentation

The testing section would benefit from examples of specific test cases that users should implement.

Consider adding:

  • Test scenarios for different termination signals
  • Examples of mocking the server and shutdown process
  • Test cases for error conditions
  • Examples of testing the onShutdown callback functionality
libs/infra-express-server/src/lib/infra-express-server.ts (1)

69-72: Consider enhancing the error handler implementation

While the parameter naming is correct, the error handler could benefit from improvements:

+interface ApiError extends Error {
+  status?: number;
+}

 app.use(function errorHandler(
-  err: any,
+  err: ApiError,
   _req: Request,
   res: Response,
   _next: NextFunction,
 ) {
   logger.error(`Status code: ${err.status}, msg: ${err.message}`)
   res.status(err.status || 500)
-  res.send(err.message)
+  res.json({
+    error: {
+      message: err.message,
+      status: err.status || 500
+    }
+  })
 }
libs/infra-nest-server/src/lib/bootstrap.ts (1)

49-51: Consider documenting shutdown behavior with keep-alive connections.

While the implementation is correct, the interaction between the 30-second KEEP_ALIVE_TIMEOUT and shutdown signals could affect shutdown timing. Long-lived connections might delay the shutdown process.

Consider:

  1. Adding a comment documenting this interaction
  2. Implementing a maximum shutdown timeout
  3. Logging when shutdown is initiated and completed

Here's a suggested comment addition:

  // Handle signals like SIGINT, SIGTERM and SIGHUP gracefully.
+ // Note: Active keep-alive connections (timeout: 30s) may delay shutdown
  app.enableShutdownHooks()

Also applies to: 18-19

libs/node/utils/src/lib/setupShutdownHooks.spec.ts (3)

18-37: Enhance type safety in test setup.

While the test setup is comprehensive, consider improving type safety:

-    server = {
-      close: jest.fn((callback) => callback()),
-    } as unknown as Server
+    server = {
+      close: jest.fn((callback?: (err?: Error) => void) => {
+        if (callback) callback();
+        return server;
+      }),
+    } as Partial<Server> as Server

39-86: Consider extracting common test patterns.

The signal handler extraction is repeated across multiple tests. Consider creating a helper function:

const getSignalHandler = (signal: string, spy: jest.SpyInstance): Function => {
  return spy.mock.calls.find(([event]) => event === signal)[1];
};

88-151: Comprehensive error handling test coverage.

The test cases thoroughly cover error scenarios and callback handling. Great job on including both successful and failure paths for the onShutdown callback.

However, consider adding one more test case:

it('should handle multiple signals in sequence', async () => {
  setupShutdownHooks(server);
  const sigTermHandler = getSignalHandler('SIGTERM', processOnSpy);
  const sigIntHandler = getSignalHandler('SIGINT', processOnSpy);
  
  await sigTermHandler();
  await sigIntHandler();
  
  expect(server.close).toHaveBeenCalledTimes(1); // Should only close once
  expect(processExitSpy).toHaveBeenCalledTimes(1);
});
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 75bdd29 and fdd8a2f.

📒 Files selected for processing (14)
  • libs/infra-express-server/src/lib/infra-express-server.ts (5 hunks)
  • libs/infra-nest-server/src/lib/bootstrap.ts (1 hunks)
  • libs/infra-next-server/src/lib/bootstrap.ts (2 hunks)
  • libs/node/utils/.eslintrc.json (1 hunks)
  • libs/node/utils/README.md (1 hunks)
  • libs/node/utils/jest.config.ts (1 hunks)
  • libs/node/utils/project.json (1 hunks)
  • libs/node/utils/src/index.ts (1 hunks)
  • libs/node/utils/src/lib/setupShutdownHooks.spec.ts (1 hunks)
  • libs/node/utils/src/lib/setupShutdownHooks.ts (1 hunks)
  • libs/node/utils/tsconfig.json (1 hunks)
  • libs/node/utils/tsconfig.lib.json (1 hunks)
  • libs/node/utils/tsconfig.spec.json (1 hunks)
  • tsconfig.base.json (1 hunks)
✅ Files skipped from review due to trivial changes (7)
  • libs/node/utils/.eslintrc.json
  • libs/node/utils/jest.config.ts
  • libs/node/utils/project.json
  • libs/node/utils/src/index.ts
  • libs/node/utils/tsconfig.json
  • libs/node/utils/tsconfig.lib.json
  • libs/node/utils/tsconfig.spec.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • libs/infra-next-server/src/lib/bootstrap.ts
🧰 Additional context used
📓 Path-based instructions (5)
libs/infra-express-server/src/lib/infra-express-server.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
libs/infra-nest-server/src/lib/bootstrap.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
libs/node/utils/README.md (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
libs/node/utils/src/lib/setupShutdownHooks.spec.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
libs/node/utils/src/lib/setupShutdownHooks.ts (1)

Pattern libs/**/*: "Confirm that the code adheres to the following:

  • Reusability of components and hooks across different NextJS apps.
  • TypeScript usage for defining props and exporting types.
  • Effective tree-shaking and bundling practices."
🪛 LanguageTool
libs/node/utils/README.md

[style] ~10-~10: To form a complete sentence, be sure to include a subject.
Context: ... executed before the server shuts down. Can be used to perform and wait for other c...

(MISSING_IT_THERE)

🔇 Additional comments (9)
libs/node/utils/src/lib/setupShutdownHooks.ts (1)

1-7: LGTM! Well-structured function signature with proper typing

The function signature is well-designed with TypeScript types and an optional callback for custom cleanup operations.

libs/node/utils/README.md (1)

14-15: Verify NX runner detection implementation

The documentation mentions NX runner detection using an environment variable. Let's verify this implementation.

✅ Verification successful

Let me gather more context about the implementation to ensure a complete verification.


NX runner detection and disconnect event handling are correctly implemented

The implementation in setupShutdownHooks.ts accurately matches the documentation:

  • Checks for NX runner using process.env.NX_INVOKED_BY_RUNNER === 'true'
  • Adds 'disconnect' event to termination events when running under NX
  • Implementation is well-tested with dedicated test cases verifying this behavior
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for the NX runner environment variable usage
rg -A 2 "NX_INVOKED_BY_RUNNER" 

# Look for the disconnect event handler implementation
ast-grep --pattern 'process.on("disconnect"'

Length of output: 1216


Script:

#!/bin/bash
# Get the full implementation of setupShutdownHooks
cat libs/node/utils/src/lib/setupShutdownHooks.ts

# Get the full test implementation
cat libs/node/utils/src/lib/setupShutdownHooks.spec.ts

Length of output: 5125

libs/infra-express-server/src/lib/infra-express-server.ts (3)

45-50: LGTM! Parameter naming follows best practices

The underscore prefix for unused parameters follows TypeScript conventions and improves code clarity.


88-88: Verify graceful shutdown behavior

The addition of setupShutdownHooks addresses the PR objective, but we should verify it properly handles in-flight requests during shutdown.

✅ Verification successful

Graceful shutdown implementation is properly handled

The setupShutdownHooks implementation correctly handles in-flight requests by:

  • Using server.close() which waits for existing connections to complete before shutting down
  • Preventing multiple shutdown attempts with the isShuttingDown flag
  • Supporting custom cleanup through the optional onShutdown callback
  • Properly handling termination signals (SIGHUP, SIGINT, SIGTERM)
  • Logging shutdown progress and errors for observability
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if setupShutdownHooks properly handles in-flight requests
# Look for timeout configurations and request handling during shutdown
ast-grep --pattern 'setupShutdownHooks($_) {
  $$$
}'

Length of output: 55


Script:

#!/bin/bash
# Search for setupShutdownHooks function definition
rg -A 10 "setupShutdownHooks" --type ts

Length of output: 10770


Script:

#!/bin/bash
# Get the full implementation of setupShutdownHooks
cat libs/node/utils/src/lib/setupShutdownHooks.ts

Length of output: 1102


1-1: Verify the implementation of setupShutdownHooks

Let's ensure the imported function properly handles graceful shutdowns.

✅ Verification successful

Implementation of setupShutdownHooks is properly integrated

The implementation is well-tested and correctly used in the Express server:

  • Handles standard termination signals (SIGHUP, SIGINT, SIGTERM)
  • Includes special handling for NX runner environments
  • Provides graceful shutdown with proper server closure
  • Supports optional cleanup through onShutdown callback
  • Correctly integrated at the end of runServer function
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check the implementation of setupShutdownHooks
# Look for the implementation and documentation
fd -e md -e ts "node-utils" | xargs rg -l "setupShutdownHooks"

Length of output: 314


Script:

#!/bin/bash
# Check the implementation and documentation of setupShutdownHooks
echo "=== Implementation of setupShutdownHooks ==="
ast-grep --pattern 'export function setupShutdownHooks($_) { $$$ }'

echo -e "\n=== README Documentation ==="
cat libs/node/utils/README.md

echo -e "\n=== Test Cases ==="
cat libs/node/utils/src/lib/setupShutdownHooks.spec.ts

Length of output: 5686


Script:

#!/bin/bash
# Check the actual implementation file and usage in express server
echo "=== Implementation File ==="
cat libs/node/utils/src/lib/setupShutdownHooks.ts

echo -e "\n=== Usage in Express Server ==="
rg -A 5 "setupShutdownHooks" libs/infra-express-server/src/lib/infra-express-server.ts

Length of output: 1709

libs/infra-nest-server/src/lib/bootstrap.ts (2)

49-51: LGTM! Graceful shutdown implementation follows best practices.

The addition of app.enableShutdownHooks() properly implements signal handling for graceful shutdowns, which aligns with the PR objectives. This ensures that the application can properly clean up resources and complete ongoing requests before terminating.


49-51: Verify shutdown cleanup in the close method.

The close method handles cleanup of both the main app and metrics server. Let's verify it properly awaits all cleanup operations.

Also applies to: 156-164

✅ Verification successful

Shutdown cleanup is properly implemented

The close method correctly handles the cleanup by:

  • Awaiting the NestJS application shutdown with app.close()
  • Properly closing the metrics server using a Promise-based approach
  • Following the correct order: first app, then metrics server
  • Using enableShutdownHooks() to handle system signals appropriately

The implementation ensures all resources are cleaned up gracefully during shutdown.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check for potential shutdown-related issues in the codebase

# Look for other cleanup operations that might need to be included
rg -A 5 "beforeApplicationShutdown|onApplicationShutdown|close\(\)" --type typescript

# Check for any custom shutdown hooks that might need coordination
ast-grep --pattern 'app.enableShutdownHooks()' 

Length of output: 810


Script:

#!/bin/bash
# Let's check the close method implementation and any shutdown hooks
ast-grep --pattern 'close() {
  $$$
}'

# Also check for any beforeApplicationShutdown or onApplicationShutdown implementations
ast-grep --pattern 'beforeApplicationShutdown() {
  $$$
}'

ast-grep --pattern 'onApplicationShutdown() {
  $$$
}'

# Let's also check the specific file for the close method
cat libs/infra-nest-server/src/lib/bootstrap.ts

Length of output: 4705

libs/node/utils/src/lib/setupShutdownHooks.spec.ts (1)

1-11: LGTM! Clean test setup with proper mocking.

The test setup follows best practices with clear imports and focused mocking of the logger functionality.

tsconfig.base.json (1)

956-956: LGTM! Path mapping correctly configured for node-utils module.

The new path mapping for @island.is/node-utils is properly configured and aligns with the PR's objective of implementing graceful shutdowns.

Let's verify the module's usage:

✅ Verification successful

Path mapping correctly configured and actively used in the codebase

The verification confirms that:

  • The @island.is/node-utils module is properly imported in two server implementations:
    • libs/infra-next-server/src/lib/bootstrap.ts
    • libs/infra-express-server/src/lib/infra-express-server.ts
  • The implementation file exists at the correct location: libs/node/utils/src/lib/setupShutdownHooks.ts
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the usage of @island.is/node-utils module
# Expected: Find imports of setupShutdownHooks from this module

# Search for imports of the module
rg "from ['|\"]@island\.is/node-utils['|\"]"

# Search for the implementation file
fd "setupShutdownHooks.ts" "libs/node/utils/src"

Length of output: 363

libs/node/utils/src/lib/setupShutdownHooks.ts Outdated Show resolved Hide resolved
@AndesKrrrrrrrrrrr
Copy link
Member Author

@eirikurn, hmm 🤔

Having forceful teardowns of connections this way is only marginally better than the current situation.

Pods in terminating state shouldn't receive any new requests, yet it seems like they still will according to this issue. So, we still need to ensure graceful teardown by stopping accepting new connections

@AndesKrrrrrrrrrrr
Copy link
Member Author

Copy link
Contributor

@snaerseljan snaerseljan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work 👏

libs/node/utils/README.md Show resolved Hide resolved
libs/node/utils/src/lib/setupShutdownHooks.spec.ts Outdated Show resolved Hide resolved
libs/node/utils/src/lib/setupShutdownHooks.ts Outdated Show resolved Hide resolved
libs/node/utils/src/lib/setupShutdownHooks.ts Outdated Show resolved Hide resolved
@gudjong gudjong removed the automerge Merge this PR as soon as all checks pass label Dec 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deploy-feature Deploys features to dev
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants