@kennethkalmer kennethkalmer commented Dec 8, 2025

Problem

We've been experiencing intermittent CI failures where config/nginx-redirects.conf is sometimes empty after successful Gatsby builds. This causes smoke test failures and is a production risk: redirects could fail silently.

Key finding: Builds complete successfully with no errors in logs, but the redirects file ends up empty.

Solution

This PR implements a multi-layered approach to detect and prevent empty redirect files:

1. Comprehensive Instrumentation

Using Gatsby's Reporter Infrastructure (consistent with existing codebase patterns):

  • data/createPages/writeRedirectToConfigFile.ts

    • Now accepts optional reporter parameter
    • Logs file initialization with timestamp and unique ID
    • Tracks every redirect write with progress logging (every 10th)
    • Detects re-initialization attempts with warnings
    • Error handling with try-catch around all file operations
  • data/createPages/index.ts

    • Logs GraphQL query results showing pages with redirect_from
    • Reports hash fragment filtering (redirects skipped for nginx)
    • Tracks redirect counts through the build process
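The writer behaviour listed above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `ReporterLike` interface is a hypothetical stand-in for the parts of Gatsby's reporter it uses, the random init ID and the `from to;` line format are assumptions, and the guard simply skips the file clear when the writer is initialized a second time.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for the subset of Gatsby's Reporter used in this sketch.
interface ReporterLike {
  info(message: string): void;
  warn(message: string): void;
  error(message: string, error?: Error): void;
}

let isInitialized = false;
let redirectCount = 0;

const getRedirectCount = (): number => redirectCount;

const writeRedirectToConfigFile = (filePath: string, reporter?: ReporterLike) => {
  if (isInitialized) {
    // Clearing the file again would discard redirects already written, so only warn.
    reporter?.warn(`[REDIRECTS] Re-initialization attempted for ${filePath}`);
  } else {
    const initId = Math.random().toString(36).slice(2, 10); // assumed ID scheme
    const initTimestamp = new Date().toISOString();
    try {
      fs.writeFileSync(filePath, "");
      isInitialized = true;
      reporter?.info(`[REDIRECTS] Initialized ${filePath} (id ${initId}, ${initTimestamp})`);
    } catch (err) {
      reporter?.error(`[REDIRECTS] Failed to initialize ${filePath}`, err as Error);
      throw err;
    }
  }

  // The returned function appends one redirect per call and tracks the running count.
  return (from: string, to: string): void => {
    try {
      fs.appendFileSync(filePath, `${from} ${to};\n`);
      redirectCount += 1;
      if (redirectCount % 10 === 0) {
        reporter?.info(`[REDIRECTS] ${redirectCount} redirects written so far`);
      }
    } catch (err) {
      reporter?.error(`[REDIRECTS] Failed to write redirect ${from} -> ${to}`, err as Error);
      throw err;
    }
  };
};

// Usage: write two redirects to a throwaway file.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "redirects-"));
const demoPath = path.join(demoDir, "nginx-redirects.conf");
const writeRedirect = writeRedirectToConfigFile(demoPath);
writeRedirect("/old-page", "/new-page");
writeRedirect("/legacy/docs", "/docs");
```

Keeping the count in module state is what lets a later validation step compare the number of writes against the number of lines in the file.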

2. Post-Write Validation (createPages)

After Promise.all() completes:

  • Validates redirect file exists and has content
  • Compares expected redirect count vs actual file lines
  • Uses reporter.panicOnBuild() to fail builds immediately if empty
  • Provides detailed error context for debugging
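As a rough sketch of this check, assuming one redirect per non-empty line: the helper name `checkRedirectFile` and its result shape are hypothetical, and the decision about which failures should call reporter.panicOnBuild() is left to the caller.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical result type: either the validated line count or a categorized failure.
type RedirectCheck =
  | { ok: true; lines: number }
  | { ok: false; kind: "missing" | "empty" | "mismatch"; reason: string };

const checkRedirectFile = (filePath: string, expectedCount: number): RedirectCheck => {
  if (!fs.existsSync(filePath)) {
    return { ok: false, kind: "missing", reason: `${filePath} does not exist` };
  }
  // Count non-empty lines, assuming each one holds a single redirect.
  const lines = fs
    .readFileSync(filePath, "utf8")
    .split("\n")
    .filter((line) => line.trim().length > 0);
  if (lines.length === 0) {
    return { ok: false, kind: "empty", reason: `${filePath} is empty, expected ${expectedCount} redirects` };
  }
  if (lines.length !== expectedCount) {
    return { ok: false, kind: "mismatch", reason: `expected ${expectedCount} redirects, found ${lines.length}` };
  }
  return { ok: true, lines: lines.length };
};

// Usage with throwaway files.
const checkDir = fs.mkdtempSync(path.join(os.tmpdir(), "redirect-check-"));
const goodFile = path.join(checkDir, "good.conf");
const emptyFile = path.join(checkDir, "empty.conf");
fs.writeFileSync(goodFile, "/old /new;\n/a /b;\n");
fs.writeFileSync(emptyFile, "");

console.log(checkRedirectFile(goodFile, 2));
console.log(checkRedirectFile(emptyFile, 2));
```

Returning a categorized result rather than panicking inside the helper makes it possible to treat an empty file as fatal while only warning on a count mismatch.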

3. Build Validation Hook (onPostBuild)

Safety net before artifacts are published:

  • Validates file existence, size, and content
  • Checks redirect format matches nginx syntax
  • Validates minimum redirect threshold (50+)
  • Uses reporter.panicOnBuild() for critical failures
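The format and threshold checks might look something like the sketch below. It works on the file content as a string; the pattern is the one quoted in the review further down, MIN_EXPECTED_REDIRECTS is the constant named in the deployment notes, and `inspectRedirects` with its report shape is a hypothetical name for illustration.

```typescript
const MIN_EXPECTED_REDIRECTS = 50;

// Matches lines of the form "/source /destination;".
const REDIRECT_LINE = /^\/[^\s]+ \/[^\s]+;$/;

interface RedirectReport {
  total: number;
  malformed: string[];
  belowThreshold: boolean;
}

const inspectRedirects = (content: string, minExpected: number = MIN_EXPECTED_REDIRECTS): RedirectReport => {
  // Ignore blank lines and surrounding whitespace before checking the format.
  const lines = content
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const malformed = lines.filter((line) => !REDIRECT_LINE.test(line));
  return { total: lines.length, malformed, belowThreshold: lines.length < minExpected };
};

// Usage: two well-formed lines and one missing its trailing semicolon.
const sampleContent = "/old /new;\n/docs/v1 /docs;\n/broken /line\n";
console.log(inspectRedirects(sampleContent, 2));
```

A caller could warn on `malformed` entries and reserve reporter.panicOnBuild() for an empty file or a count below the threshold.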

4. Enhanced CI Validation

.circleci/config.yml improvements:

  • Checks file exists and is not empty (> 0 bytes)
  • Counts redirects using grep -c ';$'
  • Requires minimum of 50 redirects
  • Shows file details if validation fails
  • Clear error messages with context

5. Benefits of Using Gatsby Reporter

All logging now uses Gatsby's reporter instead of console.log:

  • ✅ Structured, colored output consistent with Gatsby's build process
  • ✅ Proper integration with Gatsby's build status
  • ✅ reporter.panicOnBuild() properly fails builds
  • ✅ Better error handling with stack traces
  • ✅ Matches existing patterns in compressAssets.ts and llmstxt.ts

Expected Outcomes

After merging:

  1. Full visibility: Comprehensive logging at every stage of redirect generation
  2. Early detection: Build fails immediately if redirects are empty (not during smoke tests)
  3. Root cause identification: Logs will reveal exactly where/why the issue occurs
  4. Production safety: Impossible for empty redirect files to reach production silently
  5. Professional output: Consistent, structured logging throughout

Testing

  • Build should complete with new [REDIRECTS] log messages visible
  • If redirects are empty, build will fail at createPages with clear error
  • CI validation will catch any issues before smoke tests run
  • Minimum redirect count can be adjusted based on actual data

Deployment Notes

After deployment, monitor the first few builds to:

  • Confirm redirect counts are as expected (adjust MIN_EXPECTED_REDIRECTS if needed)
  • Review new logging output for any anomalies
  • If the empty file issue occurs again, logs will now show the root cause

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Strengthened redirect configuration validation with comprehensive integrity checks including file existence, size verification, and content validation.
    • Enhanced error reporting and diagnostics during builds to provide clearer feedback when configuration issues are detected.
  • Refactor

    • Improved monitoring and logging infrastructure for redirect configuration tracking during the build process.

coderabbitai bot commented Dec 8, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

The changes implement comprehensive redirect validation across the Gatsby build pipeline. CircleCI config enhances validation with detailed checks on redirect file integrity. The createPages module adds reporter-based redirect monitoring and post-build validation. The writeRedirectToConfigFile module introduces initialization guards and redirect counting. The onPostBuild module adds runtime validation and early failure mechanisms.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Build Pipeline Validation: data/onPostBuild/index.ts | Adds validateRedirects function to check nginx config file existence, size, line count, and format; updates onPostBuild signature to accept reporter parameter; imports Reporter type and fs module; runs validation before existing hooks with fail-fast error reporting. |
| Redirect Creation & Monitoring: data/createPages/index.ts | Updates createPages signature to accept reporter parameter; initializes redirect writer with reporter; adds runtime logging for redirect counts across documents, API references, and MDX; implements post-write validation comparing created redirects against nginx-redirects.conf; imports new getRedirectCount utility. |
| Redirect Writer & Counting: data/createPages/writeRedirectToConfigFile.ts | Introduces one-time initialization logic with InitID and timestamp tracking; converts writer to factory accepting optional Reporter; adds persistent redirectCount tracking with increment on each write; exports new public utilities getRedirectCount() and resetRedirectCount(); adds re-initialization guards. |
| CI Validation Enforcement: .circleci/config.yml | Replaces simple file existence check in test-nginx job with detailed validation: file existence, non-empty content, file size logging, redirect count (lines ending with semicolon), and enforcement of minimum 50 redirects; provides descriptive errors and early failure conditions. |

Sequence Diagram

sequenceDiagram
    participant GatsbyBuild as Gatsby Build
    participant CreatePages as createPages
    participant WriteRedirect as writeRedirectToConfigFile
    participant FileSystem as File System
    participant Validation as Validation
    participant Reporter as Reporter

    GatsbyBuild->>CreatePages: invoke with reporter
    CreatePages->>WriteRedirect: initialize with reporter
    WriteRedirect->>WriteRedirect: check init flag & log InitID
    WriteRedirect->>FileSystem: clear nginx-redirects.conf
    CreatePages->>CreatePages: iterate pages & create redirects
    CreatePages->>WriteRedirect: writeRedirect(...) per page
    WriteRedirect->>FileSystem: append redirect to file
    WriteRedirect->>WriteRedirect: increment redirectCount
    WriteRedirect->>Reporter: log redirect (if reporter provided)
    CreatePages->>Validation: post-write validation
    Validation->>FileSystem: read nginx-redirects.conf
    Validation->>Validation: count lines & compare with getRedirectCount()
    alt Mismatch or File Issues
        Validation->>Reporter: emit error/warning
    else Success
        Validation->>Reporter: log validation success
    end
    Reporter-->>GatsbyBuild: return result

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Initialization guard logic in writeRedirectToConfigFile: Review the one-time initialization pattern and re-initialization prevention to ensure thread-safety and correctness across multiple page creation cycles.
  • Reporter parameter threading: Verify that the reporter parameter is correctly passed and used consistently across createPages, writeRedirectToConfigFile, and onPostBuild without breaking existing callers.
  • Validation logic consistency: Ensure the validation checks in CircleCI config, onPostBuild, and createPages are aligned (e.g., MIN_REDIRECTS threshold, line counting logic).
  • Public API exports: Confirm that new exports (getRedirectCount, resetRedirectCount) are properly typed and don't introduce unintended side effects or state leaks.

Poem

🐇 A rabbit's refrain on redirects refined:

Nginx flows now with validation divine,
Counts and guards on each line,
Creating pages, writes track and align,
Posts build with checks so fine—
No broken paths in our warren's design! 🎉

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly describes the main change: adding comprehensive redirect file validation and instrumentation across multiple components. |
| Linked Issues check | ✅ Passed | The PR addresses the linked issue WEB-4839 (maintenance chores ticket) by implementing redirect file validation infrastructure across CI, build hooks, and Gatsby plugins. |
| Out of Scope Changes check | ✅ Passed | All changes focus on redirect file validation and instrumentation; no unrelated modifications detected. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@kennethkalmer kennethkalmer changed the title from "fix: add comprehensive redirect file validation and instrumentation (WEB-4839)" to "[WEB-4839] Add comprehensive redirect file validation and instrumentation" Dec 8, 2025
@kennethkalmer kennethkalmer force-pushed the WEB-4839-redirects-build-issue branch from f68b910 to c739d3f December 8, 2025 23:05
@kennethkalmer kennethkalmer marked this pull request as ready for review December 8, 2025 23:13
@kennethkalmer
Member Author

@coderabbitai full review

coderabbitai bot commented Dec 8, 2025

✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
data/onPostBuild/index.ts (1)

46-58: Redirect format regex may produce false positives for external redirects.

The regex ^\/[^\s]+ \/[^\s]+;$ requires both the source and destination paths to start with /. If there are any external URL redirects (e.g., https://...), they would be flagged as invalid even if they're intentional. Paths with query parameters are not affected, because [^\s]+ matches any run of non-whitespace characters.

Since this is a warning (not a panic), the impact is limited to log noise, but worth noting.

Consider expanding the regex to also allow external URLs if they're expected:

-    .filter((line) => line.length > 0 && !line.match(/^\/[^\s]+ \/[^\s]+;$/));
+    .filter((line) => line.length > 0 && !line.match(/^\/[^\s]+ (\/[^\s]+|https?:\/\/[^\s]+);$/));
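To illustrate the difference between the two patterns in the diff above, here is a small check against a few made-up redirect lines (the sample paths and the example.com URL are illustrative only). Query strings contain no whitespace, so both patterns accept them; only the external destination separates the two.

```typescript
// Original pattern: both source and destination must start with "/".
const strictPattern = /^\/[^\s]+ \/[^\s]+;$/;

// Suggested pattern: the destination may also be an http(s) URL.
const relaxedPattern = /^\/[^\s]+ (\/[^\s]+|https?:\/\/[^\s]+);$/;

const sampleLines = [
  "/old /new;", // internal redirect
  "/old?x=1 /new;", // internal redirect with a query string
  "/old https://example.com/new;", // external destination
  "/old /new", // missing trailing semicolon
];

for (const line of sampleLines) {
  console.log(`${line} strict=${strictPattern.test(line)} relaxed=${relaxedPattern.test(line)}`);
}
```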
data/createPages/writeRedirectToConfigFile.ts (2)

6-7: initTimestamp captured at module load, not initialization time.

initTimestamp is set when the module is first loaded, not when writeRedirectToConfigFile is actually called. If there's a delay between module import and initialization, the logged timestamp won't reflect the actual initialization time.

Move timestamp generation inside the function:

-const initTimestamp = new Date().toISOString();
+let initTimestamp: string;

 export const writeRedirectToConfigFile = (filePath: string, reporter?: Reporter) => {
   // Detect re-initialization
   if (isInitialized) {
     // ...
   }

+  initTimestamp = new Date().toISOString();
   reporter?.info(`[REDIRECTS] Initializing redirect file at ${filePath}`);

44-49: Consider documenting resetRedirectCount usage scope.

This function resets module state and is likely intended for testing. Consider adding a JSDoc comment to clarify its purpose and prevent accidental usage in production code.

+/**
+ * Resets redirect count and initialization state.
+ * Intended for testing purposes only.
+ */
 export const resetRedirectCount = (): void => {
   redirectCount = 0;
   isInitialized = false;
 };
data/createPages/index.ts (1)

222-239: Clarify that count represents pages, not individual redirects.

The logging shows "pages with redirect_from" which is accurate. However, a page can have multiple redirect_from entries, so the total pages count won't equal the final redirect count. This is fine but worth noting in the logs for clarity.

-  reporter.info(`[REDIRECTS] - Total pages with redirects: ${documentPagesWithRedirects + apiPagesWithRedirects + mdxPagesWithRedirects}`);
+  reporter.info(`[REDIRECTS] - Total pages with redirects: ${documentPagesWithRedirects + apiPagesWithRedirects + mdxPagesWithRedirects} (note: pages may have multiple redirect_from entries)`);
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Jira integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 6b929bc and c739d3f.

📒 Files selected for processing (4)
  • .circleci/config.yml (1 hunks)
  • data/createPages/index.ts (8 hunks)
  • data/createPages/writeRedirectToConfigFile.ts (1 hunks)
  • data/onPostBuild/index.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
data/onPostBuild/index.ts (2)
data/onPostBuild/llmstxt.ts (1)
  • onPostBuild (255-507)
data/onPostBuild/compressAssets.ts (1)
  • onPostBuild (21-45)
data/createPages/index.ts (1)
data/createPages/writeRedirectToConfigFile.ts (2)
  • writeRedirectToConfigFile (9-42)
  • getRedirectCount (44-44)
🔇 Additional comments (6)
.circleci/config.yml (1)

126-161: LGTM! Comprehensive CI validation for redirect file integrity.

The validation logic is well-structured with clear error messages and progressive checks. The || echo "0" fallback for grep -c correctly handles the case where no matches are found (which would otherwise return exit code 1).

data/onPostBuild/index.ts (1)

63-69: LGTM! Proper fail-fast pattern with validateRedirects.

The flow correctly validates redirects first before proceeding with other post-build tasks. The pattern of destructuring reporter and spreading to downstream functions matches the existing patterns in llmstxt and compressAssets.

data/createPages/writeRedirectToConfigFile.ts (1)

11-19: Re-initialization guard is a good defensive measure.

This correctly prevents the file from being cleared multiple times during a build, which was likely a root cause of the empty redirect file issue. The stack trace logging will help diagnose any unexpected re-initialization scenarios.

data/createPages/index.ts (3)

118-119: LGTM! Proper initialization with reporter passed to redirect writer.

This ensures all redirect-related logging goes through Gatsby's reporter system for consistent build output.


393-439: Solid post-write validation with clear error context.

The validation provides immediate feedback during createPages rather than waiting until onPostBuild, enabling faster failure detection. The detailed error messages at lines 420-423 provide actionable debugging guidance.

Note: This validation is intentionally redundant with onPostBuild validation for defense in depth.


402-408: return statement after panicOnBuild is correct and necessary.

reporter.panicOnBuild() does not throw an exception. During gatsby build it reports the error and exits the process, but in other modes (such as gatsby develop) it only logs the error and returns. The return statement is therefore required to prevent further code execution, as seen in the validation checks at lines 402–408 and 416–430 where additional file operations would otherwise continue. This pattern is consistent with the codebase approach in llmstxt.ts, where both panicOnBuild() and explicit control flow interruption (throw) are used together for the same reason.
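A minimal sketch of the pattern, assuming panicOnBuild() can hand control back to the caller. The `BuildReporter` interface and the recording reporter are hypothetical stand-ins, not Gatsby's actual reporter.

```typescript
// Hypothetical stand-in for the two reporter methods this sketch needs.
interface BuildReporter {
  info(message: string): void;
  panicOnBuild(message: string): void;
}

const reportRedirectLines = (lineCount: number, reporter: BuildReporter): boolean => {
  if (lineCount === 0) {
    reporter.panicOnBuild("[REDIRECTS] nginx-redirects.conf is empty");
    // Without this return, the success message below would still be logged
    // whenever panicOnBuild returns instead of ending the process.
    return false;
  }
  reporter.info(`[REDIRECTS] Validated ${lineCount} redirect lines`);
  return true;
};

// Usage with a reporter that records messages instead of exiting.
const messages: string[] = [];
const recordingReporter: BuildReporter = {
  info: (message) => {
    messages.push(`info: ${message}`);
  },
  panicOnBuild: (message) => {
    messages.push(`panic: ${message}`);
  },
};

reportRedirectLines(0, recordingReporter);
reportRedirectLines(120, recordingReporter);
console.log(messages);
```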

@kennethkalmer kennethkalmer self-assigned this Dec 8, 2025
Member

@jamiehenson jamiehenson left a comment

I know this disclaimer was presented in advance and this is meant to be a quick thing but this is very botty.

Given this is not product facing I'm not standing with both feet on this hill but I would invite you to glance back through and remove anything that isn't relevant, and determine whether any logic could be shared between the different SSR files as there's a lot added to them and a lot of additions tracing similar paths.

If you disagree with my stance I can re-review this with a different perspective but I'm treating this as I see it.

# Get file size
FILE_SIZE=$(wc -c < config/nginx-redirects.conf)
echo "✓ File size: ${FILE_SIZE} bytes"
Member

Do we care what the file size is?

Member Author

We need it to be non-zero. On the failing builds it is actually zero, and the original tests pass because they only checked that the file exists, using the -f flag. That said, I'll rework the test for clarity.

MIN_REDIRECTS=50
if [ "$REDIRECT_COUNT" -lt "$MIN_REDIRECTS" ]; then
echo "ERROR: Expected at least ${MIN_REDIRECTS} redirects, found ${REDIRECT_COUNT}"
echo "First 10 lines of file:"
Member

Is this relevant? If we have 20 redirects and expected 50, what does this tell us?

Member Author

Nothing, so to the bin it went

]);

// Post-write validation: Verify redirects were actually written to file
reporter.info(`[REDIRECTS] Promise.all completed, validating redirect file...`);
Member

The 'Promise.all completed' could probably be reworded here

`);

// Log redirect data availability for investigation
const documentPagesWithRedirects = documentResult.data?.allFileHtml.edges.filter(
Member

.length doesn't return undefined or null after a filter, so I'm not sure what this nullish coalescing achieves. If you're concerned about data availability the optional chaining would be better served before the .filter instead

Member Author

Took me a bit to parse this out but I've made the tweak, thanks for the suggestion!
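One way to read the suggestion in this thread is to default the possibly missing edge list before filtering, so the count is always a number. The sketch below is an assumption about the shape of the query result, not the code that was committed: only allFileHtml.edges and redirect_from come from the snippet above, and the node.meta nesting is hypothetical.

```typescript
// Assumed shapes for illustration; the real GraphQL result types may differ.
interface Edge {
  node: { meta?: { redirect_from?: string[] } };
}
interface QueryResult {
  data?: { allFileHtml: { edges: Edge[] } };
}

const countPagesWithRedirects = (result: QueryResult): number =>
  // Fall back to an empty list when data is absent, then filter and count.
  (result.data?.allFileHtml.edges ?? []).filter(
    (edge) => (edge.node.meta?.redirect_from?.length ?? 0) > 0,
  ).length;

// Usage: one page with redirects, one with an empty list, one without metadata.
const sampleResult: QueryResult = {
  data: {
    allFileHtml: {
      edges: [
        { node: { meta: { redirect_from: ["/old"] } } },
        { node: { meta: { redirect_from: [] } } },
        { node: {} },
      ],
    },
  },
};

console.log(countPagesWithRedirects(sampleResult));
console.log(countPagesWithRedirects({}));
```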

const redirectLines = redirectFileContent.trim().split('\n').filter((line) => line.length > 0);

reporter.info(`[REDIRECTS] Redirect lines in file: ${redirectLines.length}`);
reporter.info(`[REDIRECTS] File size: ${fs.statSync(redirectFilePath).size} bytes`);
Member

Relevant?

Member Author

I think the obsession with file size came from my initial prompt when I stated "the file is empty", removed this along with other size checks

@kennethkalmer kennethkalmer force-pushed the WEB-4839-redirects-build-issue branch from c739d3f to 7fb4eeb December 9, 2025 23:13
@kennethkalmer
Member Author

> Given this is not product facing I'm not standing with both feet on this hill but I would invite you to glance back through and remove anything that isn't relevant, and determine whether any logic could be shared between the different SSR files as there's a lot added to them and a lot of additions tracing similar paths.

@jamiehenson I took your concerns to heart, thank you for pulling me up. After manually cleaning up several of the small issues I added another separate fixup that a) consolidates shared responsibilities, and b) trims back the fat on excessive feedback.

With this in place I can hopefully figure out what on earth is blocking #2992 (and other PRs)

This commit addresses WEB-4839 by implementing a multi-layered approach to
detect and prevent empty redirect files in CI builds.

Key changes:
1. Added comprehensive logging using Gatsby's reporter infrastructure
   - Track redirect initialization, write attempts, and counts
   - Log GraphQL query results showing pages with redirect_from
   - Report hash fragment filtering
   - Re-initialization detection with warnings

2. Post-write validation in createPages
   - Validates redirect file after Promise.all completes
   - Compares expected vs actual redirect counts
   - Uses reporter.panicOnBuild() to fail builds immediately if empty

3. Build validation hook in onPostBuild
   - Safety net before build artifacts are published
   - Validates file existence, size, content, and format
   - Checks minimum redirect threshold (50+)

4. Enhanced CI validation
   - Comprehensive bash checks for file content
   - Counts redirects and validates minimum threshold
   - Shows file details if validation fails

5. Error handling improvements
   - Try-catch around all file operations
   - Proper error reporting with context
   - Prevents silent failures

All logging now uses Gatsby's reporter for structured, colored output
consistent with the rest of the codebase. This makes it impossible for
empty redirect files to reach production silently.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@kennethkalmer kennethkalmer force-pushed the WEB-4839-redirects-build-issue branch from 7fb4eeb to 287357c December 11, 2025 09:39
@kennethkalmer kennethkalmer merged commit 739f937 into main Dec 11, 2025
7 checks passed
@kennethkalmer kennethkalmer deleted the WEB-4839-redirects-build-issue branch December 11, 2025 09:47

3 participants