Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Add manifest corruption prevention with configurable delays #1551

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions package.json
Original file line number Diff line number Diff line change
Expand Up @@ -263,6 +263,20 @@
}
}
}
},
"dbt.preRebuildDelay": {
"type": "number",
"default": 0,
"description": "Delay in milliseconds before rebuilding manifest after file changes. Increase this value (e.g. to 2000) if you experience manifest corruption issues.",
"minimum": 0,
"maximum": 10000
},
"dbt.manifestRebuildDebounce": {
"type": "number",
"default": 1000,
"description": "Debounce time in milliseconds for manifest rebuilds. Increase this value if you have many rapid file changes causing frequent rebuilds.",
"minimum": 500,
"maximum": 10000
}
}
}
Expand Down
52 changes: 50 additions & 2 deletions src/manifest/dbtProject.ts
Original file line number Diff line number Diff line change
Expand Up @@ -360,8 +360,56 @@ export class DBTProject implements Disposable {
this.projectRoot
}`,
);
await this.rebuildManifest();
}, this.dbtProjectIntegration.getDebounceForRebuildManifest()),

// Get configurable delays from settings with reasonable defaults
const preRebuildDelay = workspace
.getConfiguration("dbt")
.get<number>("preRebuildDelay", 0); // Default to 0ms for fast systems

const defaultDebounceTime = workspace
.getConfiguration("dbt")
.get<number>("manifestRebuildDebounce", 1000); // Default to 1s

// Send telemetry about the current delay settings
this.telemetry.sendTelemetryEvent("manifestRebuildSettings", {
preRebuildDelay: preRebuildDelay.toString(),
debounceTime: (
this.dbtProjectIntegration.getDebounceForRebuildManifest() ||
defaultDebounceTime
).toString(),
projectName: DBTProject.hashProjectRoot(this.projectRoot.fsPath),
});

const startTime = Date.now();

// Only add delay if configured (for systems experiencing issues)
if (preRebuildDelay > 0) {
await new Promise((resolve) =>
setTimeout(resolve, preRebuildDelay),
);
}

try {
await this.rebuildManifest();

// Track successful rebuilds and their timing
this.telemetry.sendTelemetryEvent("manifestRebuildSuccess", {
duration: (Date.now() - startTime).toString(),
preRebuildDelay: preRebuildDelay.toString(),
projectName: DBTProject.hashProjectRoot(this.projectRoot.fsPath),
});
} catch (error) {
// Track failed rebuilds
this.telemetry.sendTelemetryEvent("manifestRebuildError", {
duration: (Date.now() - startTime).toString(),
preRebuildDelay: preRebuildDelay.toString(),
projectName: DBTProject.hashProjectRoot(this.projectRoot.fsPath),
errorType: error instanceof Error ? error.name : "unknown",
errorMessage: error instanceof Error ? error.message : "unknown",
});
throw error; // Re-throw to maintain original error handling
}
}, this.dbtProjectIntegration.getDebounceForRebuildManifest() || defaultDebounceTime),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix undefined variable error.

The variable defaultDebounceTime is used but not defined, causing pipeline failures. Use the value retrieved from workspace configuration.

-        }, this.dbtProjectIntegration.getDebounceForRebuildManifest() || defaultDebounceTime),
+        }, this.dbtProjectIntegration.getDebounceForRebuildManifest() || workspace.getConfiguration("dbt").get<number>("manifestRebuildDebounce", 1000)),
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
}, this.dbtProjectIntegration.getDebounceForRebuildManifest() || defaultDebounceTime),
}, this.dbtProjectIntegration.getDebounceForRebuildManifest() || workspace.getConfiguration("dbt").get<number>("manifestRebuildDebounce", 1000)),
🧰 Tools
🪛 GitHub Check: test (ubuntu-latest)

[failure] 423-423:
Cannot find name 'defaultDebounceTime'.

🪛 GitHub Check: test (macos-latest)

[failure] 423-423:
Cannot find name 'defaultDebounceTime'.

🪛 GitHub Actions: Tests

[error] 423-423: Cannot find name 'defaultDebounceTime'. The variable or constant is not defined in the current scope.

),
);

Expand Down
137 changes: 128 additions & 9 deletions src/manifest/parsers/index.ts
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
import { existsSync, readFileSync } from "fs";
import { existsSync, readFileSync, statSync } from "fs";
import { provide } from "inversify-binding-decorators";
import * as path from "path";
import { Uri } from "vscode";
Expand Down Expand Up @@ -49,7 +49,7 @@
return;
}
const projectRoot = project.projectRoot;
const manifest = this.readAndParseManifest(projectRoot, targetPath);
const manifest = await this.readAndParseManifest(projectRoot, targetPath);
if (manifest === undefined) {
const event: ManifestCacheChangedEvent = {
added: [
Expand Down Expand Up @@ -187,7 +187,7 @@
return event;
}

private readAndParseManifest(projectRoot: Uri, targetPath: string) {
private async readAndParseManifest(projectRoot: Uri, targetPath: string) {
const pathParts = [targetPath];
if (!path.isAbsolute(targetPath)) {
pathParts.unshift(projectRoot.fsPath);
Expand All @@ -198,15 +198,134 @@
`Reading manifest at ${manifestLocation} for project at ${projectRoot}`,
);

// MANIFEST CORRUPTION PREVENTION
// The manifest file can become corrupted when it's read while dbt is still writing to it.
// This happens because dbt writes the manifest file in a single operation, but the file
// can be quite large (often >1MB). During this write operation, if we try to read the file,
// we may get a partially written manifest that is invalid JSON or missing critical data.

// To prevent this, we implement a progressive retry strategy:
// 1. First Attempt: Try immediate read for best performance
// 2. If that fails, use file size stability checks to ensure write is complete
// 3. Basic validation to verify manifest completeness

const maxRetries = 3;
const stabilityDelay = 1000; // 1 second between file size checks
const maxStabilityChecks = 3; // Number of times file size must remain stable

let lastAttemptError: any;
let lastFileSize: number = 0;
let stabilityCheckAttempts = 0;

// First attempt: Try immediate read for best performance
try {
if (!existsSync(manifestLocation)) {
throw new Error("Manifest file does not exist");
}
const stats = statSync(manifestLocation);
lastFileSize = stats.size;

const manifestFile = readFileSync(manifestLocation, "utf8");

Check failure

Code scanning / CodeQL

Potential file system race condition High

The file may have changed since it
was checked
.
return JSON.parse(manifestFile);
const parsed = JSON.parse(manifestFile);

if (!parsed.nodes || !parsed.sources || !parsed.macros) {
throw new Error("Manifest file appears to be incomplete");
}

// Fast path succeeded - no telemetry needed for successful first attempt
return parsed;
} catch (error) {
this.terminal.error(
"ManifestParser",
`Could not read manifest file at ${manifestLocation}, ignoring error`,
error,
);
// First attempt failed, log it and move to retry with stability checks
this.telemetry.sendTelemetryEvent("manifestReadFirstAttemptError", {
fileSize: lastFileSize.toString(),
errorType: error instanceof Error ? error.name : "unknown",
errorMessage: error instanceof Error ? error.message : "unknown",
});
lastAttemptError = error;
}

// If we get here, first attempt failed - try again with stability checks
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
// FILE SIZE STABILITY CHECK
// Only done on retry attempts after first attempt fails
let lastSize = -1;
let stableChecks = 0;
stabilityCheckAttempts = 0;

while (stableChecks < maxStabilityChecks) {
if (!existsSync(manifestLocation)) {
throw new Error("Manifest file does not exist");
}
const stats = statSync(manifestLocation);
const currentSize = stats.size;
lastFileSize = currentSize;

if (currentSize === lastSize) {
stableChecks++;
} else {
stableChecks = 0;
lastSize = currentSize;
}

stabilityCheckAttempts++;

if (stableChecks === maxStabilityChecks) {
break;
}
await new Promise((resolve) => setTimeout(resolve, stabilityDelay));
}
Comment on lines +261 to +282
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Potential infinite loop in stability check.

The stability check loop could potentially run indefinitely if the file size keeps changing. Consider adding a maximum duration or total number of checks to prevent this.

+        const maxTotalChecks = 10; // Maximum number of total checks
+        let totalChecks = 0;
         while (stableChecks < maxStabilityChecks) {
+          if (totalChecks >= maxTotalChecks) {
+            throw new Error("File size failed to stabilize within maximum attempts");
+          }
           if (!existsSync(manifestLocation)) {
             throw new Error("Manifest file does not exist");
           }
           const stats = statSync(manifestLocation);
           const currentSize = stats.size;
           lastFileSize = currentSize;

           if (currentSize === lastSize) {
             stableChecks++;
           } else {
             stableChecks = 0;
             lastSize = currentSize;
           }

           stabilityCheckAttempts++;
+          totalChecks++;

           if (stableChecks === maxStabilityChecks) {
             break;
           }
           await new Promise((resolve) => setTimeout(resolve, stabilityDelay));
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
while (stableChecks < maxStabilityChecks) {
if (!existsSync(manifestLocation)) {
throw new Error("Manifest file does not exist");
}
const stats = statSync(manifestLocation);
const currentSize = stats.size;
lastFileSize = currentSize;
if (currentSize === lastSize) {
stableChecks++;
} else {
stableChecks = 0;
lastSize = currentSize;
}
stabilityCheckAttempts++;
if (stableChecks === maxStabilityChecks) {
break;
}
await new Promise((resolve) => setTimeout(resolve, stabilityDelay));
}
const maxTotalChecks = 10; // Maximum number of total checks
let totalChecks = 0;
while (stableChecks < maxStabilityChecks) {
if (totalChecks >= maxTotalChecks) {
throw new Error("File size failed to stabilize within maximum attempts");
}
if (!existsSync(manifestLocation)) {
throw new Error("Manifest file does not exist");
}
const stats = statSync(manifestLocation);
const currentSize = stats.size;
lastFileSize = currentSize;
if (currentSize === lastSize) {
stableChecks++;
} else {
stableChecks = 0;
lastSize = currentSize;
}
stabilityCheckAttempts++;
totalChecks++;
if (stableChecks === maxStabilityChecks) {
break;
}
await new Promise((resolve) => setTimeout(resolve, stabilityDelay));
}


const manifestFile = readFileSync(manifestLocation, "utf8");
const parsed = JSON.parse(manifestFile);

if (!parsed.nodes || !parsed.sources || !parsed.macros) {
throw new Error("Manifest file appears to be incomplete");
}

// Log successful retry with stability checks
this.telemetry.sendTelemetryEvent("manifestReadRetrySuccess", {
retryAttempt: attempt.toString(),
fileSize: lastFileSize.toString(),
stabilityChecks: stabilityCheckAttempts.toString(),
});

return parsed;
} catch (error) {
lastAttemptError = error;
this.terminal.error(
"ManifestParser",
`Attempt ${attempt + 1}/${maxRetries} failed to read manifest file at ${manifestLocation}`,
error,
);

this.telemetry.sendTelemetryEvent("manifestReadError", {
attempt: attempt.toString(),
fileSize: lastFileSize.toString(),
errorType: error instanceof Error ? error.name : "unknown",
errorMessage: error instanceof Error ? error.message : "unknown",
stabilityChecks: stabilityCheckAttempts.toString(),
});

if (attempt === maxRetries - 1) {
this.telemetry.sendTelemetryEvent("manifestReadFinalFailure", {
totalAttempts: (maxRetries + 1).toString(), // +1 for first attempt
finalFileSize: lastFileSize.toString(),
finalErrorType:
lastAttemptError instanceof Error
? lastAttemptError.name
: "unknown",
finalErrorMessage:
lastAttemptError instanceof Error
? lastAttemptError.message
: "unknown",
totalStabilityChecks: stabilityCheckAttempts.toString(),
});

return undefined;
}
await new Promise((resolve) => setTimeout(resolve, 2000));
}
}
}
}
Expand Down
Loading