🔄 Fetch with Retry for HTML build #1793

Merged
merged 1 commit into from
Jan 20, 2025
5 changes: 5 additions & 0 deletions .changeset/late-phones-jam.md
@@ -0,0 +1,5 @@
+---
+"myst-cli": patch
+---
+
+Retry html pages build and limit initial outgoing connections.
44 changes: 25 additions & 19 deletions packages/myst-cli/src/build/html/index.ts
@@ -9,6 +9,10 @@ import type { StartOptions } from '../site/start.js';
 import { startServer } from '../site/start.js';
 import { getSiteTemplate } from '../site/template.js';
 import { slugToUrl } from 'myst-common';
+import pLimit from 'p-limit';
+import { fetchWithRetry } from '../../utils/fetchWithRetry.js';
+
+const limitConnections = pLimit(5);

 export async function currentSiteRoutes(
   session: ISession,
@@ -141,25 +145,27 @@ export async function buildHtml(session: ISession, opts: StartOptions) {

   // Fetch all HTML pages and assets by the template
   await Promise.all(
-    routes.map(async (route) => {
-      const resp = await session.fetch(route.url);
-      if (!resp.ok) {
-        session.log.error(`Error fetching ${route.url}`);
-        return;
-      }
-      if (route.binary && resp.body) {
-        await new Promise<void>((resolve) => {
-          const filename = path.join(htmlDir, route.path);
-          if (!fs.existsSync(filename)) fs.mkdirSync(path.dirname(filename), { recursive: true });
-          const fileWriteStream = fs.createWriteStream(filename);
-          resp.body!.pipe(fileWriteStream);
-          fileWriteStream.on('finish', resolve);
-        });
-      } else {
-        const content = await resp.text();
-        writeFileToFolder(path.join(htmlDir, route.path), content);
-      }
-    }),
+    routes.map(async (route) =>
+      limitConnections(async () => {
+        const resp = await fetchWithRetry(session, route.url);

Comment on lines +149 to +150

The main changes are here: outgoing connections are limited to 5 at a time, and each fetch is retried with exponential backoff.

+        if (!resp.ok) {
+          session.log.error(`Error fetching ${route.url}`);
+          return;
+        }
+        if (route.binary && resp.body) {
+          await new Promise<void>((resolve) => {
+            const filename = path.join(htmlDir, route.path);
+            if (!fs.existsSync(filename)) fs.mkdirSync(path.dirname(filename), { recursive: true });
+            const fileWriteStream = fs.createWriteStream(filename);
+            resp.body!.pipe(fileWriteStream);
+            fileWriteStream.on('finish', resolve);
+          });
+        } else {
+          const content = await resp.text();
+          writeFileToFolder(path.join(htmlDir, route.path), content);
+        }
+      }),
+    ),
   );
   appServer.stop();
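
The connection limit in this hunk can be illustrated without the `p-limit` dependency. The sketch below is a minimal, hypothetical stand-in: `createLimiter` and `demo` are names invented here and are not part of the PR or of `p-limit`, and short timers stand in for real page fetches. It only aims to show the idea of capping how many tasks run at once.

```typescript
// Hypothetical stand-in for p-limit (an assumption for illustration only):
// returns a function that runs at most `concurrency` async tasks at a time.
function createLimiter(concurrency: number) {
  let active = 0;
  const queue: Array<() => void> = [];

  // Called when a task settles: free the slot and start the next queued task.
  const next = (): void => {
    active--;
    const run = queue.shift();
    if (run) run();
  };

  return function limit<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = (): void => {
        active++;
        Promise.resolve()
          .then(task)
          .then(
            (value) => {
              next();
              resolve(value);
            },
            (error) => {
              next();
              reject(error);
            },
          );
      };
      if (active < concurrency) run();
      else queue.push(run);
    });
  };
}

// Simulates 12 "routes" with a limit of 5 and records the peak concurrency.
async function demo(): Promise<{ results: number[]; maxInFlight: number }> {
  const limit = createLimiter(5);
  let inFlight = 0;
  let maxInFlight = 0;
  const routes = Array.from({ length: 12 }, (_, i) => i);
  const results = await Promise.all(
    routes.map((i) =>
      limit(async () => {
        inFlight++;
        maxInFlight = Math.max(maxInFlight, inFlight);
        // A timer stands in for the real network request.
        await new Promise<void>((r) => setTimeout(r, 5));
        inFlight--;
        return i * 2;
      }),
    ),
  );
  return { results, maxInFlight };
}
```

As in the diff, every route is still mapped to a promise up front and collected with `Promise.all`; the limiter only delays when each task starts, so result order is unchanged.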

45 changes: 45 additions & 0 deletions packages/myst-cli/src/utils/fetchWithRetry.ts
@@ -0,0 +1,45 @@
+import type { RequestInfo, RequestInit, Response } from 'node-fetch';
+import type { ISession } from '../session/types.js';
+
+/**
+ * Recursively fetch a URL with retry and exponential backoff.
+ */
+export async function fetchWithRetry(
+  session: Pick<ISession, 'log' | 'fetch'>,
+  /** The URL to fetch. */
+  url: URL | RequestInfo,
+  /** Options to pass to fetch (e.g., headers, method). */
+  options?: RequestInit,
+  /** How many times total to attempt the fetch. */
+  maxRetries = 3,
+  /** The current attempt number. */
+  attempt = 1,
+  /** The current backoff duration in milliseconds. */
+  backoff = 250,
+): Promise<Response> {
+  try {
+    const resp = await session.fetch(url, options);
+    if (resp.ok) {
+      // If it's a 2xx response, we consider it a success and return it
+      return resp;
+    } else {
+      // For non-2xx, we treat it as a failure that triggers a retry
+      session.log.warn(
+        `Fetch of ${url} failed with HTTP status ${resp.status} for URL: ${url} (Attempt #${attempt})`,
+      );
+    }
+  } catch (error) {
+    // This covers network failures and other errors that cause fetch to reject
+    session.log.warn(`Fetch of ${url} threw an error (Attempt #${attempt})`, error);
+  }
+
+  // If we haven't reached the max retries, wait and recurse
+  if (attempt < maxRetries) {
+    session.log.debug(`Waiting ${backoff}ms before retry #${attempt + 1}...`);
+    await new Promise((resolve) => setTimeout(resolve, backoff));
+    return fetchWithRetry(session, url, options, maxRetries, attempt + 1, backoff * 2);
+  }
+
+  // If we made it here, all retries have been exhausted
+  throw new Error(`Failed to fetch ${url} after ${maxRetries} attempts.`);
+}
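
To make the backoff timing concrete, the following sketch lists the waits that `fetchWithRetry` would insert between attempts if every attempt failed. `backoffSchedule` is a hypothetical helper written for this illustration and is not part of the PR; its defaults mirror `maxRetries = 3` and `backoff = 250` from the function above.

```typescript
// Hypothetical helper (not in the PR): the delays, in milliseconds, that
// would occur between attempts when every attempt fails.
function backoffSchedule(maxRetries = 3, initialBackoff = 250): number[] {
  const delays: number[] = [];
  let backoff = initialBackoff;
  // Mirrors the `attempt < maxRetries` check: no wait follows the last attempt.
  for (let attempt = 1; attempt < maxRetries; attempt++) {
    delays.push(backoff);
    backoff *= 2; // the recursive call passes `backoff * 2`
  }
  return delays;
}

const delays = backoffSchedule();
const totalWait = delays.reduce((sum, ms) => sum + ms, 0);
// delays: [250, 500], totalWait: 750
console.log(delays, totalWait);
```

With the defaults, a route that keeps failing is attempted three times with 250 ms and 500 ms pauses in between. The function only returns 2xx responses; once the attempts are exhausted it throws.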
1 change: 1 addition & 0 deletions packages/myst-cli/src/utils/index.ts
@@ -16,6 +16,7 @@ export * from './toc.js';
 export * from './uniqueArray.js';
 export * from './github.js';
 export * from './whiteLabelling.js';
+export * from './fetchWithRetry.js';

 export * as ffmpeg from './ffmpeg.js';
 export * as imagemagick from './imagemagick.js';
