Build fails on MacOS for large site because max proc limit is exceeded #10348

Open
7 tasks done
t1m0thyj opened this issue Jul 29, 2024 · 7 comments · May be fixed by #10358
Labels
bug An error in the Docusaurus core causing instability or issues with its execution

Comments

@t1m0thyj

Have you read the Contributing Guidelines on issues?

Prerequisites

  • I'm using the latest version of Docusaurus.
  • I have tried the npm run clear or yarn clear command.
  • I have tried rm -rf node_modules yarn.lock package-lock.json and re-installing packages.
  • I have tried creating a repro with https://new.docusaurus.io.
  • I have read the console error message carefully (if applicable).

Description

After updating from Docusaurus v3.1 to v3.4, building a large site fails with the following error on MacOS:

[ERROR] Error: Unable to build website for locale en.
    at tryToBuildLocale (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/commands/build.js:54:19)
    at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/commands/build.js:65:9
    at async mapAsyncSequential (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/jsUtils.js:20:24)
    at async Command.build (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/commands/build.js:63:5) {
  [cause]: Error: Can't process doc metadata for doc at path path=/Users/timothy/Projects/zowe/docs-site/versioned_docs/version-v2.11.x/user-guide/cli-using-formatting-environment-variables.md in version name=v2.11.x
      at processDocMetadata (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/docs.js:146:15)
      at async Promise.all (index 85)
      ... 4 lines matching cause stack trace ...
      at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/plugins/plugins.js:38:23
      at async Promise.all (index 0)
      at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/plugins/plugins.js:139:25
      at async loadSite (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/site.js:135:45) {
    [cause]: Error: An error occurred when trying to get the last update date
        at getGitLastUpdate (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/lastUpdateUtils.js:43:19)
        at async readLastUpdateData (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/lastUpdateUtils.js:80:36)
        at async doProcessDocMetadata (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/docs.js:48:24)
        at async processDocMetadata (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/docs.js:143:16)
        at async Promise.all (index 85)
        at async doLoadVersion (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/index.js:121:34)
        at async loadVersion (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/index.js:162:28)
        at async Promise.all (index 6)
        at async Object.loadContent (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/plugin-content-docs/lib/index.js:170:33)
        at async /Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/core/lib/server/plugins/plugins.js:38:23 {
      [cause]: Error [ShellJSInternalError]: spawn EBADF
          at ChildProcess.spawn (node:internal/child_process:421:11)
          at spawn (node:child_process:761:9)
          at Object.execFile (node:child_process:351:17)
          at Object.exec (node:child_process:234:25)
          at execAsync (/Users/timothy/Projects/zowe/docs-site/node_modules/shelljs/src/exec.js:136:17)
          at Object._exec (/Users/timothy/Projects/zowe/docs-site/node_modules/shelljs/src/exec.js:221:12)
          at Object.exec (/Users/timothy/Projects/zowe/docs-site/node_modules/shelljs/src/common.js:335:23)
          at result (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/gitUtils.js:46:27)
          at new Promise (<anonymous>)
          at getFileCommitDate (/Users/timothy/Projects/zowe/docs-site/node_modules/@docusaurus/utils/lib/gitUtils.js:45:26) {
        errno: -9,
        code: 'EBADF',
        syscall: 'spawn'
      }
    }
  }
}

Reproducible demo

No response

Steps to reproduce

  1. Upgrade to Docusaurus v3.2 or later (e.g. https://github.com/zowe/docs-site/pull/3785/files#diff-7ae45ad102eab3b6d7e7896acd08c427a9b25b346470d7bc6507b6481575d519)
  2. Run docusaurus build

Expected behavior

The build succeeds

Actual behavior

The build fails with an EBADF error like the one above. The same build works on other operating systems such as Ubuntu, so the failure seems to be related to the maximum number of processes supported by macOS.

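I was able to work around the issue by limiting the number of concurrent git processes to 100. I replaced the following line:

return Promise.all(docFiles.map(processVersionDoc));

with:

  // Process the docs in batches of 100 so that at most 100 git processes run at once.
  // splice() removes each batch from docFiles, so the loop ends when the array is empty.
  const results = [];
  while (docFiles.length) {
    results.push(...await Promise.all(docFiles.splice(0, 100).map(processVersionDoc)));
  }
  return results;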

I'm willing to open a PR if the fix is simple, but not sure what the preferred solution would be - is a hardcoded limit like 100 ok, or should it be configurable in the Docusaurus config?
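As a point of comparison for the batch loop above, here is a minimal sketch of a sliding-window variant. The batch loop waits for the slowest item in each group of 100 before starting the next group, whereas a small worker pool starts a new item as soon as any slot frees up. The helper name `mapWithConcurrency` is hypothetical and is not part of Docusaurus.

```javascript
// Hypothetical helper: maps `items` through an async `fn`, running at most
// `limit` calls at any one time. Results keep the order of the input array.
async function mapWithConcurrency(items, limit, fn) {
  const results = new Array(items.length);
  let nextIndex = 0;

  // Each worker repeatedly claims the next unprocessed index until none remain.
  async function worker() {
    while (nextIndex < items.length) {
      const index = nextIndex++;
      results[index] = await fn(items[index], index);
    }
  }

  // Never start more workers than there are items, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}
```

Under these assumptions, the workaround would become `return mapWithConcurrency(docFiles, 100, processVersionDoc);`, which also leaves `docFiles` unmodified. If one call rejects, the returned promise rejects, while calls already in flight are left to finish.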

Your environment

Self-service

  • I'd be willing to fix this bug myself.
t1m0thyj added the "bug" and "status: needs triage" labels on Jul 29, 2024
@Josh-Cena
Collaborator

I think a hardcoded limit is OK; at most, we could add an environment variable to customize it.
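A minimal sketch of what "hardcoded limit plus an environment variable" could look like. The variable name `DOCUSAURUS_GIT_CONCURRENCY` and the function `getGitConcurrency` are invented for illustration; nothing in this thread confirms that Docusaurus reads such a variable.

```javascript
// Default taken from the workaround in the issue description.
const DEFAULT_GIT_CONCURRENCY = 100;

// Hypothetical: DOCUSAURUS_GIT_CONCURRENCY is an assumed name, not a real Docusaurus setting.
// Returns the override if it is a positive integer, otherwise the hardcoded default.
function getGitConcurrency(env = process.env) {
  const raw = env.DOCUSAURUS_GIT_CONCURRENCY ?? '';
  // Accept plain decimal digits only, so values like "12abc" or "-5" are rejected.
  if (!/^\d+$/.test(raw)) {
    return DEFAULT_GIT_CONCURRENCY;
  }
  const parsed = Number(raw);
  return parsed > 0 ? parsed : DEFAULT_GIT_CONCURRENCY;
}
```

Falling back to the default on invalid input, rather than throwing, keeps a mistyped value from breaking the build; a stricter design could log a warning instead.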

@slorber
Collaborator

slorber commented Jul 30, 2024

Left my initial review here: #10354 (review)

To be honest, although the proposed solution might fix the problem, and we might still want to limit git concurrency, I believe it only hides the real bug here: shelljs (an unmaintained dependency) probably doesn't handle file descriptors well under concurrent access.

I'd like to explore switching to alternatives first, before introducing IO queueing:

@Josh-Cena
Collaborator

I agree with that. Shall we get rid of shelljs completely in favor of execa?

However I don't know why file descriptors are related. The problem here is hitting the process limit, not the FD limit, no?

@slorber
Collaborator

slorber commented Aug 1, 2024

I'm not 100% sure but it seems "EBADF" means "Error, bad file descriptor" in Node.js
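For reference, EBADF is the POSIX error "Bad file descriptor", and the code can be checked against the `errno: -9` field in the stack trace using Node.js built-ins. This is only a quick lookup sketch; the numeric value shown in the comment holds on Linux and macOS and may differ on other platforms.

```javascript
const os = require('node:os');
const util = require('node:util');

// Platform errno value for EBADF (9 on Linux and macOS).
const ebadf = os.constants.errno.EBADF;

// Node.js reports system errors with a negated errno, which is why the trace shows -9.
const errorName = util.getSystemErrorName(-ebadf);

console.log(ebadf, errorName);
```

The lookup confirms what the code means, but not why `spawn` produced it; that question is what the rest of the thread tries to settle.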

I propose that we first remove shelljs and see if the problem disappears by testing upgrading the zowe site to a canary?

@t1m0thyj
Author

t1m0thyj commented Aug 1, 2024

> I'm not 100% sure but it seems "EBADF" means "Error, bad file descriptor" in Node.js
>
> I propose that we first remove shelljs and see if the problem disappears by testing upgrading the zowe site to a canary?

If shelljs is unmaintained then I'm all for getting rid of it. Although the error code does seem related to file descriptors, I don't think the max file descriptor limit is being hit because I tried increasing it with ulimit.

Regardless of what dependency is being used to spawn processes, IMO it's not good practice to launch hundreds or potentially thousands of processes at once, without a promise queue in place to limit the number of concurrent processes.

@slorber
Collaborator

slorber commented Aug 1, 2024

> > I'm not 100% sure but it seems "EBADF" means "Error, bad file descriptor" in Node.js
> > I propose that we first remove shelljs and see if the problem disappears by testing upgrading the zowe site to a canary?
>
> If shelljs is unmaintained then I'm all for getting rid of it. Although the error code does seem related to file descriptors, I don't think the max file descriptor limit is being hit because I tried increasing it with ulimit.

Thanks, was about to ask.

> Regardless of what dependency is being used to spawn processes, IMO it's not good practice to launch hundreds or potentially thousands of processes at once, without a promise queue in place to limit the number of concurrent processes.

Agree, but this is a general problem we have, not just limited to Git commands but to all IOs in general.
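Since the concern here covers all IO rather than only git commands, one possible shape for a shared limit is a small queue that any async task can be passed through. The sketch below is an assumption-laden illustration, not an existing Docusaurus utility; `createLimiter` is a hypothetical name.

```javascript
// Hypothetical limiter: returns a `limit(task)` function that runs at most
// `maxConcurrent` tasks at once and queues the rest in FIFO order.
function createLimiter(maxConcurrent) {
  let active = 0;
  const queue = [];

  // Starts the next queued task if a slot is free.
  function next() {
    if (active >= maxConcurrent || queue.length === 0) {
      return;
    }
    active++;
    const { task, resolve, reject } = queue.shift();
    // Wrapping in an async function turns a synchronous throw into a rejection.
    (async () => task())()
      .then(resolve, reject)
      .finally(() => {
        active--;
        next();
      });
  }

  return function limit(task) {
    return new Promise((resolve, reject) => {
      queue.push({ task, resolve, reject });
      next();
    });
  };
}
```

A single limiter instance could then wrap different kinds of work, for example `limit(() => readSomeFile(path))` and `limit(() => runGit(args))` (both placeholder names), so that they share one concurrency budget.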


I tried building your branch locally zowe/docs-site#3785

I got many warnings (broken links, admonitions) but was able to build without such EBADF error (I'm on MacOS M3 Sonoma)

Only you will be able to confirm whether moving to Execa fixes it. You can try to apply this change locally:

https://github.com/facebook/docusaurus/pull/10358/files#diff-cb75564637c0cca6a5c3d3eceb846ae064def7c90424de1df0bd110c3fc23b14R133
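To illustrate the general idea of spawning git directly instead of going through shelljs, here is a rough sketch that uses Node's built-in `child_process.execFile` as a stand-in for execa. It is not the code from the linked PR; the helper names and the exact git arguments are assumptions made for this example.

```javascript
const { execFile } = require('node:child_process');
const { promisify } = require('node:util');

const execFileAsync = promisify(execFile);

// Runs an executable directly (no intermediate shell) and resolves with trimmed stdout.
async function run(file, args, options = {}) {
  const { stdout } = await execFileAsync(file, args, options);
  return stdout.trim();
}

// Hypothetical helper: Unix timestamp (in seconds) of the last commit touching `filePath`,
// or null if git prints nothing. The arguments are illustrative, not necessarily
// the ones Docusaurus passes.
async function getLastCommitTimestamp(filePath, cwd = process.cwd()) {
  const out = await run('git', ['log', '--max-count=1', '--format=%ct', '--', filePath], { cwd });
  return out === '' ? null : Number(out);
}
```

Passing arguments as an array avoids shell quoting problems with unusual file names. Whether this alone avoids the EBADF error is exactly what needs testing on the affected site.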

@t1m0thyj
Author

t1m0thyj commented Aug 2, 2024

> I tried building your branch locally zowe/docs-site#3785
>
> I got many warnings (broken links, admonitions) but was able to build without such EBADF error (I'm on MacOS M3 Sonoma)
>
> Only you will be able to confirm whether moving to Execa fixes it. You can try to apply this change locally:
>
> https://github.com/facebook/docusaurus/pull/10358/files#diff-cb75564637c0cca6a5c3d3eceb846ae064def7c90424de1df0bd110c3fc23b14R133

We archived some old doc versions to reduce the number of files in the repo and work around the issue. I'll test the changes next week on a branch that still has the old files.

Thanks for the reminder about broken links, there is WIP to fix them 🙂
