feat: worker pool that can execute tasks on all workers (#32120)
* feat: create gatsby-worker package

* chore(gatsby): convert ssr and PQR workerpool to gatsby-worker

* chore(gatsby): convert dev-ssr to gatsby-worker

* test(gatsby-cli): one test at a time

* some code shuffling to avoid some error throwing when importing child (preparation for messaging API)

* drop commented out console.log ( from review comments )

* port debug|inspect cli flag handling from jest-worker

* drop skipLibCheck from tsconfig

* address Vlad's comment

* add some jsdocs

* mention why gatsby-worker only exposes functions

* document isWorker export

* add comments about usage of JEST_WORKER_ID, which is used by tests even though we ourselves will be using gatsby-worker
pieh authored Jun 29, 2021
1 parent 7c343ff commit a82f6db
Showing 33 changed files with 1,601 additions and 135 deletions.
2 changes: 1 addition & 1 deletion integration-tests/gatsby-cli/package.json
@@ -6,7 +6,7 @@
   },
   "license": "MIT",
   "scripts": {
-    "test": "jest"
+    "test": "jest -w 1"
   },
   "devDependencies": {
     "babel-jest": "^24.0.0",
3 changes: 3 additions & 0 deletions packages/gatsby-worker/.babelrc
@@ -0,0 +1,3 @@
{
  "presets": [["babel-preset-gatsby-package"]]
}
2 changes: 2 additions & 0 deletions packages/gatsby-worker/.gitignore
@@ -0,0 +1,2 @@
/node_modules
/dist
36 changes: 36 additions & 0 deletions packages/gatsby-worker/.npmignore
@@ -0,0 +1,36 @@
# Logs
logs
*.log

# Runtime data
pids
*.pid
*.seed

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage

# Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files)
.grunt

# node-waf configuration
.lock-wscript

# Compiled binary addons (http://nodejs.org/api/addons.html)
build/Release

# Dependency directory
# https://www.npmjs.org/doc/misc/npm-faq.html#should-i-check-my-node_modules-folder-into-git
node_modules
*.un~
yarn.lock
src
flow-typed
coverage
decls
examples
.babelrc
tsconfig.json
125 changes: 125 additions & 0 deletions packages/gatsby-worker/README.md
@@ -0,0 +1,125 @@
# gatsby-worker

Utility to execute tasks in forked processes. Heavily inspired by [`jest-worker`](https://www.npmjs.com/package/jest-worker).

## Example

File `worker.ts`:

```ts
export async function heavyTask(param: string): Promise<string> {
  // using workers is ideal for CPU intensive tasks
  return await heavyProcessing(param)
}

export async function setupStep(param: string): Promise<void> {
  await heavySetup(param)
}
```

File `parent.ts`:

```ts
import { WorkerPool } from "gatsby-worker"

const workerPool = new WorkerPool<typeof import("./worker")>(
  require.resolve(`./worker`),
  {
    numWorkers: 5,
    env: {
      CUSTOM_ENV_VAR_TO_SET_IN_WORKER: `foo`,
    },
  }
)

// queue a task on all workers
const arrayOfPromises = workerPool.all.setupStep(`bar`)

// queue a task on a single worker
const singlePromise = workerPool.single.heavyTask(`baz`)
```

## API

### Constructor

```ts
// TypeOfWorkerModule types the exposed functions, ensuring type safety.
// It converts sync methods to async and discards/disallows usage of exports
// that are not functions. Recommended usage: `<typeof import("path_to_worker_module")>`.
const workerPool = new WorkerPool<TypeOfWorkerModule>(
  // Absolute path to the worker module. Recommended to use with `require.resolve`
  workerPath: string,
  // Optional settings
  options?: {
    // Number of workers to spawn. Defaults to `1` if not defined.
    numWorkers?: number
    // Additional env vars to set in the worker. The worker inherits env vars of the parent process
    // as well as an additional `GATSBY_WORKER_ID` env var (starting with "1" for the first worker)
    env?: Record<string, string>
  }
)
```
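
The comments above say the type parameter converts sync methods to async and drops exports that are not functions. A mapped type along the following lines could express that. This is only a sketch: `FunctionKeys`, `Promisified`, `AsyncOnly`, and `wrapInProcess` are hypothetical names, not gatsby-worker's actual types, and the wrapper runs everything in the current process rather than in forked workers.

```ts
// Hypothetical helper types, for illustration only (not gatsby-worker's real type names).

// Keys of T whose values are functions; all other keys map to `never` and drop out.
type FunctionKeys<T> = {
  [K in keyof T]: T[K] extends (...args: any[]) => any ? K : never
}[keyof T]

// Turns `(...args) => R` into `(...args) => Promise<R>` without double-wrapping promises.
type Promisified<F> = F extends (...args: infer A) => infer R
  ? (...args: A) => Promise<R extends Promise<infer U> ? U : R>
  : never

type AsyncOnly<T> = { [K in FunctionKeys<T>]: Promisified<T[K]> }

// Stand-in for a worker module with a sync export, an async export, and a non-function export.
const exampleModule = {
  sync: (a: string): string => `foo ${a}`,
  async: async (a: string): Promise<string> => `foo ${a}`,
  notAFunction: `string`,
}

// In-process stand-in for a pool: keeps only function exports and makes each return a promise.
function wrapInProcess<T extends Record<string, unknown>>(mod: T): AsyncOnly<T> {
  const wrapped: Record<string, unknown> = {}
  for (const key of Object.keys(mod)) {
    const value = mod[key]
    if (typeof value === "function") {
      wrapped[key] = async (...args: unknown[]): Promise<unknown> => value(...args)
    }
  }
  return wrapped as unknown as AsyncOnly<T>
}

const wrappedModule = wrapInProcess(exampleModule)
// The sync export now returns a promise.
const pending: Promise<string> = wrappedModule.sync(`bar`)
// wrappedModule.notAFunction // type error: non-function exports are not exposed
```

Passing `typeof import("./worker")` as the type parameter presumably gives the same kind of compile-time check for the real pool: calls to exports that are not functions fail to type-check instead of failing at runtime.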

### `.single`

```ts
// Exports of the worker module become available under the `.single` property of a `WorkerPool` instance.
// Calling one either starts executing immediately if any worker is idle, or queues the task
// to be executed once a worker becomes idle.
const singlePromise = workerPool.single.heavyTask(`baz`)
```
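
The run-now-or-queue behavior described above can be modeled with a counter of idle workers and a FIFO queue. The following is an in-process sketch of that idea, not gatsby-worker's implementation: `SketchScheduler` and its members are hypothetical names, and tasks here are plain async functions rather than messages sent to forked processes.

```ts
// Hypothetical in-process model of "run now if a worker is idle, otherwise queue".
type Task<T> = () => Promise<T>

class SketchScheduler {
  private idle: number
  private queue: Array<() => void> = []

  constructor(numWorkers: number) {
    this.idle = numWorkers
  }

  get idleWorkers(): number {
    return this.idle
  }

  get queuedTasks(): number {
    return this.queue.length
  }

  run<T>(task: Task<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      // Marks one worker as free again and hands it the oldest queued task, if any.
      const release = (): void => {
        this.idle++
        const next = this.queue.shift()
        if (next) next()
      }
      const start = (): void => {
        this.idle--
        task().then(
          value => {
            release()
            resolve(value)
          },
          error => {
            release()
            reject(error)
          }
        )
      }
      if (this.idle > 0) start()
      else this.queue.push(start)
    })
  }
}

// Tasks that never settle keep their worker busy, which makes the state easy to inspect.
const scheduler = new SketchScheduler(2)
const neverSettles: Task<void> = () => new Promise<void>(() => {})
scheduler.run(neverSettles) // starts immediately
scheduler.run(neverSettles) // starts immediately
scheduler.run(neverSettles) // queued: both workers are busy
```

With two workers and three pending tasks, the third call waits in the queue until one of the first two settles.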

### `.all`

```ts
// Exports of the worker module become available under the `.all` property of a `WorkerPool` instance.
// Calling one ensures the function is executed on all workers. This is best suited to
// setup/bootstrap of workers.
const arrayOfPromises = workerPool.all.setupStep(`baz`)
```
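
Because `.all` returns one promise per worker, a setup step can succeed on some workers and fail on others. The sketch below shows one way to collect per-worker outcomes instead of stopping at the first rejection. `settle` and `Outcome` are hypothetical helpers, and the promises are created locally as stand-ins for what `workerPool.all.setupStep(...)` would return.

```ts
// Hypothetical helper: converts a promise into one that always resolves with an outcome record.
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown }

function settle<T>(promise: Promise<T>): Promise<Outcome<T>> {
  return promise.then(
    (value): Outcome<T> => ({ ok: true, value }),
    (error): Outcome<T> => ({ ok: false, error })
  )
}

// Stand-ins for the array returned by `workerPool.all.setupStep(...)` with 3 workers,
// where the second worker fails.
const perWorker: Array<Promise<string>> = [
  Promise.resolve(`ready (worker #1)`),
  Promise.reject(new Error(`setup failed (worker #2)`)),
  Promise.resolve(`ready (worker #3)`),
]

const outcomes: Promise<Array<Outcome<string>>> = Promise.all(perWorker.map(settle))

outcomes.then(results => {
  const failed = results.filter(result => !result.ok).length
  console.log(`${results.length - failed} workers ready, ${failed} failed`)
  // logs "2 workers ready, 1 failed"
})
```

When any failure should abort the whole setup, a plain `await Promise.all(arrayOfPromises)` is enough, since it rejects as soon as one worker's promise rejects.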

### `.end`

```ts
// Used to shut down the `WorkerPool`. Promises for any in-progress or queued tasks will be rejected, as those tasks won't be able to complete.
const arrayOfPromises = workerPool.end()
```

### `isWorker`

```ts
// Determine whether the current code is running in a worker. Useful for conditional handling depending on context.
import { isWorker } from "gatsby-worker"

if (isWorker) {
  // this is executed in worker context
} else {
  // this is NOT executed in worker context
}
```
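
The constructor options note that each worker also receives a `GATSBY_WORKER_ID` env var. One possible use, sketched below, is deriving per-worker resource names so that workers don't overwrite each other's files. `parseWorkerId` and `scratchFileName` are hypothetical helpers, and the env object is passed in explicitly so the sketch runs anywhere; real code would pass `process.env`. How `isWorker` itself is computed is not described in this README, so the sketch does not try to reproduce it.

```ts
type Env = Record<string, string | undefined>

// Hypothetical helper: returns the numeric worker id, or undefined when the var is absent or malformed.
function parseWorkerId(env: Env): number | undefined {
  const raw = env.GATSBY_WORKER_ID
  if (raw === undefined || !/^\d+$/.test(raw)) return undefined
  return Number(raw)
}

// Hypothetical helper: a file name unique to the current worker, or a shared one in the parent.
function scratchFileName(env: Env): string {
  const id = parseWorkerId(env)
  return id === undefined ? `scratch-parent.json` : `scratch-worker-${id}.json`
}

console.log(scratchFileName({ GATSBY_WORKER_ID: `1` })) // scratch-worker-1.json
console.log(scratchFileName({})) // scratch-parent.json
```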

## Usage with unit tests

If you are working with source files that need transpilation, you will need to make it possible to load untranspiled modules in child processes.
This can be done with `@babel/register` (or similar depending on your build toolchain). Example setup:

```ts
const testWorkerPool = new WorkerPool<WorkerModuleType>(workerModule, {
  numWorkers,
  env: {
    NODE_OPTIONS: `--require ${require.resolve(`./ts-register`)}`,
  },
})
```

This executes an additional module before the worker module is loaded, which lets you add runtime support for new JavaScript syntax or for TypeScript. Example `ts-register.js`:

```js
// spawned process won't use jest config (or other testing framework equivalent) to support TS, so we need to add support ourselves
require(`@babel/register`)({
  extensions: [`.js`, `.ts`],
  configFile: require.resolve(relativePathToYourBabelConfig),
  ignore: [/node_modules/],
})
```
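
Two details of the `NODE_OPTIONS` setup may need care. Assuming the `env` option overrides inherited variables of the same name, a `NODE_OPTIONS` value already set in the parent would be lost, and a resolved path containing spaces would be split into separate arguments. The helper below is one way to handle both; `appendRequire` is a hypothetical name. It relies on Node's documented behavior that `NODE_OPTIONS` values containing spaces can be wrapped in double quotes, and it does not handle paths that themselves contain double quotes or backslashes (for example Windows paths), which would need extra escaping.

```ts
// Hypothetical helper: appends `--require <modulePath>` to an existing NODE_OPTIONS value.
function appendRequire(existing: string | undefined, modulePath: string): string {
  // Quote the path when it contains whitespace so Node treats it as a single value.
  const quoted = /\s/.test(modulePath) ? `"${modulePath}"` : modulePath
  const flag = `--require ${quoted}`
  return existing && existing.trim() !== `` ? `${existing.trim()} ${flag}` : flag
}

console.log(appendRequire(undefined, `/repo/ts-register.js`))
// --require /repo/ts-register.js
console.log(appendRequire(`--max-old-space-size=4096`, `/my repo/ts-register.js`))
// --max-old-space-size=4096 --require "/my repo/ts-register.js"
```

In the pool setup this could be used as ``NODE_OPTIONS: appendRequire(process.env.NODE_OPTIONS, require.resolve(`./ts-register`))``.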
41 changes: 41 additions & 0 deletions packages/gatsby-worker/package.json
@@ -0,0 +1,41 @@
{
  "name": "gatsby-worker",
  "description": "Utility to create worker pools",
  "version": "0.0.0-next.0",
  "author": "Michal Piechowiak<misiek.piechowiak@gmail.com>",
  "bugs": {
    "url": "https://github.com/gatsbyjs/gatsby/issues"
  },
  "dependencies": {
    "@babel/core": "^7.14.0"
  },
  "devDependencies": {
    "@babel/cli": "^7.14.0",
    "@babel/register": "^7.14.0",
    "babel-preset-gatsby-package": "^1.9.0-next.0",
    "cross-env": "^7.0.3",
    "rimraf": "^3.0.2",
    "typescript": "^4.1.5"
  },
  "homepage": "https://github.com/gatsbyjs/gatsby/tree/master/packages/gatsby-worker#readme",
  "keywords": [
    "gatsby",
    "worker"
  ],
  "license": "MIT",
  "main": "dist/index.js",
  "repository": {
    "type": "git",
    "url": "https://github.com/gatsbyjs/gatsby.git",
    "directory": "packages/gatsby-worker"
  },
  "scripts": {
    "build": "babel src --out-dir dist/ --ignore \"**/__tests__\" --extensions \".ts,.js\"",
    "prepare": "cross-env NODE_ENV=production npm run build && npm run typegen",
    "watch": "babel -w src --out-dir dist/ --ignore \"**/__tests__\" --extensions \".ts,.js\"",
    "typegen": "rimraf \"dist/**/*.d.ts\" && tsc --emitDeclarationOnly --declaration --declarationDir dist/"
  },
  "engines": {
    "node": ">=12.13.0"
  }
}
34 changes: 34 additions & 0 deletions packages/gatsby-worker/src/__tests__/fixtures/test-child.ts
@@ -0,0 +1,34 @@
export function sync(a: string, opts?: { addWorkerId?: boolean }): string {
  return `foo ${a}${opts?.addWorkerId ? ` (worker #${process.env.GATSBY_WORKER_ID})` : ``}`
}

export async function async(a: string, opts?: { addWorkerId?: boolean }): Promise<string> {
  return `foo ${a}${opts?.addWorkerId ? ` (worker #${process.env.GATSBY_WORKER_ID})` : ``}`
}

export function neverEnding(): Promise<string> {
  return new Promise<string>(() => {})
}

export const notAFunction = `string`

export function syncThrow(a: string, opts?: { addWorkerId?: boolean, throwOnWorker?: number }): string {
  if (!opts?.throwOnWorker || opts?.throwOnWorker?.toString() === process.env.GATSBY_WORKER_ID) {
    throw new Error(`sync throw${opts?.addWorkerId ? ` (worker #${process.env.GATSBY_WORKER_ID})` : ``}`)
  }

  return `foo ${a}${opts?.addWorkerId ? ` (worker #${process.env.GATSBY_WORKER_ID})` : ``}`
}

export async function asyncThrow(a: string, opts?: { addWorkerId?: boolean, throwOnWorker?: number }): Promise<string> {
  if (!opts?.throwOnWorker || opts?.throwOnWorker?.toString() === process.env.GATSBY_WORKER_ID) {
    throw new Error(`async throw${opts?.addWorkerId ? ` (worker #${process.env.GATSBY_WORKER_ID})` : ``}`)
  }

  return `foo ${a}${opts?.addWorkerId ? ` (worker #${process.env.GATSBY_WORKER_ID})` : ``}`
}

// used in task queue tests, as the previous functions would often finish too quickly
export async function async100ms(taskId: number, opts?: { addWorkerId?: boolean }): Promise<{taskId: number, workerId: string}> {
  return new Promise(resolve => setTimeout(resolve, 100, {taskId, workerId: opts?.addWorkerId ? process.env.GATSBY_WORKER_ID : undefined}))
}
6 changes: 6 additions & 0 deletions packages/gatsby-worker/src/__tests__/fixtures/ts-register.js
@@ -0,0 +1,6 @@
// spawned process won't use jest config to support TS, so we need to add support ourselves
require(`@babel/register`)({
  extensions: [`.js`, `.ts`],
  configFile: require.resolve(`../../../.babelrc`),
  ignore: [/node_modules/],
})