Add pooling and request limiting to prevent memory leaks #110

Closed
58 commits
6c12e78
Add pooling and request limiting to prevent memory leaks
seanmorris Sep 14, 2023
e968403
Adding Pool class.
seanmorris Sep 14, 2023
b515f13
Reverting change.
seanmorris Sep 14, 2023
5c7b52d
Removing console.log
seanmorris Sep 14, 2023
a33b0c5
Max request options.
seanmorris Sep 14, 2023
f8326cd
Tweaks.
seanmorris Sep 14, 2023
cb383fe
Tweaks.
seanmorris Sep 14, 2023
ac4ea1b
Initialize all instances
seanmorris Sep 14, 2023
0b9044f
Tweak.
seanmorris Sep 14, 2023
fc0f995
Tweak.
seanmorris Sep 14, 2023
e18b1fd
Consolidating FINALLY block.
seanmorris Sep 15, 2023
0ffaad7
Cleaner pooling algorithm.
seanmorris Sep 15, 2023
3d26067
Comments
seanmorris Sep 15, 2023
909d08a
Comments.
seanmorris Sep 15, 2023
1c91f34
Comments.
seanmorris Sep 15, 2023
a86c4c9
Catch spawn failures
seanmorris Sep 15, 2023
4b634bb
Making private methods privater.
seanmorris Sep 15, 2023
3f7b796
Tweaks.
seanmorris Sep 15, 2023
db3c5d6
Tweaks.
seanmorris Sep 15, 2023
68f6290
Correcting CLI switches, killing instances when done.
seanmorris Sep 16, 2023
9167e77
Detect error in test.
seanmorris Sep 16, 2023
70c3a22
Correcting multi-await
seanmorris Sep 19, 2023
aebc3d5
Correcting multi-await
seanmorris Sep 19, 2023
96a2e82
Scaling back default maxRequests
seanmorris Sep 27, 2023
0cb98b7
Pull request tweaks.
seanmorris Oct 3, 2023
a73874f
Doc Comments.
seanmorris Oct 5, 2023
326b410
Doc Comments.
seanmorris Oct 5, 2023
966a9df
Testing tweaks.
seanmorris Oct 5, 2023
488ad3c
Testing tweaks.
seanmorris Oct 5, 2023
a7708af
Testing tweaks.
seanmorris Oct 5, 2023
e01d6d4
All tests pass.
seanmorris Oct 5, 2023
eca5995
Passing linter
seanmorris Oct 5, 2023
3d8a1c2
Passing linter
seanmorris Oct 5, 2023
271c935
Passing linter
seanmorris Oct 5, 2023
a7581dc
Temporarily skip typecheck
seanmorris Oct 5, 2023
8d968d7
Revert temp change.
seanmorris Oct 5, 2023
04a3e5e
Send 500 error to browser instead of re-queueing request.
seanmorris Oct 6, 2023
65c28e2
Correcting default fatal handler
seanmorris Oct 6, 2023
a1b809e
Abstracting node pools
seanmorris Oct 6, 2023
102adbc
Tweaks.
seanmorris Oct 6, 2023
954d33f
Tweaks
seanmorris Oct 6, 2023
6bc105d
Tweaks
seanmorris Oct 6, 2023
b3c6a08
Tweaks
seanmorris Oct 6, 2023
be0c0cb
Merge branch 'trunk' into sm-preventing-memory-leaks
seanmorris Nov 22, 2023
fabcafa
Bumping versions.
seanmorris Dec 4, 2023
56484b5
Non-working copy operation
seanmorris Dec 4, 2023
c59bea8
Revering extra change.
seanmorris Dec 6, 2023
c133a1e
PR comments.
seanmorris Dec 7, 2023
a657040
PR comments.
seanmorris Dec 8, 2023
75bd3eb
Separating debug flags into its own PR
seanmorris Dec 8, 2023
c3b2269
Incrementing verion numbers
seanmorris Dec 21, 2023
1f9f696
Formatting.
seanmorris Dec 21, 2023
f352bb8
Lint & Tests.
seanmorris Dec 21, 2023
0e6c8ba
Merge conflict.
seanmorris Dec 27, 2023
0ad0b9d
PR comment tweaks.
seanmorris Dec 27, 2023
5d18b81
Restoring codemirror dependency
seanmorris Dec 27, 2023
b09b13b
Revering extra change
seanmorris Dec 27, 2023
0bd1222
PR comment tweaks.
seanmorris Dec 27, 2023
@@ -7,6 +7,7 @@ const dirname = __dirname;

export default async function runExecutor(options: BuiltScriptExecutorSchema) {
const args = [
...(options.debug ? ['--inspect-brk'] : []),
'--loader',
join(dirname, 'loader.mjs'),
options.scriptPath,
@@ -1,4 +1,5 @@
export interface BuiltScriptExecutorSchema {
scriptPath: string;
debug: boolean;
__unparsed__: string;
} // eslint-disable-line
5 changes: 5 additions & 0 deletions packages/nx-extensions/src/executors/built-script/schema.json
@@ -10,6 +10,11 @@
"description": "Path of the script to run.",
"x-prompt": "What script would you like to run?"
},
"debug": {
"type": "boolean",
"description": "Use devtools as a debugger.",
"x-prompt": "Would you like to use devtools?"
},
"__unparsed__": {
"hidden": true,
"type": "array",
4 changes: 4 additions & 0 deletions packages/wp-now/src/config.ts
@@ -20,6 +20,7 @@ export interface CliOptions {
port?: number;
blueprint?: string;
reset?: boolean;
maxRequests?: number;
}

export const enum WPNowMode {
@@ -43,6 +44,7 @@ export interface WPNowOptions {
wpContentPath?: string;
wordPressVersion?: string;
numberOfPhpInstances?: number;
maxRequests?: number;
blueprintObject?: Blueprint;
reset?: boolean;
}
@@ -54,6 +56,7 @@ export const DEFAULT_OPTIONS: WPNowOptions = {
projectPath: process.cwd(),
mode: WPNowMode.AUTO,
numberOfPhpInstances: 1,
maxRequests: 512,
reset: false,
};

@@ -111,6 +114,7 @@ export default async function getWpNowConfig(
phpVersion: args.php as SupportedPHPVersion,
projectPath: args.path as string,
wordPressVersion: args.wp as string,
maxRequests: args.maxRequests as number,
port,
reset: args.reset as boolean,
};
180 changes: 180 additions & 0 deletions packages/wp-now/src/pool.ts
@@ -0,0 +1,180 @@
import { NodePHP } from '@php-wasm/node';

const Fatal = Symbol('Fatal');
const Spawn = Symbol('Spawn');
const Reap = Symbol('Reap');

let childCount = 0;

export class PoolInfo {
id = childCount++;
requests = 0;
started = Date.now();
active = false;
}

/**
* Maintains and refreshes a list of php instances
* such that each one will only be fed X number of requests
* before being discarded and replaced.
Comment on lines +109 to +111
Collaborator

@adamziel, Oct 2, 2023

Do you have any insights about the root cause of the leak? I don't think php-fpm needs to do the same kind of pooling and discarding, so it must be something on our side. An Emscripten issue, perhaps?

Contributor Author

@seanmorris, Oct 2, 2023

https://www.php.net/manual/en/install.fpm.configuration.php#pm.max-requests

This pathological behavior does exist in native PHP, but it won't cause problems nearly as quickly since it's not restricted to a single gigabyte of memory.

Emscripten definitely introduces unique challenges. Since we're dealing with a linear, "physical" memory array, as opposed to the virtual memory system afforded by most modern OSes, we're prone to things like memory fragmentation. In that situation, we could have the entire gigabyte empty except for a few sparse allocations. If no contiguous region of memory exists for the length requested, memory allocations will fail. This tends to happen when a new request attempts to initialize a heap structure but cannot find a contiguous 2 MB chunk of memory.

We can go as far as debugging PHP itself, and contributing the fix upstream. But even in this case we cannot guarantee that a third party extension will not introduce a leak sometime in the future.

Therefore, we should have a solution robust to memory leaks that come from upstream code. I think that following the native strategy is the best way.
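
To make the fragmentation point concrete, here is a toy sketch of a first-fit allocator over a fixed linear array. It is not Emscripten's actual allocator, and all names (`LinearHeap`, `alloc`, `free`, `freeUnits`) are hypothetical, chosen only for illustration. It shows how a request can fail even though the total free space would be more than enough.

```typescript
// Toy first-fit allocator over a fixed, linear "memory" of `size` units.
// Hypothetical illustration only; this is not how Emscripten's malloc works.
class LinearHeap {
  private used: boolean[];

  constructor(size: number) {
    this.used = new Array<boolean>(size).fill(false);
  }

  // Returns the start offset of a contiguous free run of `length` units, or -1.
  alloc(length: number): number {
    let runStart = 0;
    let runLength = 0;
    for (let i = 0; i < this.used.length; i++) {
      if (this.used[i]) {
        runStart = i + 1;
        runLength = 0;
        continue;
      }
      runLength++;
      if (runLength === length) {
        for (let j = runStart; j <= i; j++) this.used[j] = true;
        return runStart;
      }
    }
    return -1;
  }

  free(start: number, length: number): void {
    for (let i = start; i < start + length; i++) this.used[i] = false;
  }

  freeUnits(): number {
    return this.used.filter((unit) => !unit).length;
  }
}

const heap = new LinearHeap(16);

// Fill the heap with eight 2-unit blocks, then free every other block.
const blockStarts: number[] = [];
for (let i = 0; i < 8; i++) blockStarts.push(heap.alloc(2));
blockStarts.forEach((start, i) => {
  if (i % 2 === 0) heap.free(start, 2);
});

// Half of the heap is free, but no free run is longer than 2 units,
// so a 4-unit request cannot be satisfied.
const totalFree = heap.freeUnits();
const bigAlloc = heap.alloc(4);
```

Discarding the whole instance, as proposed in this PR, sidesteps the problem because the replacement starts with an unfragmented heap.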

Collaborator

Oh of course, third party extensions! Sean, that's such a good answer, thank you. I'd love to see your GitHub comment as a part of this PR in an actual comment block – it clearly explains the rationale.

We can go as far as debugging PHP itself, and contributing the fix upstream. But even in this case we cannot guarantee that a third party extension will not introduce a leak sometime in the future.

Agreed, let's just stick with discarding PHP instances here.
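
As a rough illustration of the php-fpm `pm.max_requests` strategy agreed on here, the sketch below wraps a worker and replaces it after a fixed number of requests. `PhpLike`, `RecyclingWorker`, and `createWorker` are hypothetical names for this example only; they are not part of the PR or of `@php-wasm/node`.

```typescript
// Hypothetical stand-in for a PHP instance; not the NodePHP API.
interface PhpLike {
  id: number;
  handle(input: string): string;
}

let nextWorkerId = 0;
const createWorker = (): PhpLike => {
  const id = nextWorkerId++;
  return { id, handle: (input) => `worker ${id} handled ${input}` };
};

// Discards the current worker once it has served `maxRequests` requests,
// so whatever memory it leaked or fragmented is released along with it.
class RecyclingWorker {
  private spawn: () => PhpLike;
  private maxRequests: number;
  private worker: PhpLike;
  private served = 0;

  constructor(spawn: () => PhpLike, maxRequests: number) {
    this.spawn = spawn;
    this.maxRequests = maxRequests;
    this.worker = spawn();
  }

  handle(input: string): string {
    // A maxRequests of 0 or less disables recycling, as in the PR's Reap check.
    if (this.maxRequests > 0 && this.served >= this.maxRequests) {
      this.worker = this.spawn();
      this.served = 0;
    }
    this.served++;
    return this.worker.handle(input);
  }
}

const recycling = new RecyclingWorker(createWorker, 2);
const outputs = ['a', 'b', 'c'].map((input) => recycling.handle(input));
// The first two requests go to worker 0; the third triggers a replacement.
```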

Contributor Author

@seanmorris, Oct 5, 2023

Contributor Author

/**
* Maintains and refreshes a list of php instances
* such that each one will only be fed X number of requests
* before being discarded and replaced.
*
* Since we're dealing with a linear, "physical" memory array, as opposed to a
* virtual memory system afforded by most modern OSes, we're prone to things
* like memory fragmentation. In that situation, we could have the entire
* gigabyte empty except for a few sparse allocations. If no contiguous region
* of memory exists for the length requested, memory allocations will fail.
* This tends to happen when a new request attempts to initialize a heap
* structure but cannot find a contiguous 2 MB chunk of memory.
*
* We can go as far as debugging PHP itself, and contributing the fix upstream.
* But even in this case we cannot guarantee that a third party extension will
* not introduce a leak sometime in the future. Therefore, we should have a
* solution robust to memory leaks that come from upstream code. I think that
* following the native strategy is the best way.
*
* https://www.php.net/manual/en/install.fpm.configuration.php#pm.max-requests
*
*/

*/
export class Pool {
instances = new Map(); // php => PoolInfo

spawner: () => Promise<any>; // Callback to create new instances.
maxRequests: number; // Max requests to feed each instance
maxJobs: number; // Max number of instances to maintain at once.

notifiers = new Map(); // Inverted promises to notify async code of backlogged item processed.
running = new Set(); // Set of busy PHP instances.
backlog = []; // Set of request callbacks waiting to be run.
Member

Same note here on the comments. These are nice to have, but we're so close to JSDoc, and if we switch, the comments will follow the variables around the code in an IDE.
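
One possible shape for the JSDoc suggestion, sketched on a stripped-down class. `PoolSketch` is a hypothetical name, and the field types are guesses based on the inline comments in the diff, not the final code.

```typescript
// Simplified stand-in for the Pool fields shown in the diff above.
class PoolSketch {
  /** Callback that creates a new instance. */
  spawner: () => Promise<unknown>;

  /** Maximum number of requests to feed each instance before replacing it. */
  maxRequests: number;

  /** Maximum number of instances to maintain at once. */
  maxJobs: number;

  /** Instances that are currently busy handling a request. */
  running = new Set<unknown>();

  /** Request callbacks waiting for an idle instance. */
  backlog: Array<(instance: unknown) => Promise<unknown>> = [];

  constructor(
    spawner: () => Promise<unknown>,
    maxRequests = 2000,
    maxJobs = 5
  ) {
    this.spawner = spawner;
    this.maxRequests = maxRequests;
    this.maxJobs = maxJobs;
  }
}

const sketch = new PoolSketch(async () => ({}));
```

With `/** ... */` blocks, editors that understand JSDoc can show the description wherever `sketch.maxRequests` or `sketch.backlog` is referenced, which trailing `//` comments generally do not provide.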


constructor({
spawner = async (): Promise<any> => {},
maxRequests = 2000,
maxJobs = 5,
} = {}) {
Object.assign(this, { spawner, maxRequests, maxJobs });
Member

This is a style note, but I think we would generally prefer to set these individually, e.g. this.spawner = spawner.

Did you see this pattern used elsewhere in this codebase?

Contributor Author

That's just something I use as shorthand for multiple assignments.

Contributor Author

@seanmorris, Nov 27, 2023

@dmsnell This has been refactored:

Object.defineProperties(this, {
maxRequests: { value: maxRequests },
maxJobs: { value: maxJobs },
spawn: { value: spawn },
fatal: { value: fatal },
reap: { value: reap },
});
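
One consequence of the `Object.defineProperties` form that is easy to miss: a descriptor that specifies only `value` defaults to non-writable, non-enumerable, and non-configurable, whereas `Object.assign` creates ordinary properties. The sketch below compares the two on plain objects; `viaAssign` and `viaDefine` are hypothetical names, and the values are only examples.

```typescript
const viaAssign: Record<string, number> = {};
Object.assign(viaAssign, { maxRequests: 512, maxJobs: 5 });

const viaDefine: Record<string, number> = {};
Object.defineProperties(viaDefine, {
  maxRequests: { value: 512 },
  maxJobs: { value: 5 },
});

const assigned = Object.getOwnPropertyDescriptor(viaAssign, 'maxRequests');
const defined = Object.getOwnPropertyDescriptor(viaDefine, 'maxRequests');

// Object.assign creates writable, enumerable properties.
// A descriptor with only `value` is read-only and hidden from Object.keys().
const assignedKeys = Object.keys(viaAssign);
const definedKeys = Object.keys(viaDefine);
```

So the refactored constructor effectively makes these settings read-only after construction, which may or may not be the intent.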

this[Reap]();
this[Spawn]();
Member

What's the purpose of using these symbols instead of simply using reap, spawn, and fatal as names?

Contributor Author

That's just a pattern I use for private methods.
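
For context on this exchange: symbol-keyed methods are hidden from string-keyed property listings but remain reachable through reflection, while TypeScript's `private` modifier is enforced only at compile time. A small sketch, using hypothetical class names (`SymbolPool`, `KeywordPool`):

```typescript
const Reap = Symbol('Reap');

// "Private by symbol": callers need a reference to the Reap symbol.
class SymbolPool {
  reaped = 0;

  [Reap](): void {
    this.reaped++;
  }

  tick(): void {
    this[Reap]();
  }
}

// "Private by keyword": checked by the compiler, erased at runtime.
class KeywordPool {
  reaped = 0;

  private reap(): void {
    this.reaped++;
  }

  tick(): void {
    this.reap();
  }
}

const symbolPool = new SymbolPool();
symbolPool.tick();

const keywordPool = new KeywordPool();
keywordPool.tick();

// The symbol-keyed method is absent from the string-keyed names...
const stringKeys = Object.getOwnPropertyNames(SymbolPool.prototype);
// ...but it can still be discovered via reflection.
const symbolKeys = Object.getOwnPropertySymbols(SymbolPool.prototype);
```

Neither approach is truly private at runtime; both mainly signal intent to readers of the code.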

Contributor Author

@seanmorris, Nov 27, 2023

@dmsnell This has been refactored:

const spawn = async (pool: Pool) => {
const newInstances = new Set();
if (pool.maxJobs <= 0) return newInstances;
while (pool.instanceInfo.size < pool.maxJobs) {
const info = new PoolInfo();
const instance = await pool.spawn();
pool.instanceInfo.set(instance, info);
info.active = true;
newInstances.add(instance);
}
return newInstances;
};
/**
* Reaps children if they've passed the maxRequest count.
* @param pool the pool object to work on
* @private
*/
const reap = (pool: Pool) => {
for (const [instance, info] of pool.instanceInfo) {
if (pool.maxRequests > 0 && info.requests >= pool.maxRequests) {
info.active = false;
pool.instanceInfo.delete(instance);
pool.reap(instance);
continue;
}
}
};
/**
* Handle fatal errors gracefully.
* @param pool the pool object to work on
* @param instance the php instance to clean up
* @param error the actual error that got us here
* @private
*/
const fatal = (pool: Pool, instance: PhpInstance, error: Error) => {
console.error(error);
if (instance && pool.instanceInfo.has(instance)) {
const info = pool.instanceInfo.get(instance);
info.active = false;
pool.instanceInfo.delete(instance);
}
return pool.fatal(instance, error);
};

}

/**
* Find the next available idle instance.
*/
getIdleInstance() {
const sorted = [...this.instances].sort(
(a, b) => a[1].requests - b[1].requests
);

for (const [instance, info] of sorted) {
if (this.running.has(instance)) {
continue;
}

if (!info.active) {
continue;
}
return instance;
}

return false;
}

/**
* Queue up a callback that will make a request when an
* instance becomes idle.
*/
async enqueue(item: (php: NodePHP) => Promise<any>) {
const idleInstance = this.getIdleInstance();

if (!idleInstance) {
// Defer the callback if we don't have an idle instance available.
this.backlog.push(item);

// Split a promise open so it can be accepted or
// rejected later when the item is processed.
const notifier = new Promise((accept, reject) =>
Collaborator

Since it's resolvers now, it seems natural to rename accept to resolve.

Collaborator

@adamziel, Oct 3, 2023

Oh, I see, it's also used in a few other places around this class. It would be lovely to use the same words to refer to the same concepts, e.g. notifier here is a Promise, but in line 205 a notifier is an {accept, reject} object. Since it comes from resolvers, the name in line 205 could be resolver.

I know these wording changes may seem minor, but naming consistency goes a long way for a new person or when debugging a tricky issue. For example, this Gutenberg PR saved me a ton of hours down the road.
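
The "split a promise open" idea discussed in this thread is often called a deferred. Below is a minimal sketch of that pattern with the naming suggested above. `Deferred`, `createDeferred`, and the `job-1` key are hypothetical; the PR itself stores an `[accept, reject]` pair in `this.notifiers`.

```typescript
interface Deferred<T> {
  promise: Promise<T>;
  resolve: (value: T) => void;
  reject: (reason: unknown) => void;
}

function createDeferred<T>(): Deferred<T> {
  // The executor runs synchronously, so both callbacks are assigned
  // before the Promise constructor returns.
  let resolve!: (value: T) => void;
  let reject!: (reason: unknown) => void;
  const promise = new Promise<T>((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// The pool keeps the resolver for a backlogged item;
// the caller only ever sees the promise.
const resolvers = new Map<string, Deferred<number>>();
const deferred = createDeferred<number>();
resolvers.set('job-1', deferred);

// Later, once the backlogged job has been processed:
const resolver = resolvers.get('job-1');
resolvers.delete('job-1');
if (resolver) resolver.resolve(42);
```

Using one word for the promise and another for the object that settles it keeps the two roles distinct, which is the consistency the comment above asks for.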

Contributor Author

this.notifiers.set(item, [accept, reject])
);

return notifier;
} else {
// If we've got an instance available, run the provided callback.

// When the provided callback completes, check to see if
// any more requests have been added to the pool
const onCompleted = async () => {
this.running.delete(idleInstance);

this[Reap]();
this[Spawn]();

if (!this.backlog.length) {
return;
}

const idleInstanceNext = this.getIdleInstance();

const next = this.backlog.shift();
const info = this.instances.get(idleInstanceNext);
Member

Couldn't idleInstanceNext here be false too?

Contributor Author

While this function is kicked off by the resolution of a previous promise that SHOULD free up the instance it's using, I could see how it would be possible for a FALSE to come in here if we get a request at JUST the wrong millisecond.

This is a little complicated because we're actually managing TWO pools here, one of backlogged requests and one of idle/running PHP instances, but I think I can come up with something robust to that case.
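
One way to be robust to that case is to treat both pools symmetrically: a job leaves the backlog only when an idle instance is actually available. The sketch below is a simplified illustration under that assumption, not the PR's implementation. `TinyPool`, `Job`, and `Pending` are hypothetical names, and request counting and instance recycling are omitted.

```typescript
type Job<I, R> = (instance: I) => Promise<R>;

interface Pending<I> {
  job: Job<I, unknown>;
  resolve: (value: unknown) => void;
  reject: (reason: unknown) => void;
}

class TinyPool<I> {
  private idle: I[];
  private backlog: Pending<I>[] = [];

  constructor(instances: I[]) {
    this.idle = instances.slice();
  }

  // Every job goes through the backlog, so there is a single code path.
  enqueue<R>(job: Job<I, R>): Promise<R> {
    const promise = new Promise<R>((resolve, reject) => {
      this.backlog.push({
        job,
        resolve: resolve as (value: unknown) => void,
        reject,
      });
    });
    this.drain();
    return promise;
  }

  // Pairs waiting jobs with idle instances. The loop condition checks both
  // queues, so a job is never shifted off the backlog without an instance.
  private drain(): void {
    while (this.idle.length > 0 && this.backlog.length > 0) {
      const instance = this.idle.shift() as I;
      const pending = this.backlog.shift() as Pending<I>;
      Promise.resolve()
        .then(() => pending.job(instance))
        .then(pending.resolve, pending.reject)
        .then(() => {
          // Return the instance and see whether more work is waiting.
          this.idle.push(instance);
          this.drain();
        });
    }
  }
}

// Three jobs compete for a single instance and run one after another.
const tinyPool = new TinyPool<string>(['php-0']);
const order: string[] = [];
const jobs = [1, 2, 3].map((n) =>
  tinyPool.enqueue(async (instance) => {
    order.push(`${instance}:${n}`);
    return n * 2;
  })
);
const allResults = Promise.all(jobs);
```

Because `drain()` is the only place where instances and jobs are matched, there is no window in which a job has been dequeued but no instance is available.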


info.requests++;

const request = next(await idleInstanceNext);

request.finally(onCompleted);

request.then((ret) => {
const notifier = this.notifiers.get(next);
this.notifiers.delete(next);
notifier[0](ret);
});

request.catch((error) => {
const notifier = this.notifiers.get(next);
this.notifiers.delete(next);
notifier[1](error);
this[Fatal](idleInstanceNext, error);
});

this.running.add(idleInstance);
};

const info = this.instances.get(idleInstance);

info.requests++;

this.running.add(idleInstance);

const request = item(await idleInstance);

request.catch((error) => this[Fatal](idleInstance, error));

// Make sure onComplete runs no matter how the request resolves
request.finally(onCompleted);

return request;
}
}

/**
* PRIVATE
* Spawns new instances if the pool is not full.
*/
[Spawn]() {
while (this.maxJobs > 0 && this.instances.size < this.maxJobs) {
const info = new PoolInfo();
const instance = this.spawner();
this.instances.set(instance, info);
info.active = true;
}
}

/**
* PRIVATE
* Reaps children if they've passed the maxRequest count.
*/
[Reap]() {
for (const [instance, info] of this.instances) {
if (this.maxRequests > 0 && info.requests >= this.maxRequests) {
info.active = false;
this.instances.delete(instance);
continue;
}
}
}

/**
* PRIVATE
* Handle fatal errors gracefully.
*/
[Fatal](instance, error) {
console.error(error);
if (this.instances.has(instance)) {
const info = this.instances.get(instance);
info.active = false;
this.instances.delete(instance);
}
}
}
6 changes: 6 additions & 0 deletions packages/wp-now/src/run-cli.ts
@@ -76,6 +76,11 @@ export async function runCli() {
'Create a new project environment, destroying the old project environment.',
type: 'boolean',
});
yargs.option('maxRequests', {
describe:
'Max number of requests before refreshing PHP instance.',
type: 'number',
});
},
async (argv) => {
const spinner = startSpinner('Starting the server...');
@@ -87,6 +92,7 @@ export async function runCli() {
port: argv.port as number,
blueprint: argv.blueprint as string,
reset: argv.reset as boolean,
maxRequests: argv.maxRequests as number,
});
portFinder.setPort(options.port as number);
const { url } = await startServer(options);
14 changes: 9 additions & 5 deletions packages/wp-now/src/start-server.ts
@@ -62,7 +62,7 @@ export async function startServer(
app.use(compression({ filter: shouldCompress }));
app.use(addTrailingSlash('/wp-admin'));
const port = await portFinder.getOpenPort();
- const { php, options: wpNowOptions } = await startWPNow(options);
+ const { php, options: wpNowOptions, pool } = await startWPNow(options);

app.use('/', async (req, res) => {
try {
@@ -103,11 +103,15 @@ export async function startServer(
),
body: body as string,
};
- const resp = await php.request(data);
+
+ const resp = await pool.enqueue((php) => php.request(data));
+
res.statusCode = resp.httpStatusCode;
- Object.keys(resp.headers).forEach((key) => {
- res.setHeader(key, resp.headers[key]);
- });
+
+ Object.keys(resp.headers).forEach((key) =>
+ res.setHeader(key, resp.headers[key])
+ );
+
res.end(resp.bytes);
} catch (e) {
output?.trace(e);