engine: replace workerpool #412

josephjclark · 2023-10-27T10:38:41Z

Workerpool is not going to fly for us because:

- We don't have two-way communication with the worker, so we cannot lazy-load credentials and dataclips.
- We have no control over purging specific threads to clean up memory leaks or isolate modules.
- A future worry: we have no way to re-use a specific thread, for example re-running a workflow in the same thread to reduce module initialisation time. If we can safely isolate modules (even in the sandbox), a warm thread/environment is a great optimisation.
- Running two workflows in the same pooled thread is insecure (and may be unstable, depending on module initialisation).
- If a worker blows up, the parent thread (the whole engine!) also dies.
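To make the first point concrete, here is a minimal sketch of the request/reply messages that two-way communication would need for lazy loading. Every name here (`LazyRequest`, `LazyReply`, `LazyStore`, `handleLazyRequest`) is hypothetical and not taken from the engine; it only illustrates the parent answering a worker's request on demand.

```typescript
// Hypothetical message shapes; none of these names come from the engine itself.
type LazyRequest =
  | { type: "resolve-credential"; requestId: number; credentialId: string }
  | { type: "resolve-dataclip"; requestId: number; dataclipId: string };

type LazyReply =
  | { type: "resolved"; requestId: number; value: unknown }
  | { type: "rejected"; requestId: number; reason: string };

// Stand-in for wherever the parent really fetches credentials and dataclips.
interface LazyStore {
  credentials: Map<string, unknown>;
  dataclips: Map<string, unknown>;
}

// Parent side: turn one request from a worker into one reply.
// With a forked child process this would sit inside child.on("message", ...)
// and the returned reply would be passed to child.send(...).
function handleLazyRequest(request: LazyRequest, store: LazyStore): LazyReply {
  let found: boolean;
  let value: unknown;
  if (request.type === "resolve-credential") {
    found = store.credentials.has(request.credentialId);
    value = store.credentials.get(request.credentialId);
  } else {
    found = store.dataclips.has(request.dataclipId);
    value = store.dataclips.get(request.dataclipId);
  }
  // requestId lets the worker match the reply to the request it is waiting on.
  return found
    ? { type: "resolved", requestId: request.requestId, value }
    : { type: "rejected", requestId: request.requestId, reason: "not found" };
}
```

The worker would send a request only when a step actually needs the credential or dataclip, which is what the current setup cannot do.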

We need to investigate threads, piscina, or other libraries, so that we get the control we need.

We need to consider performance, stability and security. I'd like to have some kind of benchmark for this (a benchmark integration test would be nice - run the same 10 workflows 50 times and measure the total time).
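A rough sketch of what such a benchmark helper could look like. The `RunWorkflow` signature and the `benchmark` function are assumptions for illustration, not existing engine APIs; the clock is injectable only so the helper can be tested deterministically.

```typescript
// Hypothetical runner: executes one workflow and resolves when it finishes.
type RunWorkflow = (workflowId: string) => Promise<void>;

interface BenchmarkResult {
  runs: number; // total number of workflow executions
  totalMs: number; // wall-clock time for the whole benchmark
  meanMs: number; // totalMs divided by runs
}

async function benchmark(
  workflowIds: string[],
  repeats: number,
  run: RunWorkflow,
  now: () => number = () => Date.now()
): Promise<BenchmarkResult> {
  const start = now();
  let runs = 0;
  for (let i = 0; i < repeats; i++) {
    // Start every workflow in the batch together so the pool is under load,
    // then wait for the whole batch before starting the next repeat.
    await Promise.all(workflowIds.map((id) => run(id)));
    runs += workflowIds.length;
  }
  const totalMs = now() - start;
  return { runs, totalMs, meanMs: runs === 0 ? 0 : totalMs / runs };
}
```

For the case described above, `workflowIds` would hold the 10 workflows and `repeats` would be 50, with `run` delegating to whichever pool implementation is being measured.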

josephjclark · 2023-11-28T14:51:19Z

Some performance stats with workerpool:

| Workers | # attempts | duration | max job memory (MB) | max system memory (MB) |
| --- | --- | --- | --- | --- |
| 50 | 100 | 4.555 | 23 | 1612 |
| 20 | 100 | 3.79 | 22 | 979 |
| 5 (default) | 100 | 3.295 | 23 | 452 |
| 1 | 100 | 41.377 | 19 | 362 |

This is based on 0336d31

josephjclark · 2023-12-04T09:29:18Z

It's becoming increasingly apparent that neither workerpool nor plain worker threads will work for us. Neither is secure or stable enough.

I've re-estimated (probably naively) based on this plan:

1. Add a quick-and-dirty "child_process" pool implementation.
2. Compare benchmarks of a child-process pool vs a worker pool.
3. Unless the benchmarks are really, really good (and I doubt it!), implement a pool of long-running child processes which spin off worker threads for each attempt. This balances isolation, stability and performance.
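As a sketch of the bookkeeping such a pool needs, assuming a fixed capacity and a caller-supplied factory: the `ProcessPool` class and its method names below are hypothetical, and the worker-thread side inside each child process is not shown.

```typescript
// Anything with a kill() method fits, including a ChildProcess returned by fork().
interface PooledProcess {
  kill(): unknown;
}

class ProcessPool<P extends PooledProcess> {
  private idle: P[] = [];
  private busy = new Set<P>();
  private readonly create: () => P;
  private readonly capacity: number;

  constructor(create: () => P, capacity: number) {
    this.create = create;
    this.capacity = capacity;
  }

  // Number of live processes, idle or busy.
  get size(): number {
    return this.idle.length + this.busy.size;
  }

  // Hand out a warm idle process, or start a new one if there is room.
  // Returns undefined when the pool is full; a real pool would queue instead.
  acquire(): P | undefined {
    const proc =
      this.idle.pop() ?? (this.size < this.capacity ? this.create() : undefined);
    if (proc) this.busy.add(proc);
    return proc;
  }

  // Return a healthy process so it stays warm for the next attempt.
  release(proc: P): void {
    if (this.busy.delete(proc)) this.idle.push(proc);
  }

  // Purge one specific process, e.g. after a crash or a suspected memory leak.
  destroy(proc: P): void {
    this.busy.delete(proc);
    this.idle = this.idle.filter((p) => p !== proc);
    proc.kill();
  }
}
```

In real use the factory could be something like `() => fork(scriptPath)` with `fork` from `node:child_process`, where `scriptPath` is a placeholder for the child entry point. Because each process is tracked individually, a crashed or leaky one can be destroyed without taking down the parent.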

josephjclark · 2023-12-05T14:29:25Z

So the plan is:

A quick implementation of a fire-and-destroy child process runner in the engine, with benchmarks
We expect that to run a bit slowly so we're likely to go straight ahead and implement this model:

josephjclark added priority labels Oct 27, 2023

This was referenced Oct 27, 2023

engine: workerpool shares module scope #410

Closed

runtime-manager: lock down pooled threads #120

Closed

josephjclark removed the priority label Nov 2, 2023

josephjclark removed the rtm (tmp) label Nov 21, 2023

josephjclark closed this as completed Jan 25, 2024

josephjclark reopened this Jan 25, 2024

josephjclark mentioned this issue Jan 25, 2024

Fancy new engine #547

Merged

7 tasks

josephjclark closed this as completed Jan 25, 2024
