Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

engine: workerpool shares module scope #410

Closed
josephjclark opened this issue Oct 26, 2023 · 5 comments
Closed

engine: workerpool shares module scope #410

josephjclark opened this issue Oct 26, 2023 · 5 comments
Assignees

Comments

@josephjclark
Copy link
Collaborator

josephjclark commented Oct 26, 2023

The problem:

See #404 (comment) and #404 (comment) (Maybe I'll write this up again later)

We need to ensure that each run executes with a clean module scope, ie, any intialisation code is re-run before the workflow starts.

Possible list of solutions:

  • Append a UUID to each module import to spoof the module loader. The compiler needs to generate the correct import module name, the adaptorPaths passed to the runtime need to contain the right mapping. I think the repo needs to alias the module name too. This won't work if node is smarter than us.
  • Find some way to unload the module in node.js itself
  • Destroy each worker thread after its been used (or, more drastically, don't user workerpool at all but create a new thread each time - it's fairly quick!). Workerpool has a terminate thread API.
  • Find some way to alias or spoof the import in the linker. This seems unlikely.
@josephjclark
Copy link
Collaborator Author

josephjclark commented Oct 26, 2023

Does burning the thread even help?

yes: each thread has its own module context, so if we can burn the thread, we'll get a clean new module in the next one.

@josephjclark josephjclark self-assigned this Oct 26, 2023
@josephjclark
Copy link
Collaborator Author

Calling workerpool.terminate() will a) kill all active threads (but it'll wait for them to return so this is fine) and b) cancel all pending threads.

This second bit is a real problem.

@josephjclark
Copy link
Collaborator Author

Working around the cache with a UUID might not be too hard - if we dynamically import with a query, id `import('common?cachebuster=abc'). I think this bypasses node's module cache but will still resolve to the correct file.

Difficulty: I think this will leak memory because the old module won't be garbage collected. That again tells me we should burn threads regularly.

Ok, so what if the worker has its own sort of GC and, on an interval, says "if there are no pending tasks, terminate all threads". That means we can cache bust but also free up memory and never lose a pending job.

@josephjclark
Copy link
Collaborator Author

Really interesting thread about unloading: nodejs/tooling#51

Using our own loader which doens't cache and re-evaluates the module is a good solution. It won't reclaim memory. Adding a cache busting URL is gonna be way easier.

I think the previous comment is the winner here.

@josephjclark
Copy link
Collaborator Author

Ok: With a cache busting import we are secure and safe.

We need to look into #411 super urgently and #412 before release - but the immediate short term issue is resolved

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant