This issue was moved to a discussion.

Feature Request: VM/Workers #10005

Closed
CoryGH opened this issue Apr 4, 2021 · 20 comments

Comments

@CoryGH

CoryGH commented Apr 4, 2021

A great feature to have would be a VM/Worker Thread which allows for finer control over threads (e.g. start/stop plus pause/resume, rate throttling, etc.) along with an event engine attached to a VM/Worker state object. This could be achieved with external libraries in some ways, but it would be far more efficient inside the actual Deno threading engine. That would allow for things like server-side AWS Lambda-style processors without having to do something absurd like extracting an AST from a piece of code, modifying it to wrap blocks in code that allows for hooks, and then recompiling to JavaScript before injection into a worker.

@kitsonk
Contributor

kitsonk commented Apr 4, 2021

Deno already supports Web Workers. It can support data URLs for the workers, and there is a feature request for blob URLs (#9210). Outside of that, there isn't anything actionable here.

Respectfully closing.
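
For reference, a minimal sketch of the data-URL worker mentioned in the comment above. It assumes a runtime with the standard `Worker` global (as Deno provides); the helper names `toWorkerDataUrl` and `doubleInWorker` are hypothetical and exist only for illustration.

```typescript
// Hypothetical helper: wrap a source string in a data URL that a module
// worker can load. Not part of any Deno API.
function toWorkerDataUrl(source: string): string {
  return `data:application/javascript,${encodeURIComponent(source)}`;
}

// Source of the worker: doubles whatever number it receives.
const workerSource = `
  self.onmessage = (e) => {
    self.postMessage(e.data * 2);
  };
`;

const workerUrl = toWorkerDataUrl(workerSource);

// Spawning is wrapped in a function so that nothing starts on import.
function doubleInWorker(n: number): Promise<number> {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerUrl, { type: "module" });
    worker.onmessage = (e: MessageEvent) => {
      resolve(e.data);
      worker.terminate(); // one-shot worker: stop it after the reply
    };
    worker.onerror = (e) => reject(e);
    worker.postMessage(n);
  });
}
```

Calling `doubleInWorker(21)` would spawn a fresh worker per call, which is exactly the per-request startup cost discussed later in this thread.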

@kitsonk kitsonk closed this as completed Apr 4, 2021
@CoryGH
Author

CoryGH commented Apr 4, 2021

Web workers are pretty limited and lack state load/save in conjunction with pause/resume type functionality.

@lucacasonato
Member

If you want to do something like this, you should embed Deno in your application and control the isolates in Rust. You can use deno_core and the Deno op crates to embed Deno in your application.

@ghost

ghost commented Jun 25, 2021

Workers are too slow. And you cannot overwrite the window object; it's read-only. So there is no way to run imported modules in a sandbox. I need this sandbox on every HTTP request, and with workers this is extremely slow. So having something like the Node vm module would be important.

@CoryGH
Author

CoryGH commented Jun 25, 2021

> Workers are too slow. And you cannot overwrite the window object; it's read-only. So there is no way to run imported modules in a sandbox. I need this sandbox on every HTTP request, and with workers this is extremely slow. So having something like the Node vm module would be important.

I tend to agree with this. If Deno is supposed to be a superior Node.js, at the bare minimum it should be able to do everything Node.js does, just with a more verbose feature set.

@lucacasonato
Member

@einicher The Node VM module is not a security sandbox. It should in no scenario be used for security! If you need a security sandbox you have to use workers (or worker_threads in Node).

@ghost

ghost commented Jun 25, 2021

@lucacasonato I am not talking about security. I mean a sandbox in the sense of running code separately from the main stack. It does not have to be secure, it just must not interfere with the main application. If I import something that writes to the window object, I need that window object (or self) not to be the main application's window object.

EDIT
I found out that imports inside the worker are what slows it down. Now I bundle everything first and run it without internal imports, which gets me down from 500ms to 50ms. Wherever I try to use the modern ES module approach, I end up ditching it and bundling everything again. I think I am finally too old to change ;'(

@caspervonb
Contributor

caspervonb commented Jun 25, 2021

> Workers are too slow. And you cannot overwrite the window object; it's read-only. So there is no way to run imported modules in a sandbox. I need this sandbox on every HTTP request, and with workers this is extremely slow. So having something like the Node vm module would be important.

Workers and isolates have roughly the same overhead.
I experimented with this for the test runner, where we considered running each test in isolation, and benchmarked a LOT.
On average we saw about 10-50ms per instantiation for std modules.

@lucacasonato
Member

@einicher Pass --no-check when starting Deno. This should significantly reduce the time it takes to instantiate workers. If you are looking for isolation within an isolate, Realms are the solution to this. They are not yet ready, however.

@ghost

ghost commented Jun 26, 2021

@caspervonb It's not the worker itself that makes this approach slow, it's the whole concept: in a worker everything needs to be reloaded from scratch. You cannot share active objects between the main application and the sandbox. In my case, on every HTTP request the whole cheerio tree gets reimported and rerun (takes 500ms). In a Node vm I do a global.jQuery = require('cheerio'); at application startup and pass the jQuery into every vm I create. It's pretty fast and runs in a separate environment where I get to choose which parts of the main application are accessible and which are not. So workers are a good option for security, if you really need a sandbox that can only share text values with the main app, but if you want to reuse active parts of the main app, you are pretty f*****. Also, it's pretty medieval to have a request/response interface to interact with the main app. Do you understand me? There is something missing in the middle. Workers are too strict, regular imports are too open. It would be helpful just to have a plain empty containerModule where you can decide which main app variables are accessible and which has its own global environment (like the worker's self) inside.

@lucacasonato
Member

lucacasonato commented Jun 26, 2021

@einicher What do you actually intend to achieve with this separate vm context? I.e., what is the problem you are trying to solve?

@caspervonb
Contributor

caspervonb commented Jun 26, 2021

> Do you understand me? There is something missing in the middle. Workers are too strict, regular imports are too open. It would be helpful just to have a plain empty containerModule where you can decide which main app variables are accessible and which has its own global environment (like the worker's self) inside.

That middle bit would probably be realms:
https://github.com/tc39/proposal-realms
https://github.com/tc39/proposal-ses
https://github.com/tc39/proposal-compartments

@ghost

ghost commented Jun 26, 2021

Yes, Realms seem to be a very advanced way to achieve this. VMs seem simpler to me; they just give you a separate context.

@Soremwar
Contributor

@einicher Unless you are doing weird stuff with the global scope, most HTTP libraries don't save state between requests and would "practically" run in a different context from one another. Why don't you just explain what you are trying to achieve in Deno?

@CoryGH
Author

CoryGH commented Jun 26, 2021

> @caspervonb It's not the worker itself that makes this approach slow, it's the whole concept: in a worker everything needs to be reloaded from scratch. You cannot share active objects between the main application and the sandbox. In my case, on every HTTP request the whole cheerio tree gets reimported and rerun (takes 500ms). In a Node vm I do a global.jQuery = require('cheerio'); at application startup and pass the jQuery into every vm I create. It's pretty fast and runs in a separate environment where I get to choose which parts of the main application are accessible and which are not. So workers are a good option for security, if you really need a sandbox that can only share text values with the main app, but if you want to reuse active parts of the main app, you are pretty f*****. Also, it's pretty medieval to have a request/response interface to interact with the main app. Do you understand me? There is something missing in the middle. Workers are too strict, regular imports are too open. It would be helpful just to have a plain empty containerModule where you can decide which main app variables are accessible and which has its own global environment (like the worker's self) inside.

This is nearly identical to my use case. I was considering porting a node.js app I have to Deno if it could solve this issue, the gist of it is:

  1. have a central master thread per machine
  2. have a bunch of workers
  3. workers pull tasks from the master thread, or from each other via sockets already separated out across those workers
  4. workers can be for sockets, for lambdas, etc. - they act in a pool
  5. ideally it would be able to spin up a lambda-style program, pause it, save the state, swap contexts to another active lambda, resume that, etc. - over the course of seconds, but still relatively quickly, with zero time to spin up/down running lambdas, so that it behaves more like a kernel threading engine than an in-process worker which is "execute until complete"

I actually built a piece based on esprima to extract the AST from ECMAScript and build it up into a separate managed runtime environment to get the start/pause/resume/stop functionality with 100% savable state in case of something like a power outage interrupting a task which changes data outside of itself while running - but it would be way simpler to implement with less bloat if there were simply a way to launch a VM and start/pause/resume/stop while pulling the complete state out at any point, without having that high overhead for the switching aspect (higher overhead for initialization is manageable, but having to reload all the state and such when just switching the active task compounds pretty quickly to be unworkable.)
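
To make the start/pause/resume/save-state idea concrete, here is a minimal sketch in plain TypeScript. It is not the esprima-based runtime described above and not a Deno feature: it assumes the task is written by hand as small steps over an explicit, JSON-serializable state object, which is a much weaker guarantee than pausing arbitrary code. All names (`CounterState`, `step`, `runSlice`) are hypothetical.

```typescript
// Everything needed to resume the task must live in this plain object.
interface CounterState {
  next: number; // next integer to add
  limit: number; // last integer to add
  sum: number; // running total
}

// Perform one small unit of work. Returns true once the task is finished.
function step(state: CounterState): boolean {
  if (state.next > state.limit) return true;
  state.sum += state.next;
  state.next += 1;
  return state.next > state.limit;
}

// Run at most `budget` steps, then hand control back to the scheduler.
// Returns true if the task completed within the budget.
function runSlice(state: CounterState, budget: number): boolean {
  for (let i = 0; i < budget; i++) {
    if (step(state)) return true;
  }
  return false;
}

// Start the task and let it run for a slice of four steps.
const task: CounterState = { next: 1, limit: 10, sum: 0 };
const finishedEarly = runSlice(task, 4);

// Pause: the snapshot could be written to disk or sent to another worker.
const snapshot = JSON.stringify(task);

// Resume later, possibly in a different process, from the saved snapshot.
const restored: CounterState = JSON.parse(snapshot);
const finished = runSlice(restored, 100);
```

Switching between tasks is cheap here because the scheduler only swaps state objects, but the cost is that every task has to be written in this step style. Pausing unmodified code would still need either runtime support or a source transformation like the one described above.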

@ghost

ghost commented Jun 26, 2021

Aaah, ookay, that's way more sophisticated than what I am talking about :)
I just need separate contexts for incoming HTTP requests, like you can do with a Node vm.

@lucacasonato
Member

> I just need separate contexts for incoming HTTP requests, like you can do with a Node vm.

But why? What do contexts give you compared to not creating any contexts (staying in the same context)?

@ghost

ghost commented Jun 26, 2021

> But why? What do contexts give you compared to not creating any contexts (staying in the same context)?

HTTP requests come in parallel; if they use the same context they overwrite each other and mix up their global variables.

@lucacasonato
Member

@einicher They come in concurrently, not in parallel. A single JavaScript isolate only runs on a single thread, so no two pieces of code can ever run at once. Yielding from one piece of code can only happen on call stack returns or async await points. If you only access cheerio synchronously, multiple requests cannot interfere with each other. In fact, creating contexts does not help you at all in a scenario with a single require('cheerio'), because a single object would be shared between all contexts: in JavaScript, objects are passed by reference, not by value. If you want to dive into this further, maybe hop on Discord: https://discord.gg/deno
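
The point about await points and shared references can be illustrated with a small sketch. The handler names and the `tick` helper are hypothetical stand-ins for real request handlers and real async work. The sketch shows only that module-level state can be overwritten across an `await`, that local state cannot, and that an object handed to two "contexts" is still the same object.

```typescript
// Module-level state shared by every handler (the pattern that causes mix-ups).
let currentUser = "";

// Stand-in for real async work such as a database or network call.
const tick = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

// Unsafe: reads the global after an await, so another request
// may have overwritten it in the meantime.
async function handleWithGlobal(user: string): Promise<string> {
  currentUser = user;
  await tick(); // the only point where another handler can run
  return currentUser;
}

// Safe: per-request state lives in a local variable, no extra context needed.
async function handleWithLocal(user: string): Promise<string> {
  const localUser = user;
  await tick();
  return localUser;
}

// Also safe: a fully synchronous section can never be interleaved.
function handleSync(user: string): string {
  currentUser = user;
  return currentUser;
}

// Two "concurrent" requests against each handler.
async function demo(): Promise<{ viaGlobal: string[]; viaLocal: string[] }> {
  const viaGlobal = await Promise.all([handleWithGlobal("a"), handleWithGlobal("b")]);
  const viaLocal = await Promise.all([handleWithLocal("a"), handleWithLocal("b")]);
  return { viaGlobal, viaLocal }; // viaGlobal: ["b", "b"], viaLocal: ["a", "b"]
}

// Objects are shared by reference: both "contexts" see the same library object.
const sharedLib = { loads: 0 };
const contextA = { lib: sharedLib };
const contextB = { lib: sharedLib };
contextA.lib.loads += 1; // contextB.lib.loads is now 1 as well
```

In this sketch the fix for mixed-up globals is to keep per-request data in locals or in an object passed to the handler, rather than to create a new global context per request.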

@wperron
Contributor

wperron commented Jun 28, 2021

This conversation is getting off-topic, and there's nothing more actionable for us, so I'll be converting this to a discussion.

@denoland denoland locked and limited conversation to collaborators Jun 28, 2021

6 participants