Load native modules #519

Open
d3x0r opened this issue Jan 16, 2025 · 9 comments

@d3x0r

d3x0r commented Jan 16, 2025

I've come to understand that an isolated-vm isolate can't load isolated-vm.node, which is why 'modules (no)' is listed in the readme.

vm2 is deprecated, and points to this as a solution.

Mostly, I can load all the modules I want using isolated-vm... until I get to a .node extension, at which point I'd need some sort of process.dlopen()

The loader for require( "*.node") is pretty simple...
https://github.com/nodejs/node/blob/main/lib/internal/modules/cjs/loader.js#L1927-L1930

it just uses this public API...
https://nodejs.org/api/process.html#processdlopenmodule-filename-flags

which eventually gets into a native function....
https://github.com/nodejs/node/blob/main/src/node_binding.cc#L439-L573 (where the work actually gets done)

As you'd know, since isolated-vm is itself a native module, the DLL gets called at a specific entry point with the context and an exports object to fill in. (The entry point is something like node_register_module_v127.)
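For reference, the public API path described above can be sketched in a few lines of plain Node. This is only an illustration of process.dlopen() as documented; loadAddon is a hypothetical helper name and the file path is a placeholder, not a real build output.

```javascript
// Roughly what Node's CommonJS loader does for a `.node` file: hand
// process.dlopen() a module-like object and let the addon's entry point
// fill in `exports`. `loadAddon` is a hypothetical helper name.
function loadAddon(absoluteFilename) {
  const mod = { exports: {} };
  // Flags are optional; Node falls back to its documented default.
  process.dlopen(mod, absoluteFilename);
  return mod.exports;
}

// Placeholder path, used only for illustration.
try {
  const addon = loadAddon('/path/to/sack_vfs.node');
  console.log(Object.keys(addon));
} catch (err) {
  console.error('dlopen failed:', err.message);
}
```

Note that this runs the addon against the calling Node context, which is exactly the behavior that does not carry over into an isolated-vm isolate.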

That entry point should be able to get the isolate the context is in and run fully against the V8 isolate... My library https://www.npmjs.com/package/sack.vfs is a single dependency that provides access to the system facilities Node otherwise would, but it doesn't use Node very much at all, other than the entry hooks.

Maybe in the future rewrite? That DLOpen function seems overly complex... think it has a lot of legacy behavior that isn't really needed.

(Unless of course I missed something somewhere? )

@laverdet
Owner

The readme comparison is pretty old and out of date. Also "modules" there refers to ecmascript modules (which are supported by the way). It sounds like you are talking about native modules though.

isolated-vm is a dramatically different environment than nodejs so you can't expect c++ code to just work out of the box. Native modules have been supported since the beginning but you need to modify the source and compile from scratch.

Here are several examples:
https://github.com/screeps/driver/blob/master/native/src/main.cc
https://github.com/laverdet/ivm-inspect/blob/main/binding.cc
https://github.com/laverdet/isolated-vm/blob/main/native-example/example.cc

Taking a look at sack.vfs I think you will have a hard time getting this to work in isolated-vm. It makes heavy use of the libuv event loop so you would need to spin up a new loop for your isolate. It doesn't use any node-specific APIs though so I believe it's possible, but you will need to get your hands dirty.

@d3x0r
Author

d3x0r commented Jan 17, 2025

I was just considering amending the original to say I do use libuv - so I need to get the default event loop somehow - I also need the magic hook that's run at the end of native code that triggers promises...

// JS code
if (process._tickDomainCallback || process._tickCallback)
    // which passes the JS function to the native library to store in a Permanent<Function>
    sack.Thread(process._tickDomainCallback || process._tickCallback);

That is automagically called by the NaN API classes. They were quite strict and insistent that I NOT use it; but really it must be used, though only when returning back to the event loop and never at any other point.

Yes it is a different environment...

Yes, an external build and link might be the thing to do...

I don't have my own event loops - I register against the default event loop, and sometimes unref the handles so they don't keep the process open.

(side note) https://github.com/d3x0r/jsox is a project I did to serialize JS objects - potentially with custom tags to specify classes to revive objects as... but out of the box supports cyclic objects; and is what I've been using for protocol to the VM before...

// node has a function to get the current event loop for an isolate
// this supports worker-threads since it returns the separate isolate
uv_loop_t*  node::GetCurrentEventLoop( isolate );


// for shutdown, node used to have AtExit that would get called (sometimes) 
// but now has a per-worker unload that can happen  (isolate,callback,userdata passed to callback)
#if ( NODE_MAJOR_VERSION > 9 )
	node::AddEnvironmentCleanupHook( isolate, CleanupThreadResources, c );
#else
	node::AtExit( moduleExit );
#endif

// and if I just don't include node.h ... 
// then I just had to define these(and prototypes for the functions above) to build.

#define NODE_MAJOR_VERSION 23
#define NODE_MODULE_INIT() \
	__declspec(dllexport) void init( v8::Local<v8::Object> exports, v8::Local<v8::Value> module, v8::Local<v8::Context> context )

But - I see there's a NativeModule that's not mentioned...

so does the GetCurrentEventLoop just work then?
https://github.com/laverdet/isolated-vm/blob/main/native-example/usage.js#L9


What I do want is time accounting; wall clock and CPU clock are excellent parameters...
I'm not sure how to kill a task when it is running too long or otherwise....

I also want to give a specific context to this level of object, which is its own environment. With node's vm I eventually have to use require(), especially on a .node, which runs it in the wrong context. In vm, if you use the import() function you get a chance to do the linking yourself, so mostly I can just tell the VM to run the appropriate code in the right sandbox (I don't know why this couldn't just have been applied to the import that a VM has, but that runs in the outer module's context). There's also mixed mode, where an MJS loads a CJS that is run synchronously, but that is easy enough to implement in a linker for isolated-vm; I had most of it working.

Roughly, what a context would get is a websocket connection, without necessarily being able to create one itself. I have the ability to 'throw' a socket to another isolate, since it's really just a pointer and a small wrapper object in the V8 context/isolate.
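On the time-accounting and kill question, here is a minimal sketch using Node's built-in vm module rather than isolated-vm, so it only illustrates the shape of the idea. runWithBudget is a hypothetical helper, and process.cpuUsage() measures the whole process, not just the sandboxed code. isolated-vm's README describes its own timeout option and per-isolate CPU and wall time counters for the same purpose; check it for the exact form.

```javascript
const vm = require('node:vm');

// Run `code` in a fresh context, abort it after `timeoutMs`, and report
// how much CPU and wall-clock time the attempt cost.
function runWithBudget(code, timeoutMs) {
  const cpuBefore = process.cpuUsage();
  const wallBefore = process.hrtime.bigint();
  let outcome;
  try {
    outcome = { ok: true, value: vm.runInNewContext(code, {}, { timeout: timeoutMs }) };
  } catch (err) {
    outcome = { ok: false, code: err.code, message: err.message };
  }
  const cpu = process.cpuUsage(cpuBefore); // microseconds used since cpuBefore
  const wallNs = process.hrtime.bigint() - wallBefore;
  return {
    ...outcome,
    cpuMicros: cpu.user + cpu.system,
    wallMillis: Number(wallNs / 1000000n),
  };
}

console.log(runWithBudget('1 + 1', 100));
console.log(runWithBudget('while (true) {}', 50)); // stopped by the timeout
```

The timeout only covers synchronous execution of the script; work that the sandboxed code schedules for later needs its own accounting.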

@laverdet
Owner

"so does the GetCurrentEventLoop just work then?"

No, it does not. isolated-vm runs isolates in a thread pool and libuv wants to be single-threaded.

I forgot to mention, why don't you just delegate these functions out to nodejs? Surely you don't want to inject the entire module into your isolate, that defeats the purpose of isolation since sack looks like it does a lot of things.

@d3x0r
Author

d3x0r commented Jan 18, 2025

I'd appreciate it if you'd bear with me a bit and maybe provide insights. I understand it seems like there's a better way that isn't at all how I'm doing it, but I think this isn't so far from possible.

I don't have a minimal example that would work; the included native-example doesn't build on Windows, because isolated_vm doesn't export any of the symbols that the example uses (it probably does on Linux).

  1. Set up the isolate, Context, and global; run code in the VM to set up a shell of require, which eventually calls from the VM out to the "require" that was set on the global: global.setSync('require', ivmRequire );
  2. Call the module linker, which loads prerun.mjs and goes into module resolution:
    a) import sack from "sack.vfs" (as a module, loads vfs_module.mjs)
    b) vfs_module.mjs does import {sack} from "./vfs_module.cjs", which switches from a module to CommonJS
    c) vfs_module.cjs does require( "sack_vfs.node" ), for which the linker function uses the native loader and gets a NativeModule; the result of the run is the export object passed for the native plugin to initialize. This finishes, and the VM can log Object.keys of that (log cannot transfer objects with functions in them, and the export object is almost 100% functions).
    d) After loading the addon, vfs_module.cjs does a require( "./sack-jsox.cjs" )( sack ); which ends up going out, loading the file, and running the file. It has no requires, so it results in module.exports = function() {} , and the require() in the globalThis of the VM returns that function. But when I call this function with 'sack' (the function was executed in the isolated-vm, and sack was a native module loaded in the VM), somehow that function call tries to clone the sack object passed to it.

If I instead pass "not sack!" to the function, I get [ [0],[1],[2],[3],[4],[5],[6],[7],[8]] as the parameter in the function.

I don't understand why functions would call out of the VM just to call another function already in the VM...
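One possible reading of the clone attempt, offered as an assumption rather than a confirmed diagnosis: values that cross the isolate boundary get copied with something like the structured clone algorithm, which refuses functions. Plain Node shows the same failure mode with structuredClone (this is not isolated-vm's API, and addonLike is a made-up stand-in for the addon's exports):

```javascript
// A stand-in for an addon exports object that is mostly functions.
const addonLike = { Thread() {}, version: '1.0' };

try {
  structuredClone(addonLike);
} catch (err) {
  console.log(err.name); // DataCloneError
}

// A reduced, function-free view of the same object copies without trouble.
const plainView = { keys: Object.keys(addonLike), version: addonLike.version };
console.log(structuredClone(plainView));
```

If that reading is right, anything that routes the call through the host side will hit the copy step, even when both the function and its argument started out inside the VM.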

(And yes, I expect you're much more lost than I am reading through the list of events above; I've been trying to figure out a way to represent (in VM) versus (out of VM) and where the code is running.)

I'm tempted to just use Function() to execute the require() code (that's not a .node module).

maybe I need to dereference it or reference it?

I ended up just using Function() to evaluate sources for the moment. This gets me everything loaded, including the .node module. But for the vfs_module.mjs import of vfs_module.cjs, the call into vfs_module is synchronous, and I don't get back a Module; how do I copy the methods from module.exports into a Module I can return to resolve the linker loading vfs_module.mjs?

I know the main page says

"
There is only 1 frequently asked question:

"How do I pass a [module, function, object, library] into an isolate?"

You don't! Isolates are isolated.
"

but this is just passing the VM result back to the VM

@laverdet
Owner

It's definitely the frequently asked question. You need to make a shim.

@d3x0r
Author

d3x0r commented Jan 19, 2025

Loading the require()'d code with Function fixed a lot of issues - was able to get back the function, and pass an object to the function appropriately - not sure why that would really be any different than what's done with script vm utilities.
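A minimal sketch of the Function-based loading described here, written as plain JavaScript so it would run the same inside or outside an isolate. evalCommonJs and the inline source string are hypothetical stand-ins; the real loader would read the file text from the host.

```javascript
// Evaluate CommonJS source text with the Function constructor. The wrapper
// takes (exports, require, module), which is conceptually what Node's own
// CommonJS wrapper provides (minus __filename and __dirname).
function evalCommonJs(source, requireFn) {
  const module = { exports: {} };
  const wrapper = new Function('exports', 'require', 'module', source);
  wrapper.call(module.exports, module.exports, requireFn, module);
  return module.exports;
}

// Stand-in for a file such as sack-jsox.cjs that exports a single function.
const jsoxSource = 'module.exports = function (sack) { return Object.keys(sack); };';
const noRequire = () => { throw new Error('no nested requires in this sketch'); };

const init = evalCommonJs(jsoxSource, noRequire);
console.log(init({ Thread() {}, Volume() {} })); // [ 'Thread', 'Volume' ]
```

Because the resulting function is created in the same realm that later calls it, the argument is passed by reference and never has to be copied, which may be why this route behaved differently.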

Adding an entry point in the package.json for 'vm', which loads everything as an ESM module or native module, works. (I'd prefer the default to be a module load path anyway, but there are issues going from .cjs to .mjs, though not the other way around.)

I suppose the namespace in a module ends up filled in by v8? If I just treated everything like a module, .cjs files don't do an export default, and it wouldn't be practical to inject one based on finding 'exports.' in the code... but iterating the keys of the result would be pretty simple.
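The key-iteration idea could look roughly like this: generate ES module source text that re-exports each key of an already-evaluated CommonJS exports object. makeEsmShimSource, isExportableName and the global name are all hypothetical; the generated text would then go to whatever compiles modules for the linker.

```javascript
// True when `key` can be written as `export const <key> = ...` in a module.
function isExportableName(key) {
  if (!/^[A-Za-z_$][\w$]*$/.test(key) || key === 'await') return false;
  try {
    // Reserved words and strict-mode restricted names fail to parse here.
    new Function(`"use strict"; var ${key};`);
    return true;
  } catch {
    return false;
  }
}

// Build module source that re-exports every usable key of `cjsExports`.
// `globalName` is a hypothetical global where the loader has stashed the
// already-evaluated exports object.
function makeEsmShimSource(cjsExports, globalName) {
  const lines = [
    `const cjs = globalThis[${JSON.stringify(globalName)}];`,
    'export default cjs;',
  ];
  for (const key of Object.keys(cjsExports)) {
    if (!isExportableName(key)) continue; // still reachable via the default export
    lines.push(`export const ${key} = cjs[${JSON.stringify(key)}];`);
  }
  return lines.join('\n');
}

console.log(makeEsmShimSource({ Thread() {}, 'not-valid': 1, class: 2 }, '__sack'));
```

Named exports are snapshots taken when the shim is evaluated, so keys added to module.exports later only show up through the default export.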

@d3x0r
Author

d3x0r commented Jan 20, 2025

The only thing missing at this point is the ability to queue events to the isolated-vm... Since nothing else is exported from isolated-vm.node, one option is adding an export so that I could just load that library and get an entry point similar to node's... but then maybe extending the native module initialization to load an alternative entry point, one that also gets passed the uv_loop_t, would be easier...

d3x0r@45d2d08 This is the diff that would do that - had to add a method to UvScheduler to get the loop; but mostly small changes to native module.

@laverdet
Owner

The isolated scheduler does not use libuv. There is no loop to return: isolates run in a thread pool, which is inherently incompatible with libuv. The loop you are returning there is the nodejs loop, which can easily be acquired with node APIs.

@d3x0r
Author

d3x0r commented Jan 24, 2025

Ok. I did some more digging. To dispatch from one thread to a vm thread, it looks like I would need something like

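struct wssAsyncTask : v8::Task {
	wssObject *myself;
	explicit wssAsyncTask( wssObject *myself )
	    : myself( myself ) {}
	// Meant to run on whichever thread the scheduler enters the isolate on.
	void Run() override {
		wssAsyncMsg__( this->myself );
		// Run promise reactions queued by the handler before returning.
		v8::Isolate *isolate = v8::Isolate::GetCurrent();
		isolate->PerformMicrotaskCheckpoint();
	}
};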

and then

   // probably have to grab this when the JS runs something in the native interface
   ivm::IsolatedEnvironment::GetCurrent()->GetCurrentHolder()  ->ScheduleTask( &wss->task, ... );

wssAsyncMsg__ is the regular old handler that basically just takes its own object, previously received from the async->data handle... it builds a stack scope etc... I suppose the PerformMicrotaskCheckpoint() is maybe what node's process._tickCallback eventually results in(?).

back to the modified entry point idea - passing the IsolatedEnvironment might be an option; or just the IsolateHolder; or maybe the function pointer... well that would still need something for a 'this' .

And a few __declspec(dllexport) decorators...

Which I noticed is defined for the NativeInit sort of definition - there's an 'isolated_vm.h' in src, which nothing uses; it sort of looks like the minimal interface I'm asking for above...

dllexports in classes don't need the extern "C" attribute... (and shouldn't have it)

__imp_?ScheduleTask@IsolateHolder@ivm@@QEAAXV?$unique_ptr@VTask@v8@@U?$default_delete@VTask@v8@@@std@@@std@@_N11@Z

hmm that signature looks suspiciously like it'll try to delete the task? ivm::IsolateHolder::ScheduleTask(class std::unique_ptr<class v8::Task,struct std::default_delete<class v8::Task> >,bool,bool,bool) __ptr64
