
Use as a Dynlink-like extension mechanism #2

Open
copy opened this issue Oct 16, 2021 · 4 comments · May be fixed by #5


copy commented Oct 16, 2021

I'm working on an OCaml program that can be extended with scripts written in OCaml; however, the existing Dynlink mechanism (and higher-level approaches like ocaml_plugin) is too heavyweight and doesn't work in statically compiled programs. I've had a similar implementation in mind (link with the OCaml compiler, generate assembly, assemble it using a custom x86 assembler in memory, and make it executable in the running process using mmap/mprotect). I'm curious whether this project could be extended to handle this, in particular:

  • Accept complete ml files as input, rather than toplevel phrases
  • Link against the cmi of the main program, so that we can provide a mechanism to register the plugin. This may also be useful for custom toplevels, and I believe the existing code to look up symbols is already able to locate the symbols.
  • In the long run, allow unloading modules

From what I can tell this is technically possible (the old native toplevel already uses Dynlink, and jit_run in this repository is similar to the native dynlink code in OCaml), although I might be missing some difficulties, especially around changes to the OCaml compiler. If there's anything you need, I'd gladly contribute.


copy commented Oct 18, 2021

After experimenting a bit, I can partially answer my own questions:

  1. Works by wrapping a call to Optcompile.implementation in the existing with_jit_x86 machinery. Calling Tophooks.register_assembler doesn't seem to be necessary (I guess jit_lookup_symbol is only used by the toplevel, but I'm not 100% sure)
  2. Adding a cmi works (in full compilation mode by adding to Clflags.include_dirs, in toplevel mode by adding a #directory … phrase); however, symbols from the main executable come back as NULL, even in jittop (ints are printed as 0, functions segfault). Interestingly, the error when a symbol is not found is different ("Symbol … refered to by the GOT is unknown"), so something weird is going on here. For instance, after adding let foo = 42 to jittop.ml:
    % dune exec ./bin/jittop.exe
    OCaml version 4.14.0+dev0-2021-06-03 - native toplevel
    # #directory "_build/default/bin/.jittop.eobjs/byte/";;
    # Dune__exe__Jittop.foo;;
    - : int = 0
    
    For static executables, I'll need to manually extend Globals.symbols, since dlsym doesn't work. Should be doable.


copy commented Oct 18, 2021

Actually, going through a module that isn't the main module (camlDune__exe__Jittop in this case) works. I guess OCaml optimises away the main module, since it can never be referenced.

NathanReb commented:

> Accept complete ml files as input, rather than toplevel phrases

.ml files are valid toplevel phrases, so one thing you could attempt as a simple first prototype is to simply parse said ml file and evaluate it as a toplevel phrase.
There shouldn't be anything in particular to do for it to be able to use the libraries linked with the main program, although it's probably safer to build with -linkall to be sure everything is linked, even the modules that are not used by the main program. Otherwise it can happen that, for instance, not the whole standard library gets linked.

This is quite similar to what MDX does, especially with the 0.2 version of the dune mdx stanza, which allows users to link libraries with the MDX executable so they can be used in toplevel fragments.
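For the -linkall suggestion above: with dune this can be set through the link_flags field of the executable stanza. A hypothetical fragment (the executable and library names are illustrative, not from the issue):

```
(executable
 (name host)
 (libraries compiler-libs.common)
 ; Force every module of the linked libraries into the binary, so that
 ; JIT-compiled plugins can reference modules the host itself never uses.
 (link_flags (-linkall)))
```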


copy commented Oct 25, 2021

.ml files are valid toplevel phrases so one thing you could attempt as a simple first prototype is to simply parse said ml file and evaluate it as a toplevel phrase.

Interesting, I didn't know that. In any case, I've already removed the dependency on the toplevel bits so I can start testing in vanilla 4.13. The overall changes are fairly small (export Globals, Symbols and Address, add a function for adding symbols, and move some code around, so that with_jit_x86 is exported instead of init_top). Here is my wip fork.

Using the above fork, I find the address of symbols using the following code:

#include <caml/alloc.h>
#include <caml/memory.h>
#include <caml/mlvalues.h>

/* Expose the addresses of two symbols from the main program to OCaml. */
extern void* camlPlugin_api;
extern void* camlInit;
CAMLprim value caml_find_external_symbols(value unit) {
    CAMLparam1(unit);
    CAMLlocal1(res);
    res = caml_alloc(2, 0);
    /* Store_field (rather than assigning to Field directly) is needed here,
       since caml_copy_nativeint allocates and may trigger the GC. */
    Store_field(res, 0, caml_copy_nativeint((intnat)&camlPlugin_api));
    Store_field(res, 1, caml_copy_nativeint((intnat)&camlInit));
    CAMLreturn(res);
}

And then add to symbols and run plugins as follows:

  external find_external_symbols : unit -> Jit.Address.t array = "caml_find_external_symbols"

  let () =
    Clflags.native_code := true;
    let symbols =
      Array.map2 (fun name addr -> name, addr)
        [|
          "camlPlugin_api";
          "camlInit";
        |]
        (find_external_symbols ())
    in
    Jit.Globals.symbols :=
      Jit.Symbols.union !Jit.Globals.symbols (Jit.Symbols.of_seq (Array.to_seq symbols))

  let run_ml phrase_name source_file =
    Jit.with_jit_x86 (fun () ->
        let output_prefix = Filename.remove_extension source_file in
        let start_from = Clflags.Compiler_pass.Parsing in
        Optcompile.implementation ~backend ~start_from ~source_file ~output_prefix
      ) phrase_name (ref None)

copy linked a pull request Jan 26, 2022 that will close this issue