This repository has been archived by the owner on Aug 2, 2019. It is now read-only.

Mu Loadable Format (MuLF) #30

Open
wks opened this issue Mar 31, 2015 · 0 comments

This proposal describes an extended code delivery unit of the Mu VM.

Rationale

A "standalone Mu IR" program (if there is such a thing) needs more than a code bundle to run. The additional requirements include:

  • A way to allocate and initialise heap objects at load time (addressed by the HAIL format. See Heap Allocation and Initialisation Language (HAIL) #29 )
  • Embedded binary native programs (for example, native libraries, a native client, or even the Mu VM itself)
  • Static dependencies on other units of loading (MuLF files)
  • (optionally) An entry point to start execution.

Existing mechanisms can already do all of the above, because the client has total control over the Mu VM. This proposal only provides a "standard" format for doing so.

Proposal

This new unit of loading is called Mu Loadable Format (MuLF).

The MuLF file sample

This proposal uses XML as the human-readable format. It is obviously not ideal.

<mulf>
  <dependency kind="mulf" name="uvm.std.io" />
  <dependency kind="native" name="libc.so" />
  <muir-bundle>  <!-- the code section --> <![CDATA[
    .typedef @i64 = int<64>
    .typedef @i8 = int<8>
    .typedef @string = hybrid<@i64 @i8>
    .typedef @ref_string = ref<@string>
    .typedef @array_ref_string = hybrid<@i64 @ref_string>
    .typedef @ref_array_ref_string = ref<@array_ref_string>

    .const @I64_0 <@i64> = 0

    .global @helloWorld <@ref_string>   // to be initialised in the next section

    .funcsig @main_sig = @i64 (@ref_array_ref_string)
    .funcdef @main <@main_sig> (%args) {
      %entry:
        %hw = LOAD <@ref_string> @helloWorld
        CALL <@println_sig> @println (%hw)
        RET <@i64> @I64_0
    }
  ]]>
  </muir-bundle>
  <heap-initialise format="hail">  <!-- the "heap section" --> <![CDATA[
    .newhybrid $hw_obj <@string>
    .init $hw_obj = {12, {'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '!'}}
    .init @helloWorld = $hw_obj  // Assign this object to the global cell @helloWorld
  ]]>
  </heap-initialise>
  <binary kind="native-code"> 
    ...
  </binary>
  <binary kind="native-data">
    ...
  </binary>
  <initialiser function="@main" synchronous="true">
    <param kind="cmdline-args-as-standard-string-ref-array" />
  </initialiser>
</mulf>

MuLF Clients

As with programming languages implemented on Mu, this format is handled by a client: the MuLF client. This keeps the core Mu VM minimal.

  • The Mu micro VM provides the Mu Client API, which can load Mu IR bundles and HAIL files, as well as mechanisms to create Mu stacks and Mu threads.
  • The MuLF client handles the MuLF format. It loads and parses the MuLF file, invokes the API calls to load the included bundle and HAIL file, creates stacks and threads to execute the initialisation functions, and performs the synchronisation needed to ensure that the initialisation functions finish before "other parts" (whose meaning depends on the concrete high-level language) can run.

MuLF sections

  • dependency: a reference to another MuLF file or to a native library
  • muir-bundle: a Mu IR bundle
  • heap-initialise: a HAIL file which initialises the heap
  • binary: embedded binary data
  • initialiser: a function to be executed after the MuLF file has been loaded
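As a rough illustration of how a MuLF client might read these sections, here is a minimal Python sketch. It assumes the XML element and attribute names used in the sample above (`mulf`, `dependency`, `muir-bundle`, `heap-initialise`, `binary`, `initialiser`, `param`); the `MulfFile` class and `parse_mulf` function are hypothetical names invented for this example and are not part of any Mu API.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class MulfFile:
    """Hypothetical in-memory form of one MuLF file."""
    dependencies: list = field(default_factory=list)  # (kind, name)
    bundles: list = field(default_factory=list)       # Mu IR text
    heap_inits: list = field(default_factory=list)    # (format, HAIL text)
    binaries: list = field(default_factory=list)      # (kind, raw text)
    initialisers: list = field(default_factory=list)  # (function, synchronous, param kinds)


def parse_mulf(xml_text):
    """Split a MuLF document into its sections, in declaration order."""
    root = ET.fromstring(xml_text)
    if root.tag != "mulf":
        raise ValueError("expected <mulf> root element, got <%s>" % root.tag)
    mulf = MulfFile()
    for child in root:
        text = (child.text or "").strip()
        if child.tag == "dependency":
            mulf.dependencies.append((child.get("kind"), child.get("name")))
        elif child.tag == "muir-bundle":
            mulf.bundles.append(text)
        elif child.tag == "heap-initialise":
            mulf.heap_inits.append((child.get("format"), text))
        elif child.tag == "binary":
            mulf.binaries.append((child.get("kind"), text))
        elif child.tag == "initialiser":
            params = [p.get("kind") for p in child.findall("param")]
            synchronous = child.get("synchronous") == "true"
            mulf.initialisers.append((child.get("function"), synchronous, params))
        else:
            raise ValueError("unknown MuLF section: <%s>" % child.tag)
    return mulf


# A shortened version of the sample, used only to exercise the parser.
SAMPLE = """<mulf>
  <dependency kind="mulf" name="uvm.std.io" />
  <dependency kind="native" name="libc.so" />
  <muir-bundle><![CDATA[.typedef @i64 = int<64>]]></muir-bundle>
  <heap-initialise format="hail"><![CDATA[.newhybrid $hw_obj <@string>]]></heap-initialise>
  <initialiser function="@main" synchronous="true">
    <param kind="cmdline-args-as-standard-string-ref-array" />
  </initialiser>
</mulf>"""

mulf = parse_mulf(SAMPLE)
print(mulf.dependencies)
print(mulf.initialisers)
```

The parser only separates the sections; handing the bundle and HAIL text to the Mu Client API is left to the client.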

The loading process

The MuLF client shall process dependencies before loading the current MuLF file.

TODO: Mu IR is designed not to allow circular dependencies between bundles (mutually dependent code can simply be put into one big bundle so that circular types and function calls can be resolved). Whether MuLF allows circular dependencies is an open question.
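One possible way for a client to process dependencies first is a depth-first traversal that loads each MuLF file after everything it depends on. The Python sketch below assumes, purely for illustration, that circular dependencies are rejected; the proposal leaves that question open. The `load_order` function and the dictionary-based dependency graph are hypothetical.

```python
def load_order(root, deps):
    """Return MuLF names in an order where dependencies come first.

    `deps` maps a MuLF name to the names it depends on (hypothetical
    representation). Raises ValueError on a circular dependency, which is
    an assumption of this sketch, not a decision made by the proposal.
    """
    order = []
    state = {}  # name -> "visiting" or "done"

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            cycle = " -> ".join(path + [name])
            raise ValueError("circular MuLF dependency: " + cycle)
        state[name] = "visiting"
        for dep in deps.get(name, []):
            visit(dep, path + [name])
        state[name] = "done"
        order.append(name)

    visit(root, [])
    return order


example_deps = {"app": ["uvm.std.io", "util"], "util": ["uvm.std.io"]}
print(load_order("app", example_deps))
```

If circular MuLF dependencies were allowed instead, the client would need a different strategy, for example merging the mutually dependent files before loading them.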

Then the MuLF client loads the binary sections, then the muir-bundle section, then the heap-initialise section.

Finally, the MuLF client executes the initialisers in the order in which they are declared. Each initialiser is executed in a new Mu stack and a new Mu thread. If an initialiser is marked as "synchronous", the loading process pauses and waits for the initialiser function to return. This does not prevent an initialiser from triggering traps to the client, which may cause other Mu IR bundles or MuLF files to be loaded.

Then the loading process finishes. The Mu VM will continue executing until the last Mu thread is killed.
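The loading sequence described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `RecordingVM` is a hypothetical stand-in for the Mu Client API that merely records calls, its method names are invented, one Python thread stands in for "a new Mu stack and a new Mu thread", and dependencies are assumed to have been processed already.

```python
import threading


class RecordingVM:
    """Hypothetical stand-in for the Mu Client API; it only records calls."""

    def __init__(self):
        self.log = []
        self._lock = threading.Lock()

    def _record(self, entry):
        with self._lock:
            self.log.append(entry)

    def load_binary(self, kind):
        self._record(("binary", kind))

    def load_bundle(self, text):
        self._record(("bundle", text))

    def load_hail(self, text):
        self._record(("hail", text))

    def run_function(self, name):
        self._record(("run", name))


def load_mulf(vm, mulf):
    """Load one MuLF file whose dependencies are already loaded.

    `mulf` is a plain dict (hypothetical representation). Returns the
    threads of asynchronous initialisers, which may still be running.
    """
    for kind in mulf.get("binaries", []):
        vm.load_binary(kind)
    for text in mulf.get("bundles", []):
        vm.load_bundle(text)
    for text in mulf.get("heap_inits", []):
        vm.load_hail(text)

    pending = []
    for function, synchronous in mulf.get("initialisers", []):
        # One thread per initialiser, started in declaration order.
        thread = threading.Thread(target=vm.run_function, args=(function,))
        thread.start()
        if synchronous:
            thread.join()  # loading pauses until the initialiser returns
        else:
            pending.append(thread)
    return pending


vm = RecordingVM()
pending = load_mulf(vm, {
    "binaries": ["native-code"],
    "bundles": ["<Mu IR text>"],
    "heap_inits": ["<HAIL text>"],
    "initialisers": [("@main", True), ("@background", False)],
})
for thread in pending:
    thread.join()
print(vm.log)
```

Returning the asynchronous initialiser threads lets the caller decide when, or whether, to wait for them; a real client would also have to handle traps raised while initialisers run.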

The relation between the MuLF client and the higher-level language client

The MuLF client can be considered a "client-level library" that helps a higher-level client implementing a language.

Conversely, the higher-level client can be considered a library in the MuLF "framework": it implements language-specific behaviour reactively, as "call-backs" from the MuLF client.

ELF compatibility

Using the standard ELF format would bring many benefits, including the ability to use existing system facilities. It would also make it possible for a MuLF file to be a self-contained executable file.

Open questions

Should MuLF be standardised as part of the Mu VM specification?
Probably yes. It would provide a standard way to load "something more than a code bundle". However, MuLF is oriented more towards the traditional ahead-of-time "linker-loader" model than towards the JIT-compiling model. Complex languages (like Java) may wish to control their loading process precisely (e.g. loading several circularly related classes, submitting them as one huge Mu IR bundle and initialising their heap objects together). In that case, MuLF is less useful than it is for ahead-of-time compiled programs.

We may make MuLF an "optional component" of the Mu VM. Some very small Mu implementations may omit it, but any implementation that claims to support MuLF shall do so in the standard-compliant way.
