[Proposal][Question] Run-time paging of bytecode  #1351

@mjessome

We are investigating a paging mechanism for JerryScript bytecode. The goal is to minimize RAM overhead when executing snapshots by dynamically loading bytecode in fixed-size pages. We hope this will let us execute snapshots that are at least as large as the memory available to the JerryScript interpreter, and let us set clear expectations about how much memory is available to the executing application.

While planning a potential implementation, I have run into some roadblocks that I would appreciate help overcoming; they are listed at the end of this post. My thoughts on a potential implementation are as follows:

Saving a snapshot

  • Add a new CBC_CODE_FLAGS_SNAPSHOT flag to each bytecode header. This would make it clear during execution whether we need to account for paging or not.
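
As a rough illustration of the flag idea, the sketch below tests such a bit in a header. The struct `demo_code_header_t`, its `status_flags` field, the helper `demo_needs_paging()`, and the bit position chosen for CBC_CODE_FLAGS_SNAPSHOT are all assumptions made for this example; they are not taken from the JerryScript sources.

```c
#include <stdint.h>

/* Hypothetical bit position; a real flag would have to use a bit that is
 * still free in the engine's header flags. */
#define CBC_CODE_FLAGS_SNAPSHOT (1u << 15)

/* Hypothetical, stripped-down stand-in for a compiled code header. */
typedef struct
{
  uint16_t status_flags;
} demo_code_header_t;

/* Returns non-zero when the code block was loaded from a snapshot and
 * therefore has to be accessed through the paging layer. */
static int
demo_needs_paging (const demo_code_header_t *header_p)
{
  return (header_p->status_flags & CBC_CODE_FLAGS_SNAPSHOT) != 0;
}
```

Code produced at run time (from source or eval()) would leave the bit clear and keep using direct pointers.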

Loading a snapshot

  • In jerry_exec_snapshot() and snapshot_load_compiled_code(), we want to avoid any modification of the snapshot, either in place or into newly allocated blocks.
  • Linking of CBC Literals to the global literal table should be done as late as possible.
    • As I understand it, this requires keeping the literal map (lit_map_p) in RAM so that the linking can be done efficiently.
  • In snapshot_load_compiled_code() we also do not want to recurse on template literals. The loading of this compiled code should be deferred until we need to create an object for these literals.
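
One way to picture the late linking of literals is a lookup that runs at the point of use rather than at load time, leaving the snapshot itself untouched. The sketch below assumes a literal map kept in RAM and sorted by snapshot offset; `demo_lit_map_entry_t`, `demo_lit_map_resolve()`, and the `runtime_id` handle are hypothetical names for this example, not the engine's actual types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical map entry: where a literal lives in the snapshot, and the
 * handle it was given in the global literal table. */
typedef struct
{
  uint32_t snapshot_offset;
  uint32_t runtime_id;
} demo_lit_map_entry_t;

/* Resolve a snapshot-local literal offset on demand.
 * The map must be sorted ascending by snapshot_offset (binary search).
 * Returns 1 and writes *runtime_id_p on success, 0 if the offset is unknown. */
static int
demo_lit_map_resolve (const demo_lit_map_entry_t *map_p,
                      size_t count,
                      uint32_t snapshot_offset,
                      uint32_t *runtime_id_p)
{
  assert (map_p != NULL || count == 0);

  size_t low = 0;
  size_t high = count;

  while (low < high)
  {
    size_t mid = low + (high - low) / 2;

    if (map_p[mid].snapshot_offset == snapshot_offset)
    {
      *runtime_id_p = map_p[mid].runtime_id;
      return 1;
    }

    if (map_p[mid].snapshot_offset < snapshot_offset)
    {
      low = mid + 1;
    }
    else
    {
      high = mid;
    }
  }

  return 0;
}
```

The cost of this approach is a lookup on every literal access instead of a one-time rewrite, which is the trade-off behind keeping the map resident.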

Executing a snapshot

  • In order to enable paging of bytecode, rather than using a pointer to the bytecode, we would need to abstract this away to some concept of a program counter.
  • In vm_run(), the vm_frame_ctx_t would be constructed with this program counter for byte_code_p and byte_code_start_p.
  • In vm_loop() and the rest of the pipeline, we ask the underlying system to translate that program counter as necessary.
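
The program counter abstraction described above could look roughly like the following sketch: a PC is an offset into the snapshot image, and a small pager translates it to a pointer into a RAM-resident page, loading the page on a miss. Everything here (`demo_pager_t`, `demo_pager_translate()`, the page size, the slot count, and the round-robin eviction policy) is an assumption made for illustration, not an existing JerryScript interface.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE  64u /* hypothetical page size in bytes */
#define DEMO_PAGE_SLOTS 2u  /* hypothetical number of pages kept in RAM */

typedef uint32_t demo_pc_t; /* offset into the snapshot image, not a pointer */

typedef struct
{
  const uint8_t *backing_p; /* stand-in for flash or external storage */
  uint32_t backing_size;
  uint8_t pages[DEMO_PAGE_SLOTS][DEMO_PAGE_SIZE];
  uint32_t page_index[DEMO_PAGE_SLOTS]; /* which page each slot holds */
  uint8_t valid[DEMO_PAGE_SLOTS];
  uint32_t next_victim; /* round-robin eviction cursor */
  uint32_t load_count;  /* number of page loads, for inspection */
} demo_pager_t;

static void
demo_pager_init (demo_pager_t *pager_p, const uint8_t *backing_p, uint32_t size)
{
  memset (pager_p, 0, sizeof (*pager_p));
  pager_p->backing_p = backing_p;
  pager_p->backing_size = size;
}

/* Translate a PC into a pointer inside a resident page, loading the page
 * first if necessary. The pointer is only valid until the next call. */
static const uint8_t *
demo_pager_translate (demo_pager_t *pager_p, demo_pc_t pc)
{
  assert (pc < pager_p->backing_size);

  uint32_t page = pc / DEMO_PAGE_SIZE;
  uint32_t offset = pc % DEMO_PAGE_SIZE;

  for (uint32_t i = 0; i < DEMO_PAGE_SLOTS; i++)
  {
    if (pager_p->valid[i] && pager_p->page_index[i] == page)
    {
      return &pager_p->pages[i][offset];
    }
  }

  /* Miss: evict the next slot in round-robin order and load the page. */
  uint32_t slot = pager_p->next_victim;
  pager_p->next_victim = (slot + 1) % DEMO_PAGE_SLOTS;

  uint32_t start = page * DEMO_PAGE_SIZE;
  uint32_t length = pager_p->backing_size - start;
  if (length > DEMO_PAGE_SIZE)
  {
    length = DEMO_PAGE_SIZE;
  }

  memcpy (pager_p->pages[slot], pager_p->backing_p + start, length);
  pager_p->page_index[slot] = page;
  pager_p->valid[slot] = 1;
  pager_p->load_count++;

  return &pager_p->pages[slot][offset];
}

/* Fetch one bytecode byte through the pager. */
static uint8_t
demo_pager_fetch (demo_pager_t *pager_p, demo_pc_t pc)
{
  return *demo_pager_translate (pager_p, pc);
}
```

Because a translated pointer can be invalidated by the next miss, and an instruction with multi-byte operands may straddle a page boundary, an interpreter loop built on this would probably need to fetch operands byte by byte (or pin pages) rather than hold raw pointers across fetches.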

In my design investigation, I see many issues when looking through the execution stages. A few of the larger issues are:

  • vm_init_loop() wants to create a literal object for all template literals in the frame. This requires the corresponding bytecode to be loaded, which we want to defer until it is actually used. This is also a more general problem with the creation of function objects that will not be used right away.
  • Transparent pointers to ecma_compiled_code_t are used all over the place to access the bytecode, and we would like to abstract this away.
  • Maintaining common codepaths for paged and non-paged execution (directly from source, eval(), etc.).

I think a good first step would be to page only the CBC_INSTRUCTION_LISTS. When loading the snapshot, we would load the literal tables as is currently done, but when setting CBC_SET_BYTECODE_PTR we would store a PC value there instead of a pointer and implement a translation step in vm_loop(). All of the literal tables would still have to sit in memory, but this would allow paging of the bytecode itself.

Unfortunately, because of the issues listed above, I see no clear path to paging the literal tables along with the bytecode. It is quite likely that I have overlooked some things in JerryScript while trying to wrap my head around this. I would appreciate any feedback on the approach and the roadblocks mentioned here, as well as thoughts from the JerryScript team on a feature of this nature.

Thank you

Labels: question, snapshot