[Proposal][Question] Run-time paging of bytecode #1351

We are investigating a _paging_ mechanism for JerryScript bytecode. The goal is to minimize RAM overhead when executing snapshots by loading bytecode dynamically in fixed-size pages. With this we hope to execute snapshots that are at least as large as the memory available to the JerryScript interpreter, and to set expectations on the amount of memory available to the executing application.

While planning a potential implementation, I have run into some roadblocks that I would appreciate help overcoming; they are listed at the end of this post. My thoughts on a potential implementation are as follows:

**Saving a snapshot**
- Add a new `CBC_CODE_FLAGS_SNAPSHOT` flag to each bytecode header. This would make it clear during execution whether we need to account for paging.

**Loading a snapshot**
- In `jerry_exec_snapshot()` and `snapshot_load_compiled_code()`, we want to avoid any modification of the snapshot, either in place or by copying it into newly allocated blocks.
- Linking of CBC literals to the global literal table should be done as late as possible.
  - As I understand it, this requires keeping the literal map (`lit_map_p`) in RAM so that the linking can be done efficiently.
- In `snapshot_load_compiled_code()` we also do not want to recurse on template literals. The loading of this compiled code should be deferred until we need to create an object for these literals.
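
To make the "link as late as possible" idea concrete, here is a minimal sketch of lazy literal resolution. It is not JerryScript code: `lazy_lit_map_t`, `lazy_lit_get()`, `LIT_UNRESOLVED` and `demo_resolver()` are hypothetical names, and the resolver callback only stands in for whatever registers a literal in the global literal table. The snapshot-side offset table is treated as read-only, and only a small RAM-side array of resolved values is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker for "this literal has not been linked yet". */
#define LIT_UNRESOLVED UINT32_MAX

/* Hypothetical callback: turns a snapshot literal offset into a value
 * from the global literal table. */
typedef uint32_t (*lit_resolver_t) (uint32_t snapshot_offset, void *user_p);

typedef struct
{
  const uint32_t *snapshot_offsets_p; /* read-only, stays inside the snapshot */
  uint32_t *resolved_p;               /* RAM-side map, one entry per literal */
  size_t count;
  lit_resolver_t resolve;
  void *user_p;
} lazy_lit_map_t;

static void
lazy_lit_map_init (lazy_lit_map_t *map_p, const uint32_t *snapshot_offsets_p,
                   uint32_t *resolved_p, size_t count,
                   lit_resolver_t resolve, void *user_p)
{
  map_p->snapshot_offsets_p = snapshot_offsets_p;
  map_p->resolved_p = resolved_p;
  map_p->count = count;
  map_p->resolve = resolve;
  map_p->user_p = user_p;

  for (size_t i = 0; i < count; i++)
  {
    resolved_p[i] = LIT_UNRESOLVED;
  }
}

/* Link a literal on first use only; later calls hit the RAM-side map. */
static uint32_t
lazy_lit_get (lazy_lit_map_t *map_p, size_t index)
{
  assert (index < map_p->count);

  if (map_p->resolved_p[index] == LIT_UNRESOLVED)
  {
    map_p->resolved_p[index] = map_p->resolve (map_p->snapshot_offsets_p[index],
                                               map_p->user_p);
  }

  return map_p->resolved_p[index];
}

/* Demo resolver (assumption for illustration): counts calls through user_p
 * and returns a fake literal value derived from the offset. */
static uint32_t
demo_resolver (uint32_t snapshot_offset, void *user_p)
{
  uint32_t *calls_p = (uint32_t *) user_p;
  (*calls_p)++;
  return snapshot_offset + 1000u;
}
```

The cost of this approach is one RAM word per literal, which is the "keep `lit_map_p` around" trade-off mentioned above.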

**Executing a snapshot**
- In order to enable paging of bytecode, rather than using a pointer to the bytecode, we would need to abstract this away to some concept of a program counter.
- In `vm_run()`, the `vm_frame_ctx_t` would be constructed with this program counter for `byte_code_p` and `byte_code_start_p`.
- In `vm_loop()` and the rest of the pipeline, we ask the underlying system to translate that program counter as necessary.
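
As a rough illustration of the translation step, the sketch below maps a program counter (a byte offset into the snapshot) to a pointer inside a small set of RAM-resident pages. Everything here is an assumption for illustration: `bytecode_pc_t`, `page_cache_t`, `page_cache_translate()`, the page size, the slot count and the round-robin eviction are not part of JerryScript, and a `memcpy` from a buffer stands in for reading from flash or external storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 64u  /* hypothetical page size in bytes */
#define PAGE_SLOTS 2u  /* hypothetical number of pages kept in RAM */

/* Hypothetical program counter: a byte offset into the snapshot image. */
typedef uint32_t bytecode_pc_t;

typedef struct
{
  const uint8_t *backing_p; /* stands in for flash / external storage */
  size_t backing_size;
  uint8_t data[PAGE_SLOTS][PAGE_SIZE];
  uint32_t page_no[PAGE_SLOTS];
  uint8_t valid[PAGE_SLOTS];
  uint32_t next_victim;     /* round-robin eviction cursor */
  uint32_t loads;           /* number of page loads, for inspection */
} page_cache_t;

static void
page_cache_init (page_cache_t *cache_p, const uint8_t *backing_p, size_t backing_size)
{
  memset (cache_p, 0, sizeof (*cache_p));
  cache_p->backing_p = backing_p;
  cache_p->backing_size = backing_size;
}

/* Translate a program counter into a pointer. The pointer is only
 * guaranteed to stay valid until the next call to this function. */
static const uint8_t *
page_cache_translate (page_cache_t *cache_p, bytecode_pc_t pc)
{
  assert (pc < cache_p->backing_size);

  uint32_t page = pc / PAGE_SIZE;
  uint32_t offset = pc % PAGE_SIZE;

  /* Fast path: the page is already resident. */
  for (uint32_t i = 0; i < PAGE_SLOTS; i++)
  {
    if (cache_p->valid[i] && cache_p->page_no[i] == page)
    {
      return &cache_p->data[i][offset];
    }
  }

  /* Slow path: evict a slot and load the page from backing storage. */
  uint32_t slot = cache_p->next_victim;
  cache_p->next_victim = (slot + 1u) % PAGE_SLOTS;

  size_t start = (size_t) page * PAGE_SIZE;
  size_t len = cache_p->backing_size - start;
  if (len > PAGE_SIZE)
  {
    len = PAGE_SIZE;
  }

  memcpy (cache_p->data[slot], cache_p->backing_p + start, len);
  cache_p->page_no[slot] = page;
  cache_p->valid[slot] = 1;
  cache_p->loads++;

  return &cache_p->data[slot][offset];
}
```

One consequence worth noting: any code that holds on to a raw `byte_code_p` across a translation could end up pointing into an evicted page, so callers would need to keep the program counter and re-translate.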

In my design investigation, I see many issues when looking through the execution stages. A few of the larger issues are:
- `vm_init_loop()` wants to create a literal object for all template literals in the frame. This requires the bytecode to be loaded, which we want to defer until it is used. It is also a more general problem with creating function objects that are not needed right away.
- Transparent pointers to `ecma_compiled_code_t` are used all over the place to access the bytecode, and we would like to abstract this away.
- Maintaining common code paths for paged and non-paged execution (directly from source, `eval()`, etc.).

I think a good first step would be to page only the `CBC_INSTRUCTION_LISTS`. When loading the snapshot, we would load the literal tables as is currently done, but when setting `CBC_SET_BYTECODE_PTR` we would store a PC value there and implement a translation step in `vm_loop()`. All of the literal tables would still have to sit in memory, but this would allow paging of the bytecode itself.
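
A toy dispatch loop may help show what "a PC value plus a translation step" implies for instruction fetch. This is only a sketch under stated assumptions: the opcodes (`DEMO_OP_*`), `demo_pager_t`, `demo_fetch()` and `demo_run()` are invented for illustration and have nothing to do with the real CBC instruction set or `vm_loop()`. The point it tries to show is that every byte, including operands, is fetched through the translation step, so an instruction whose operand falls on the next page still decodes correctly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_PAGE_SIZE 3u /* deliberately tiny so instructions straddle pages */

/* Hypothetical opcodes, not CBC. PUSH takes a one-byte operand. */
enum
{
  DEMO_OP_PUSH = 0,
  DEMO_OP_ADD = 1,
  DEMO_OP_RET = 2
};

typedef struct
{
  const uint8_t *image_p;       /* stands in for the snapshot in storage */
  size_t image_size;
  uint8_t page[DEMO_PAGE_SIZE]; /* the single RAM-resident page */
  uint32_t loaded_page;         /* UINT32_MAX when nothing is loaded */
} demo_pager_t;

/* Fetch one byte at the given program counter, loading its page if needed. */
static uint8_t
demo_fetch (demo_pager_t *pager_p, uint32_t pc)
{
  assert (pc < pager_p->image_size);

  uint32_t page = pc / DEMO_PAGE_SIZE;

  if (page != pager_p->loaded_page)
  {
    size_t start = (size_t) page * DEMO_PAGE_SIZE;
    size_t len = pager_p->image_size - start;
    if (len > DEMO_PAGE_SIZE)
    {
      len = DEMO_PAGE_SIZE;
    }
    memcpy (pager_p->page, pager_p->image_p + start, len);
    pager_p->loaded_page = page;
  }

  return pager_p->page[pc % DEMO_PAGE_SIZE];
}

/* Run the toy program; the PC is an offset, never a raw pointer. */
static int32_t
demo_run (const uint8_t *image_p, size_t image_size)
{
  demo_pager_t pager = { image_p, image_size, { 0 }, UINT32_MAX };
  int32_t stack[8];
  size_t sp = 0;
  uint32_t pc = 0;

  while (1)
  {
    uint8_t opcode = demo_fetch (&pager, pc++);

    switch (opcode)
    {
      case DEMO_OP_PUSH:
        assert (sp < 8);
        /* The operand may live on a different page than the opcode. */
        stack[sp++] = demo_fetch (&pager, pc++);
        break;
      case DEMO_OP_ADD:
        assert (sp >= 2);
        stack[sp - 2] += stack[sp - 1];
        sp--;
        break;
      case DEMO_OP_RET:
        assert (sp >= 1);
        return stack[sp - 1];
      default:
        assert (0 && "unknown opcode");
        return -1;
    }
  }
}
```

For example, the nine-byte program `{PUSH 2, PUSH 3, ADD, PUSH 5, ADD, RET}` spans three pages with this page size, and two of its `PUSH` instructions have their operand on the following page.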

Unfortunately, because of the issues mentioned above, I see no clear path to paging the literal tables along with the bytecode. It is quite likely that I have overlooked some things in JerryScript while trying to wrap my head around this. I would appreciate any feedback on the approach and the roadblocks described here, as well as thoughts from the JerryScript team on a feature of this nature.

Thank you

