Supporting VM-required Object Semantics / Spaces #53

Open
qinsoon opened this issue Mar 26, 2020 · 7 comments
Assignees
Labels
A-space Area: Space/PageResource C-enhancement Category: Enhancement F-question Call For Participation: Unanswered question (need more information) P-normal Priority: Normal.

Comments

@qinsoon
Member

qinsoon commented Mar 26, 2020

In GitLab by @steveblackburn on Jan 22, 2020, 01:41

Spaces are principally an algorithm-specific (and thus VM-neutral) concept. Therefore the abstraction of spaces should be fully contained within MMTk.

On the other hand, VMs may demand special semantics which imply special spaces (and these demands are orthogonal to the chosen GC algorithm). For example:

  • Jikes RVM has a build-time-constructed 'boot image' containing objects that, although immortal, form part of the regular heap.
  • V8 demands a code space that offers particular semantics (principally that it resides on pages with only execute and read permissions).
  • V8 demands a 'read only' space into which immutable objects can be written, and once a page is 'sealed' it becomes read only (OS protected). This is primarily for deduplication among isolates or processes. Graal(?) generalizes this further with a copy-on-write space.
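
To make the 'sealed' read-only semantics above concrete, here is a minimal sketch that models a page which accepts writes until it is sealed. Everything in it (`Page`, `PageError`, `seal`) is a hypothetical name for illustration, and the sealing is only simulated with a flag; a real implementation would presumably change the OS page protection instead.

```rust
/// Why a write to a page was refused (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum PageError {
    /// The page has been sealed and no longer accepts writes.
    Sealed,
    /// The write would not fit in the remaining capacity.
    OutOfSpace,
}

/// A simulated page: writable until sealed, readable at any time.
pub struct Page {
    data: Vec<u8>,
    capacity: usize,
    sealed: bool,
}

impl Page {
    pub fn new(capacity: usize) -> Self {
        Page { data: Vec::with_capacity(capacity), capacity, sealed: false }
    }

    /// Appends `bytes` and returns the offset at which they were stored.
    pub fn write(&mut self, bytes: &[u8]) -> Result<usize, PageError> {
        if self.sealed {
            return Err(PageError::Sealed);
        }
        if self.data.len() + bytes.len() > self.capacity {
            return Err(PageError::OutOfSpace);
        }
        let offset = self.data.len();
        self.data.extend_from_slice(bytes);
        Ok(offset)
    }

    /// Assumption: this only flips a flag; an OS-backed version would
    /// make the page read-only at this point.
    pub fn seal(&mut self) {
        self.sealed = true;
    }

    /// Reads are allowed before and after sealing.
    pub fn read(&self, offset: usize, len: usize) -> Option<&[u8]> {
        let end = offset.checked_add(len)?;
        self.data.get(offset..end)
    }
}
```

The point of the sketch is only that 'read only' here is a two-phase lifecycle (populate, then seal), which is a different demand from a space that is immutable from the start.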

These demands raise a few issues:

  • Ensuring that these requirements are supported independent of GC algorithm (the VM is explicitly unsupported for those algorithms for which this can't be done).
  • Ensuring that these requirements are supported without breaking abstractions (i.e. without exposing the concept of spaces to the VM).
  • Ensuring that the support is as general as possible (e.g. that a VM could provide a boot image at a specific address)

Possible solutions:

  • One approach to the Jikes RVM boot image is to keep the concept out of MMTk and instead simply treat the boot image as part of the (opaque) VM root set (and ask the VM to enumerate the pointers when needed). Jikes RVM already supports this, which suggests that we should seriously consider getting rid of all references to the boot image from within MMTk and switch to this just being part of the VM roots. The only issue here is how we treat write barriers. The boot image is different to other VM-specific sources of roots. For example, threads are highly mutated, statics are somewhat highly mutated and the heap (including the boot image) is even less frequently mutated, which influences the way we treat them. We probably would treat the boot image like an older generation of the heap rather than treating it like the stacks (we can't afford to walk the boot image every minor GC).
  • An approach to the other requirements is that we allow allocations to come with particular semantics that may include requirements (such as read only or code) and / or hints (such as 'mature', 'young', 'shared').
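
As a rough sketch of the second approach, allocation semantics could be split into requirements (which must be honored or the allocation is rejected) and hints (which may be ignored). All names below (`Requirement`, `Hint`, `AllocationSemantics`, `PlanCapabilities`, `SpaceId`, `select_space`) are assumptions made up for this illustration, not existing MMTk API.

```rust
/// Hard requirements: must be honored, otherwise the allocation is rejected.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Requirement {
    /// Ordinary heap memory with no special demands.
    Default,
    /// Memory that will hold executable code.
    Code,
    /// Memory that becomes read-only once sealed.
    ReadOnly,
}

/// Soft hints: the memory manager is free to ignore these.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Hint {
    Young,
    Mature,
    Shared,
}

/// What the VM passes with each allocation request.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AllocationSemantics {
    pub requirement: Requirement,
    pub hint: Option<Hint>,
}

/// Internal to the memory manager; the VM never sees this type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SpaceId {
    Nursery,
    Mature,
    Code,
    ReadOnly,
}

/// Hypothetical summary of what the active plan can provide.
pub struct PlanCapabilities {
    pub generational: bool,
    pub has_code_space: bool,
    pub has_read_only_space: bool,
}

/// Maps VM-visible semantics to an internal space, or fails if a
/// requirement cannot be met by this plan.
pub fn select_space(
    caps: &PlanCapabilities,
    sem: AllocationSemantics,
) -> Result<SpaceId, String> {
    match sem.requirement {
        Requirement::Code if caps.has_code_space => Ok(SpaceId::Code),
        Requirement::Code => Err("plan cannot provide a code space".to_string()),
        Requirement::ReadOnly if caps.has_read_only_space => Ok(SpaceId::ReadOnly),
        Requirement::ReadOnly => Err("plan cannot provide a read-only space".to_string()),
        Requirement::Default => match sem.hint {
            // Hints only steer placement; they never cause a failure.
            Some(Hint::Mature) | Some(Hint::Shared) => Ok(SpaceId::Mature),
            _ if caps.generational => Ok(SpaceId::Nursery),
            _ => Ok(SpaceId::Mature),
        },
    }
}
```

With this split, the VM only ever names semantics; a plan that cannot meet a requirement reports an error instead of silently downgrading, which matches the idea that the VM is explicitly unsupported for such algorithms.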
@qinsoon qinsoon added the C-enhancement Category: Enhancement label Mar 26, 2020
@qinsoon
Member Author

qinsoon commented Mar 26, 2020

In GitLab by @steveblackburn on Jan 22, 2020, 01:41

changed the description

@qinsoon
Member Author

qinsoon commented Mar 26, 2020

In GitLab by @steveblackburn on Jan 22, 2020, 01:42

changed the description

@qinsoon
Member Author

qinsoon commented Mar 26, 2020

In GitLab by @qinsoon on Jan 22, 2020, 18:57

Though spaces are VM-neutral, MMTk exposes Allocator (https://gitlab.anu.edu.au/mmtk/mmtk/blob/master/src/plan/plan.rs#L239), and allocators are internally mapped to spaces. If we need to attach some semantics to each allocation, we can encode those semantics in Allocator. We probably also need some options so that MMTk knows which spaces need to be created and which allocators are available.
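
A minimal sketch of what I mean, assuming a made-up `AllocatorKind` selector (a stand-in for the exposed `Allocator`, not a copy of it) and hypothetical `SpaceOptions` that tell MMTk which optional spaces to create:

```rust
use std::collections::HashMap;

/// Hypothetical VM-visible selector; the variants are illustrative only.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum AllocatorKind {
    Default,
    Immortal,
    Code,
    ReadOnly,
}

/// Internal space kinds; never exposed to the VM.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SpaceKind {
    Heap,
    Immortal,
    Code,
    ReadOnly,
}

/// Hypothetical options the VM sets so the right spaces are created.
#[derive(Default)]
pub struct SpaceOptions {
    pub code_space: bool,
    pub read_only_space: bool,
}

/// The internal allocator-to-space mapping.
pub struct AllocatorTable {
    map: HashMap<AllocatorKind, SpaceKind>,
}

impl AllocatorTable {
    /// Builds the mapping; optional spaces exist only when requested.
    pub fn from_options(opts: &SpaceOptions) -> Self {
        let mut map = HashMap::new();
        map.insert(AllocatorKind::Default, SpaceKind::Heap);
        map.insert(AllocatorKind::Immortal, SpaceKind::Immortal);
        if opts.code_space {
            map.insert(AllocatorKind::Code, SpaceKind::Code);
        }
        if opts.read_only_space {
            map.insert(AllocatorKind::ReadOnly, SpaceKind::ReadOnly);
        }
        AllocatorTable { map }
    }

    /// Lets the VM ask whether an allocator is available.
    pub fn is_available(&self, kind: AllocatorKind) -> bool {
        self.map.contains_key(&kind)
    }

    /// Internal lookup from allocator to backing space.
    pub fn space_for(&self, kind: AllocatorKind) -> Option<SpaceKind> {
        self.map.get(&kind).copied()
    }
}
```

The VM would only deal with `AllocatorKind` and `is_available`; `space_for` stays inside MMTk, so the space abstraction is not leaked.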

@qinsoon
Member Author

qinsoon commented Mar 26, 2020

In GitLab by @caizixian on Jan 22, 2020, 19:08

@qinsoon just a tip that it's better to reference a line via commit hash. Otherwise, the line you are referring to actually moves if there is a change.

https://gitlab.anu.edu.au/mmtk/mmtk/blob/9ae6429d953ba79f93d79ed722f40179628cf8d9/src/plan/plan.rs#L239

@qinsoon
Member Author

qinsoon commented Mar 26, 2020

In GitLab by @qinsoon on Feb 7, 2020, 22:06

More insights from Steve:

  • One dimension of what we're discussing here is heap regions that exist independent of the GC algorithm. This is an important way to view this. Spaces X, Y, Z must exist, regardless of which GC algorithm I implement (i.e. even if I have NoGC, I still need X, Y, & Z and some minimal set of semantics associated with them).
  • The above statement is generally runtime-specific. So for all GC algorithms, when using runtime R, you need spaces X, Y & Z, but when using runtime S, you need spaces A & B (independent of GC algorithm).
  • We need to tease out precisely what semantics these spaces are supposed to support. For example, in Jikes RVM, there's an immortal heap: one part of the boot image (data) is essentially just a heap area that is never collected (but is a source of roots). There's also a code space. The same is basically true of V8. However, the semantics are often very muddled. I spent an hour or more with V8 people trying to tease out what semantics they really cared about (they were not really sure and had conflated lots of ideas without being clear about exactly what they needed semantically).
  • It may be the case that the space is opaque to MMTk (i.e. just a source of 'mysterious' roots, the way the stack is: the runtime simply tells us "I looked in here and found this pile of roots"). It may also be pre-initialized (as in Jikes RVM's boot image), which has some interesting consequences for object initialization, which needs to be done when the image is created. There's also the question of whether the space is enumerated (like stacks and statics are) or traced.

Furthermore we need to introduce the mechanics that allow the runtime to specify the set of such spaces that it must always have, regardless of GC algorithm.
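
One possible shape for those mechanics, as a sketch only: the runtime declares the spaces it always needs, and the heap layout is the union of those and whatever the plan adds. The trait `Runtime`, the types `RequiredSpace` and `Discovery`, the `JikesLike` example and its space names and properties are all hypothetical placeholders, not a statement of what Jikes RVM or MMTk actually do.

```rust
/// How references inside a runtime-required space are found.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Discovery {
    /// The runtime reports the roots it found (like stacks and statics).
    Enumerated,
    /// The memory manager traces through the space itself.
    Traced,
}

/// A space the runtime needs regardless of the GC algorithm.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct RequiredSpace {
    pub name: &'static str,
    /// Whether objects in this space are ever collected.
    pub collected: bool,
    pub discovery: Discovery,
}

/// Hypothetical hook through which a runtime declares its spaces.
pub trait Runtime {
    fn required_spaces(&self) -> Vec<RequiredSpace>;
}

/// Illustrative runtime with an immortal boot-image data area and a
/// code space; the chosen properties are placeholders.
pub struct JikesLike;

impl Runtime for JikesLike {
    fn required_spaces(&self) -> Vec<RequiredSpace> {
        vec![
            RequiredSpace {
                name: "boot-image-data",
                collected: false,
                discovery: Discovery::Enumerated,
            },
            RequiredSpace {
                name: "code",
                collected: false,
                discovery: Discovery::Traced,
            },
        ]
    }
}

/// The layout is the runtime's spaces followed by the plan's own,
/// so even a plan with no spaces of its own keeps the runtime's.
pub fn heap_layout(runtime: &dyn Runtime, plan_spaces: &[&'static str]) -> Vec<String> {
    let mut layout: Vec<String> = runtime
        .required_spaces()
        .iter()
        .map(|space| space.name.to_string())
        .collect();
    layout.extend(plan_spaces.iter().map(|name| name.to_string()));
    layout
}
```

Passing an empty `plan_spaces` models the NoGC case from the first bullet: the runtime-required spaces are still there.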

@qinsoon
Member Author

qinsoon commented Jun 5, 2020

#81 and #86 have done some work to separate gc-invariant spaces from specific plans.

It is unclear to me what is left to be done for this issue. Should we close this issue and create separate issues for implementing proper code space and read only space?

@qinsoon qinsoon added A-space Area: Space/PageResource F-question Call For Participation: Unanswered question (need more information) labels Jun 5, 2020
@qinsoon
Member Author

qinsoon commented Jun 5, 2020

Also, we should not give all spaces execution permission. #7

@qinsoon qinsoon added the P-normal Priority: Normal. label Jan 15, 2024