Object Pinning

This issue is part of https://github.com/microvm/microvm-meta/issues/24
# TL;DR

This proposal gives meaning to the "object pinning" operation.

The meaning is: the `PIN` operation takes a `ref<T>` or `iref<T>`, pins the object for the current thread, and returns a `ptr<T>` (pointer to `T`). This pointer **can be used to** access the memory location referred to by the argument until every thread that has pinned the object has unpinned it with an `UNPIN` operation.

Note: This has very few implications for the Mu implementation. It only says the pointer can be _used_ in the expected way, but does not say anything about the storage of the actual object. (The micro VM can cheat!)
## Operations

In the following two instructions, `R` can be either `ref` or `iref`.
- `PIN(%r: R<T>) -> ptr<T>`: Add the object referred to by `%r` to the _pinning set_ of the current thread, and return a pointer.
- `UNPIN(%r: R<T>) -> void`: Remove the object referred to by `%r` from the _pinning set_ of the current thread.

`PIN` and `UNPIN` do not pin any object if `%r` refers to a memory location not in any heap object. If `%r` is NULL, `PIN` returns a NULL pointer. If `%r` is an `iref` and refers to a stack cell or a global cell, `PIN` returns a pointer to it.
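
The rules above can be illustrated with a small model. This is only a sketch under stated assumptions: `PinningModel`, the `(kind, address)` tuple standing in for a `ref`/`iref`, and the use of a plain integer as the returned pointer are all hypothetical and not part of Mu. The text does not say whether repeated `PIN`s of the same object nest, so the sketch treats the pinning set as a plain set.

```python
class PinningModel:
    """Toy model of PIN/UNPIN for one thread (all names are hypothetical)."""

    def __init__(self):
        # Addresses of heap objects currently pinned by this thread.
        self.pinning_set = set()

    def pin(self, ref):
        """Model of PIN: `ref` is None (NULL) or a (kind, address) tuple."""
        if ref is None:
            return None                      # NULL in, NULL pointer out
        kind, address = ref
        if kind == "heap":
            self.pinning_set.add(address)    # only heap objects are pinned
        return address                       # stack/global cells just yield a pointer

    def unpin(self, ref):
        """Model of UNPIN: a no-op for NULL and for non-heap locations."""
        if ref is None:
            return
        kind, address = ref
        if kind == "heap":
            self.pinning_set.discard(address)


thread = PinningModel()
heap_ptr = thread.pin(("heap", 0x1000))
stack_ptr = thread.pin(("stack", 0x7000))
null_ptr = thread.pin(None)
```

In this model, only the heap reference ends up in the pinning set, while the stack cell and NULL still produce a pointer (or a NULL pointer) without pinning anything.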

> NOTE: All memory locations in Mu, not just heap objects, are referred to by `iref`. In order to let native code work with the Mu memory, pointers always have to be generated. That is why `PIN` and `UNPIN` trivially work with non-heap memory locations as well. It may be impossible at compile time to know whether an `iref` refers to the heap. For example, there may be a function taking an `iref` as a parameter.
## The guarantees

The pointer returned by `PIN` has the following guarantees:
- The pointer is usable as long as the object pinned by `PIN` is in the _pinning set_ of **any** thread.
- The pointer points to a region of address which can be used to access the memory location of the parameter of `PIN` (i.e. `%r`). Specifically:
  - The object layout conforms to the platform's Mu Application Binary Interface (yet to be defined).
  - The native code can perform LOAD, STORE, CMPXCHG, ATOMICRMW, FENCE operations on those locations and they shall conform to the Mu memory model. However, _which native instruction/operator/function performs which operation in the Mu memory model_ is implementation defined.

> One memory order can be implemented in multiple different ways. For example, on x86, SEQ_CST can be implemented as (load: MOV, store: XCHG), but also as (load: LOCK XADD(0), store: MOV). It is up to the implementation to guarantee that the Mu memory operations (Mu IR instructions) are compatible with their native counterparts (C11 `<stdatomic.h>` or C++11 `<atomic>`). For example, one particular implementation may let the `atomic_load(ptr, memory_order_xxxxxx)` function in glibc (but not `atomic<T>.load(xxxx)` in libc++, provided by LLVM) perform the LOAD operation in the xxxxxx memory order in the Mu memory model.
## Issues about multi-threading

It is possible for two threads to pin the same object. For example, there are two threads T1 and T2 and object O. The execution appears like the following sequence:
1. T1: pin O
2. T2: pin O
3. T1: do something with O
4. T1: unpin O
5. T2: do something with O
6. T2: unpin O

By step 5, T1 has already performed an unpin operation. If an object pinned by one thread could be unpinned by another thread, there would be a problem: O would no longer be pinned, and it would be an error for T2 to do anything with the pointer.
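
The hazard can be replayed with a toy model in which pinning is a single shared state with no per-thread tracking. This is a sketch only: `SharedPinSet` is a hypothetical name, and the "threads" are simulated by sequential calls in the order of the numbered steps.

```python
class SharedPinSet:
    """Hypothetical global pinning state: any UNPIN unpins for everyone."""

    def __init__(self):
        self.pinned = set()

    def pin(self, obj):
        self.pinned.add(obj)

    def unpin(self, obj):
        self.pinned.discard(obj)


shared = SharedPinSet()
shared.pin("O")      # step 1: T1 pins O
shared.pin("O")      # step 2: T2 pins O
shared.unpin("O")    # step 4: T1 unpins O

# Step 5: T2 still holds a pointer, but O is no longer protected.
still_pinned_for_t2 = "O" in shared.pinned
```

Here `still_pinned_for_t2` ends up `False`, which is exactly the situation in which T2's pointer may dangle.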

It is possible to require a thread to acquire a lock or perform reference counting before pinning/unpinning, but this would be inefficient because it inevitably involves expensive atomic operations, and one reason for using the FFI is performance.

Therefore, we let different threads pin/unpin an object **locally**: `PIN` means pinning an object **for the current thread**. An object is pinned if and only if at least one thread is pinning it.

Implementation-wise, this can be done by keeping a thread-local buffer which records all objects the current thread is pinning. When GC happens the marker looks at the thread-local buffers to find all objects pinned by any thread. In this way, mutators do not need atomic memory operations, but the GC needs to look at all threads.
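
A minimal sketch of this scheme follows. It makes several assumptions: `PinRegistry` and its methods are hypothetical names, a Python `set` stands in for the thread-local buffer, and `pinned_objects` models what a stop-the-world collector would compute while mutators are paused. The lock is taken only once per thread, when its buffer is first registered; `pin` and `unpin` themselves touch only thread-local state.

```python
import threading


class PinRegistry:
    """Per-thread pinning buffers plus a GC-side scan (hypothetical names)."""

    def __init__(self):
        self._local = threading.local()
        self._all_buffers = []                   # one buffer per mutator thread
        self._register_lock = threading.Lock()   # used only on first pin per thread

    def _my_buffer(self):
        buf = getattr(self._local, "pins", None)
        if buf is None:
            buf = set()
            self._local.pins = buf
            with self._register_lock:
                self._all_buffers.append(buf)
        return buf

    def pin(self, obj):
        self._my_buffer().add(obj)               # thread-local, no synchronisation

    def unpin(self, obj):
        self._my_buffer().discard(obj)

    def pinned_objects(self):
        """GC view: union of all buffers. Assumes mutators are not running."""
        result = set()
        for buf in self._all_buffers:
            result |= buf
        return result


registry = PinRegistry()
t2_pinned = threading.Event()
t1_unpinned = threading.Event()
observed = {}


def t2_body():
    registry.pin("O")                            # step 2: T2 pins O
    t2_pinned.set()
    t1_unpinned.wait()                           # wait until T1 has unpinned
    observed["after_t1_unpin"] = registry.pinned_objects()  # step 5
    registry.unpin("O")                          # step 6: T2 unpins O


registry.pin("O")                                # step 1: T1 (main thread) pins O
t2 = threading.Thread(target=t2_body)
t2.start()
t2_pinned.wait()
registry.unpin("O")                              # step 4: T1 unpins O
t1_unpinned.set()
t2.join()
observed["at_end"] = registry.pinned_objects()
```

With per-thread buffers, O is still reported as pinned after T1's unpin because T2's buffer still contains it, and it stops being pinned only after T2 unpins as well.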

This "thread-local pinning" mechanism cannot be implemented by the client if the `PIN` instruction in Mu is racy. Giving the client access to the thread-local buffer is no different from the thread-local `PIN` instruction. So this thread-local pinning mechanism does not violate the principle of _minimalism_ of Mu: it cannot be implemented efficiently outside Mu.

-----------------------------CUT HERE. BELOW ARE LEGACY TEXTS-------------------------------
# Abstract

I propose defining two kinds of memory spaces: _real space_, which models the memory used by C or native programs, and _imaginary space_ for that of the µVM. _Object pinning_ (or _realising_) is an operation that temporarily makes a memory location in the imaginary space real so that it can be accessed from C programs.
# Proposal
## Concepts
- **memory**: self-explanatory, but... I don't trust "common sense".
- **memory location**: a region of data storage. Holds values.
- **virtual memory space**: the abstraction provided by the OS and the architecture. It has the following properties:
  - At any moment, it is a mapping from addresses (a subset of integers) to byte values. (I don't like this property. For any multi-threaded program, different threads may not see the same value, and Albert Einstein does not like "the same time".)
  - It can be accessed (read/written/atomicRMW) in various granularities (sizes). The atomicity and visibility between threads follows a certain memory model (the one defined by the architecture, OS and related programming languages).
  - It may be shared between processes and threads. Thus it can be accessed by things not in the µVM.
- **real memory**: memory in which memory locations satisfy the following properties:
  - (Does not need to have "addresses", that is, a memory location can be a variable, not numerical value.)
  - Allows memory accesses (load/store/atomicRMW).
  - For every memory location L, there is a unique memory location L' in the virtual memory space. (This disallows replication.) This L' does not change during the lifetime of L. (This disallows moving.) Accesses to the two locations are equivalent.
  - For any two memory locations L1 and L2, their corresponding memory locations in the virtual memory space do not overlap. That is, their accesses are independent. (This disallows aliasing.)
  - For an array in a real memory, its corresponding memory location in the virtual memory space is contiguous. (This disallows implementing arrays as multiple disjoint sub-arrays.)
- **imaginary memory**: memory in which memory locations satisfy the following properties:
  - (Does not need to have "addresses", that is, a memory location can be a variable, not numerical value.)
  - Allows memory accesses (load/store/atomicRMW).

NOTE: As can be seen, "real memory" is trivially "imaginary memory".
- **realising**: temporarily letting a memory location in an imaginary memory have the property of real memory. (This is colloquially called **object pinning**, but it is more than "not moving").
- `iref<T>` (**internal reference**): refer to a memory location in real or imaginary memory.
- `ptr<T>` (**pointer**): an address. May or may not correspond to a memory location in the real memory.
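
The "no moving" property of real memory, and what realising adds on top of imaginary memory, can be sketched with a toy moving heap. Everything here is an assumption made for illustration: `ToyHeap`, its integer addresses, and a `collect` method that relocates every unpinned object are hypothetical, and replication and non-contiguous arrays are not modelled.

```python
class ToyHeap:
    """Toy imaginary memory: a moving GC may relocate anything not realised."""

    def __init__(self):
        self.address_of = {}      # object name -> current address
        self.realised = set()     # objects temporarily given real-memory properties
        self._next_free = 0x1000

    def allocate(self, name):
        self.address_of[name] = self._next_free
        self._next_free += 0x10

    def realise(self, name):
        """Pin the object and hand out its address as a raw pointer."""
        self.realised.add(name)
        return self.address_of[name]

    def unrealise(self, name):
        self.realised.discard(name)

    def collect(self):
        """Pretend to be a moving GC: relocate every object that is not realised."""
        for name in self.address_of:
            if name not in self.realised:
                self.address_of[name] = self._next_free
                self._next_free += 0x10


heap = ToyHeap()
heap.allocate("a")
heap.allocate("b")
ptr_a = heap.realise("a")          # "a" now behaves like real memory
old_b = heap.address_of["b"]
heap.collect()
a_stayed = heap.address_of["a"] == ptr_a
b_moved = heap.address_of["b"] != old_b
```

While "a" is realised, the pointer obtained for it stays valid across a collection, whereas any address remembered for "b" becomes stale.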
## In the µVM
- All memory in the µVM (heap, stack and global) is imaginary memory.
- Introduce the pointer type `ptr<T>`. It is just a raw address, but is typed.
- Introduce the `PTRCAST` instruction which can freely cast `ptr<T>` to or from `int<n>` if n is the appropriate size.
- `LOAD`, `STORE`, `CMPXCHG`, `ATOMICRMW` now work with both `iref<T>` and `ptr<T>`.
- The `CCALL` can call a C function.
  - Plan A: The callee can have type `int<n>`. It is just an integer address.
  - Plan B: Introduce a `c_func<sig>` type. It is castable to/from `int<n>`. NOTE: `func<sig>` refers to µVM functions.
## Pinning
- "Pinning a memory location" means "realising" it, granting it the property of real memory.
- Implicit pinning: Any `iref<T>` values used as arguments of `CCALL` are implicitly pinned during this call.
- Explicit pinning:
  - Plan A: Introduce `REALISE` and `UNREALISE` instructions. Do as it means. The `REALISE` instruction returns a `ptr<T>` value.
  - Plan B: `REALISE` and `UNREALISE` have counting semantics. An object is "unpinned" if its pin-count reduces to 0.
  - Plan C: (the tracing approach) Introduce a type `pinner_iref<T>` which actually holds an `iref<T>` (a [marked storage type](https://github.com/microvm/microvm-spec/wiki/type-system#types-and-type-constructors) of `iref`). `pinner_iref<T>` must be in the memory (not SSA, just like `weakref<T>` cannot be SSA variable). If such a reference is reachable, the referent is pinned. After pinning, the pointer can be obtained via a `GETPTR` instruction. (Plan C does not address replication and non-contiguous arrays)
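
As an illustration of the counting semantics in Plan B, here is a minimal sketch. `CountingPinner` and its method names are hypothetical, and raising an error on an unbalanced `UNREALISE` is a choice made for this example only; the proposal does not say what should happen in that case.

```python
from collections import Counter


class CountingPinner:
    """Plan B sketch: an object stays pinned while its pin-count is above 0."""

    def __init__(self):
        self.counts = Counter()

    def realise(self, obj):
        self.counts[obj] += 1

    def unrealise(self, obj):
        if self.counts[obj] == 0:
            # Assumption for this sketch: unbalanced UNREALISE is an error.
            raise ValueError(f"{obj!r} is not pinned")
        self.counts[obj] -= 1
        if self.counts[obj] == 0:
            del self.counts[obj]      # count reached 0: object is unpinned

    def is_pinned(self, obj):
        return self.counts[obj] > 0


pinner = CountingPinner()
pinner.realise("O")
pinner.realise("O")
pinner.unrealise("O")
pinned_after_one_unrealise = pinner.is_pinned("O")
pinner.unrealise("O")
pinned_after_two_unrealises = pinner.is_pinned("O")
```

Under Plan A, by contrast, the first `UNREALISE` would already unpin the object. If the count were shared between threads, updating it would need the atomic operations that the thread-local design above tries to avoid.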
## Open questions
1. Do we assume stacks and globals as "real" by default?
2. If stacks can move, how do we efficiently realise (pin) it?
3. Do we prevent non-contiguous arrays?
4. How do we implement temporary "un-replicating"?
# Background: Inter-language interaction

Currently the only way for the µVM to interact with the "outside world" is via traps handled by the client. This interface is called **µVM-client Interface** or **The API**.

For performance concerns, we should introduce a more direct and low-level interface to the "outside world". This new interface is called **foreign function interface** or **FFI**.
## Two worlds

**Imaginary memory**: In a world with advanced garbage collectors, the memory is managed by the GC.
- A high-level memory location (in object or not, for example, if a VM implements movable stacks) may be moved from one address to another (address is the operating system or architecture's virtual address space).
- A high-level memory location can be replicated (a single high-level object/field corresponds to multiple system memory addresses). This may serve different purposes, for example, concurrent GC, security, etc.
- A high-level memory data structure may not have the same structure as the system-level memory. For example, a high-level array may be implemented as segments of (non-contiguous) arrays.
- Programs written in C can only access this kind of memory assisted by the memory manager (GC).
- Example: Java, µVM.

**Real memory**: In a world closely interacting with C, the GC is somewhat naive, or there is no GC at all.
- High-level memory locations (as seen by the programming languages (like C) or VMs (like CPython)) do not move and are not replicated. Each high-level memory location (in object or not) corresponds to exactly one OS/architecture-level address as long as it is not deallocated.
- Programs written in C can directly access the memory as long as it has a raw pointer to the memory location.
- Example:
  - Any non-GC language: C, C++, Rust, ...
  - Any language/impl that tightly interacts with C: CPython, Lua (partially)
## Examples
- The µVM uses "imaginary memory". It does not assume any low-level memory layout except some high-level rules.
- Java exclusively uses "imaginary memory". All Java memory accesses through JNI must go through handles. It is even a problem to expose an array to the C language: 1) the VM must support object pinning, and 2) the VM must implement arrays contiguously.
- CPython uses "real memory". C programs hold any Python objects by raw pointers. A C module can customise its own Python object layout to include its own private data.
- Lua uses "real memory". "Userdata" (a chunk of memory allocated by Lua, but is used by the user, like a managed "malloc") is a Lua object. `lua_touserdata` gets a raw pointer to such a chunk of memory and does not need pinning. `lua_topointer` gets a raw pointer to any Lua object (for debugging purpose).
- SpiderMonkey uses something hybrid. Its GC can move objects, but not within a "request" (a delimited region in C programs where GC "must not happen"). In a "request" (probably everything in C that interacts with SpiderMonkey), the C program can use raw pointers to refer to JS objects, though their structures are opaque, and it is recommended to use `JSHandleValue` to mark it as a GC root.

