
Remove cast_ref_to_mut related to FreeList, PageResource and VMMap #853

Open · 1 of 3 tasks
wks opened this issue Jun 8, 2023 · 0 comments

Labels: A-meta Area: Meta issues for the repository · P-normal Priority: Normal.

Comments

wks (Collaborator) commented Jun 8, 2023

parent: #850

Task list:

  • Refactor FreeList and/or use FreeList in a safe way
  • Fix the unsafe interaction between FreeListPageResource and VMMap.
  • Fix the unsafe operations related to VMMap, or introduce a more graceful way to allocate and manage chunks.

Problem

FreeList

FreeList is not thread-safe

The FreeList implementation is ported from JikesRVM. In JikesRVM, the GenericFreeList base class provides a free list implementation based on an abstract array of int (i32 in Rust), but is agnostic to how the array is allocated. It declares the abstract methods getEntry and setEntry, and concrete implementations, including RawMemoryFreeList and IntArrayFreeList, implement those methods to access the underlying array. In particular, IntArrayFreeList backs the free list with an int[].

Primitive FreeList operations are not thread-safe. GenericFreeList often separately updates the "low" and "high" bit-fields of each int (32-bit integer). Complex operations (such as splitting a region or merging two adjacent regions) take multiple updates to complete, and none of them use atomic read-modify-write operations.
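To illustrate, here is a minimal sketch of why such bit-field updates are racy. The field layout (16-bit halves, the mask and shift values) is hypothetical, not GenericFreeList's actual layout; the point is that each update is a plain, non-atomic read-modify-write.

```rust
// Hypothetical layout: each freelist entry packs two 16-bit fields
// into one i32, analogous to GenericFreeList's "low" and "high" bit-fields.
const LOW_MASK: i32 = 0xFFFF;
const HIGH_SHIFT: u32 = 16;

fn set_low(entry: &mut i32, value: i32) {
    // Plain read-modify-write: if two threads updated the two halves of
    // the same entry concurrently, one update could be lost entirely.
    *entry = (*entry & !LOW_MASK) | (value & LOW_MASK);
}

fn set_high(entry: &mut i32, value: i32) {
    *entry = (*entry & LOW_MASK) | (value << HIGH_SHIFT);
}

fn main() {
    let mut entry = 0;
    set_low(&mut entry, 42);
    set_high(&mut entry, 7);
    assert_eq!(entry & LOW_MASK, 42);
    assert_eq!(entry >> HIGH_SHIFT, 7);
}
```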

In JikesRVM, FreeListPageResource.allocPages holds the PageResource.lock, and FreeListPageResource.freePage is only ever executed by one thread. This suggests that the FreeList itself is not thread-safe, but is used with external synchronisation.

Solution: FreeList should be wrapped in a Mutex. In fact, CommonFreeListPageResource.free_list is already always accessed while holding a mutex (FreeListPageResource::sync). This will eliminate the need to call table_mut, because a thread always gets &mut FreeList when it acquires the Mutex.
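A minimal sketch of this locking scheme, assuming a simplified stand-in for the FreeList trait (the real trait in mmtk-core has more methods):

```rust
use std::sync::Mutex;

// Simplified stand-in for the real FreeList trait.
trait FreeList: Send {
    fn alloc(&mut self, units: i32) -> i32;
}

struct FreeListPageResourceSync {
    free_list: Box<dyn FreeList>,
    // ... other fields guarded by the same lock
}

struct FreeListPageResource {
    sync: Mutex<FreeListPageResourceSync>,
}

impl FreeListPageResource {
    fn alloc_pages(&self, pages: i32) -> i32 {
        // Acquiring the lock yields &mut FreeListPageResourceSync, and
        // hence &mut dyn FreeList, with no table_mut and no casts.
        let mut sync = self.sync.lock().unwrap();
        sync.free_list.alloc(pages)
    }
}
```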

PR #925 (closed) moves CommonFreeListPageResource into sync.

Two-level FreeList structure

The IntArrayFreeList has a two-level design in which children share the same underlying array with the parent. The parent free list has multiple "heads" (1 + NUM_OF_CHILDREN): the parent uses one head, and each child uses one of the others.

If I read the JikesRVM source code correctly, the "parent head" is owned by the global Map32 instance and protected by Map32.lock. Each "child head" is owned by a FreeListPageResource and protected by PageResource.lock. A page resource may acquire pages from the global Map32 instance while holding Map32.lock. This polls FreeList entries off the doubly-linked list of the "parent head", and then inserts those entries into the page resource's "child head" while holding PageResource.lock. In this way, the single global array can contain multiple doubly-linked lists, one starting at each head, and different doubly-linked lists have no overlapping entries. Therefore, it is safe for two different threads to access two different doubly-linked lists using two different heads, even though their entries are allocated in one single contiguous global array.

Solution: Use unsafe to access the underlying int array if we know different heads lead to non-overlapping linked lists. The implementer of IntArrayFreeList is responsible for guaranteeing the safety.
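A sketch of this invariant with simplified types; the real IntArrayFreeList differs, and the SAFETY comments state obligations on the implementer rather than anything the compiler checks:

```rust
use std::cell::UnsafeCell;
use std::sync::Arc;

// One shared i32 array, multiple heads. The implementer must guarantee
// that the linked lists reachable from different heads never overlap.
struct SharedTable {
    entries: UnsafeCell<Vec<i32>>,
}

// SAFETY: sound only because threads using different heads only ever
// touch the non-overlapping entries reachable from their own head.
unsafe impl Sync for SharedTable {}

struct IntArrayFreeList {
    table: Arc<SharedTable>,
    head: i32, // distinct per parent/child list
}

impl IntArrayFreeList {
    fn get_entry(&self, index: usize) -> i32 {
        // SAFETY: `index` must be an entry reachable from self.head.
        unsafe { (*self.table.entries.get())[index] }
    }

    fn set_entry(&self, index: usize, value: i32) {
        // SAFETY: same ownership invariant as get_entry.
        unsafe { (*self.table.entries.get())[index] = value }
    }
}
```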

Alternative solution: Abandon the current FreeList design, and design an Immix-like page allocator for better locality.

PageResource

Determining the start address of spaces that include RawMemoryFreeList

(Update: Fixed in #953)

On 64-bit systems, page resources use RawMemoryFreeList. A RawMemoryFreeList occupies a region of addresses at the start of the space it manages, so the region of memory available for the space to allocate objects starts after the region the RawMemoryFreeList occupies. The size of the RawMemoryFreeList can be determined when it is created.

However, the current implementation is over-engineered: it postpones the calculation of the start of the space (the address after the RawMemoryFreeList) until after all spaces are created. This is unnecessary. The current implementation records the addresses of FreeList instances in Map64::{fl_page_resources,fl_map} and visits them later, which involves unsafe and cast_ref_to_mut.

Solution: The address can be computed eagerly if the FreeList implementation is RawMemoryFreeList. Depending on the VMMap, the actual FreeList implementation may be either RawMemoryFreeList or IntArrayFreeList, so we can let FreeList report an Option<Address>, allowing it to report its limit eagerly, right after creation, whenever possible.

PR #925 (closed) uses this approach to eliminate those global maps and unsafe statements.

PR #953 (merged) lets create_freelist return a CreateFreeListResult { free_list, space_displacement } where the space_displacement reports the number of bytes to be added to the starting address.
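A sketch of the shape of that result type. The field types and the surrounding names (HEAP_START, the create_freelist signature, the displacement value) are illustrative assumptions, not the exact mmtk-core definitions:

```rust
trait FreeList {}
struct RawMemoryFreeList;
impl FreeList for RawMemoryFreeList {}

struct CreateFreeListResult {
    free_list: Box<dyn FreeList>,
    // Bytes to add to the space's start address to skip the region the
    // RawMemoryFreeList occupies (zero for an IntArrayFreeList).
    space_displacement: usize,
}

const HEAP_START: usize = 0x2000_0000;

fn create_freelist() -> CreateFreeListResult {
    CreateFreeListResult {
        free_list: Box::new(RawMemoryFreeList),
        space_displacement: 4096, // e.g. one page reserved for the list itself
    }
}

fn main() {
    let result = create_freelist();
    // The start address is computed eagerly, right after creation, so no
    // global map and no later fix-up pass is needed.
    let start = HEAP_START + result.space_displacement;
    println!("space starts at {:#x}", start);
    let _ = result.free_list;
}
```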

Determining the start address of disjoint spaces

(Update: Fixed in #953)

On 32-bit systems, some spaces are contiguous, while others are discontiguous. When creating a plan, contiguous spaces reserve address ranges from the bottom or the top of the heap address region. After that, all discontiguous spaces share the remaining address region. This means the start address of the discontiguous region cannot be determined until the ranges of all contiguous spaces have been allocated.

The current implementation uses a hack. It registers all CommonFreeListPageResource instances in a global list (Map32::shared_fl_map) so that, after the plan is created, Map32::finalize_static_space_map can update the start field of all those CommonFreeListPageResource structs. This is unsafe because each CommonFreeListPageResource is referenced mutably from both Map32::shared_fl_map and a FreeListPageResource.

Solution: The actual requirement is the ability to visit all FreeListPageResource instances after creating the plan and update their start fields. We can let each Space own its PageResource and enumerate the page resources by enumerating the spaces of the plan. Right after the plan is created, we hold a unique mutable reference to it, so we can safely mutate it; we only need a way to enumerate all spaces and their underlying PageResource instances, which is straightforward to add.
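A sketch of this enumeration, using the method names that later appear in the commit message below (for_each_space_mut, maybe_get_page_resource_mut), with simplified signatures:

```rust
struct PageResource {
    start: usize, // Address in the real code
}

trait Space {
    // Not every space has a freelist page resource.
    fn maybe_get_page_resource_mut(&mut self) -> Option<&mut PageResource>;
}

struct Plan {
    spaces: Vec<Box<dyn Space>>,
}

impl Plan {
    fn for_each_space_mut(&mut self, mut f: impl FnMut(&mut dyn Space)) {
        for space in self.spaces.iter_mut() {
            f(space.as_mut());
        }
    }
}

// Called right after the plan is created, while we still hold &mut Plan,
// so no global registry and no unsafe aliasing is needed.
fn update_discontiguous_starts(plan: &mut Plan, discontig_start: usize) {
    plan.for_each_space_mut(|space| {
        if let Some(pr) = space.maybe_get_page_resource_mut() {
            pr.start = discontig_start;
        }
    });
}
```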

PR #925 (closed) uses this approach to eliminate those global maps and unsafe statements.

PR #934 (merged) introduced a way to enumerate spaces in a plan.

PR #953 (merged) uses that method to enumerate space and update the start field of the page resources of disjoint spaces.

Looking up RawMemoryFreeList from address

(Update: Fixed in #953)

On 64-bit systems, when a space needs to be expanded, it asks Map64 to map more memory. Map64::allocate_contiguous_chunks bumps the high water mark and, if the page resource is using RawMemoryFreeList, attempts to update the RawMemoryFreeList, too. The problem is that Map64 doesn't know whether the current page resource is actually a FreeListPageResource, or whether it is actually using a RawMemoryFreeList. The current code uses a global table, Map64::fl_map, to look up the RawMemoryFreeList from an address. The map itself is unsafe because it holds another reference to the RawMemoryFreeList while the FreeListPageResource is already holding one.

Solution: We can let the caller pass in a &mut to the RawMemoryFreeList instance it is using. This is possible because the caller is a FreeListPageResource. We just need to downcast the Box<dyn FreeList> to see if it actually points to a RawMemoryFreeList; if it does not, no update is needed.
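A sketch of the downcast, assuming an as_any_mut hook on the trait (plain Rust trait objects need such a hook, or a crate like downcast-rs; the names here are illustrative, not the exact mmtk-core API):

```rust
use std::any::Any;

trait FreeList {
    fn as_any_mut(&mut self) -> &mut dyn Any;
}

struct RawMemoryFreeList {
    limit: usize,
}
impl FreeList for RawMemoryFreeList {
    fn as_any_mut(&mut self) -> &mut dyn Any { self }
}

struct IntArrayFreeList;
impl FreeList for IntArrayFreeList {
    fn as_any_mut(&mut self) -> &mut dyn Any { self }
}

// The FreeListPageResource passes its own free list down, so Map64 never
// needs a global address-to-free-list table.
fn allocate_contiguous_chunks(free_list: &mut dyn FreeList, new_limit: usize) {
    if let Some(rmfl) = free_list.as_any_mut().downcast_mut::<RawMemoryFreeList>() {
        // Only a RawMemoryFreeList needs to grow with the space;
        // an IntArrayFreeList needs no update.
        rmfl.limit = new_limit;
    }
}
```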

PR #925 (closed) lets Map64::allocate_contiguous_chunks take an extra Option<&mut RawMemoryFreeList> parameter so that it can update the RawMemoryFreeList.

PR #953 (merged) lets the PageResource pass an Option<&mut dyn FreeList> (if it uses a free list) to VMMap::allocate_contiguous_chunk so that Map64 can update it.

VMMap

Data structures for managing chunks for different spaces

Map32 implements a "free-list-based discontiguous chunk allocator". Multiple discontiguous spaces can grab chunks from a freelist (region_map) which manages the shared region. Map32 keeps track of which space owns which chunk (descriptor_map), and links "contiguous chunks" together using a linked list (prev_link and next_link).

Map64 implements a "monotonic contiguous chunk allocator". For each space, Map64 keeps track of the lower bound and the upper bound of the region of memory the space occupies (base_address and high_water).

Among those data structures:

  • In Map32:
    • descriptor_map: Vec<SpaceDescriptor> is per-chunk
    • prev_link: Vec<i32> is per-chunk
    • next_link: Vec<i32> is per-chunk
    • region_map: IntArrayFreeList is also per-chunk
    • global_page_map: IntArrayFreeList is per-page
  • In Map64:
    • descriptor_map: Vec<SpaceDescriptor> is per-space
    • base_address: Vec<Address> is per-space
    • high_water: Vec<Address> is per-space

Those data structures manage resources in a centralised way, but the resources may be distributed to different spaces, and will be accessed and modified by different spaces concurrently. The safety of those data structures lies in the fact that different spaces own different portions of those vectors. This is not an idiomatic Rust usage pattern, and may require the unsafe keyword.

For Map32, unallocated chunks are effectively owned by the Map32 itself (more precisely, by the region_map, a free list that records free chunks). I think region_map requires a Mutex, while other data structures, such as descriptor_map and {prev,next}_link, can be accessed without synchronisation by the space that owns the region.

For Map64, base_address and high_water are effectively per-space metadata. It may be possible to distribute those vectors into Space or PageResource instances.

Solution: (1) add a mutex to region_map; (2) use unsafe to access the other vectors in Map32; (3) distribute the per-space metadata (base address and high water) to Space, PageResource, or other per-space data structures.
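A sketch of the resulting layout under this solution; the types are simplified and the exact split is an assumption:

```rust
use std::sync::Mutex;

struct IntArrayFreeList; // simplified
struct SpaceDescriptor(u32);

struct Map32 {
    // Shared by all discontiguous spaces: lock before taking or
    // returning chunks.
    region_map: Mutex<IntArrayFreeList>,
    // Per-chunk side tables. Each entry is logically owned by the space
    // that owns the chunk, so access goes through unsafe code whose
    // soundness rests on that ownership invariant, not on the compiler.
    descriptor_map: Vec<SpaceDescriptor>,
    prev_link: Vec<i32>,
    next_link: Vec<i32>,
}

// Per-space metadata formerly held in Map64's vectors, moved into the
// space's own page resource (names are illustrative).
struct MonotoneChunkCursor {
    base_address: usize,
    high_water: usize,
}
```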

Progress

#893 (merged) eliminates all cast_ref_to_mut use cases by using UnsafeCell, so mmtk-core compiles on the latest stable Rust version.

#925 (closed) attempted to refactor the related code to eliminate the need for unsafe code. It removed many of the cases mentioned above, but not all.

#953 (merged) removed shared_fl_map from Map32, and fl_map and fl_page_resource from Map64. It removed the unsafe operations related to FreeListPageResource.

@wks wks changed the title Remove cast_ref_to_mut related to FreeList and PageResource Remove cast_ref_to_mut related to FreeList, PageResource and VMMap Aug 29, 2023
@k-sareen k-sareen added A-meta Area: Meta issues for the repository P-normal Priority: Normal. labels Nov 6, 2023
github-merge-queue bot pushed a commit that referenced this issue May 10, 2024
This PR removes the `shared_fl_map` field from `Map32`, and removes
`fl_map` and `fl_page_resource` fields from `Map64`. These global maps
were used to enumerate and update the starting address of spaces (stored
in `CommonFreeListPageResource` instances), for several different
purposes.

1. `Map32` can only decide the starting address of discontiguous spaces
after all contiguous spaces are placed, therefore it needs to modify the
start address of all discontiguous `FreeListPageResource` instances.
- `shared_fl_map` was used to locate `CommonFreeListPageResource`
instances by the space ordinal.
- With this PR, we use `Plan::for_each_space_mut` to enumerate spaces,
and use the newly added `Space::maybe_get_page_resource_mut` method to
get the page resource (if it has one) in order to update their starting
addresses.
2. `Map64` allocates `RawMemoryFreeList` in the beginning of each space
that uses `FreeListPageResource`. It needs to update the start address
of each space to take the address range occupied by `RawMemoryFreeList`
into account.
- `fl_page_resource` was used to locate `CommonFreeListPageResource`
instances by the space ordinal.
- `fl_map` was used to locate the underlying `RawMemoryFreeList`
instances of the corresponding `FreeListPageResource` by the space
ordinal.
- With this PR, we initialize the start address of the
`FreeListPageResource` immediately after a `RawMemoryFreeList` is
created, taking the limit of `RawMemoryFreeList` into account.
Therefore, it is no longer needed to update the starting address later.

Because those global maps are removed, several changes are made to
`VMMap` and its implementations `Map32` and `Map64`.

- `VMMap::boot` and `VMMap::bind_freelist` no longer serve any purpose, and are removed.
- `Map64::finalize_static_space_map` now does nothing but set `finalized` to true.

The `CommonFreeListPageResource` data structure was introduced for the sole purpose of being referenced by those global maps. This PR removes `CommonFreeListPageResource` and `FreeListPageResourceInner`, and relocates their fields.

- `common: CommonPageResource` is now a field of `FreeListPageResource`.
- The `free_list: Box<dyn FreeList>` and `start: Address` are moved into
`FreeListPageResourceSync` because they have always been accessed while
holding the mutex `FreeListPageResource::sync`.
- Note that the method `FreeListPageResource::release_pages` used to call `free_list.size()` without holding the `sync` mutex, and subsequently called `free_list.free()` with the `sync` mutex held. The algorithm of `FreeList` guarantees `free()` will not race with `size()`. But the `Mutex` forbids calling the read-only operation `size()` without holding the lock, because in general a read-write operation may race with read-only operations, too. For now, we extended the scope of the mutex to cover the whole function.

After the changes above, the `FreeListPageResource` no longer needs
`UnsafeCell`.

This PR is one step of the greater goal of removing unsafe operations
related to FreeList, PageResource and VMMap. See:
#853

Related PRs:
- #925: This PR is a subset of
that PR.