Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize TypeEngine for garbage collection #6603

Open
ironcev opened this issue Oct 3, 2024 · 2 comments
Open

Optimize TypeEngine for garbage collection #6603

ironcev opened this issue Oct 3, 2024 · 2 comments
Assignees
Labels
compiler: frontend Everything to do with type checking, control flow analysis, and everything between parsing and IRgen compiler General compiler. Should eventually become more specific as the issue is triaged team:compiler Compiler Team

Comments

@ironcev
Copy link
Member

ironcev commented Oct 3, 2024

NOTE: This is the draft version of the proposal. It lists the three major points that a "garbage-collection-friendly" TypeEngine should support, but still not in enough detail. The concrete architectural proposal is also missing, as well as the results of measurements from a real-world project that support the reasoning behind the proposal.

The numbers given in the below description are coming from the compilation of the Spark Orderbook workspace, a realistic real-world project.

The solution will also solve #6687.

Shareable TypeInfos and their lifetime

We use the term shareable type as defined in #6613.

Examples of shareable types are all built-in types like, e.g., u64, but also types like, e.g., Option<u64> or MyStruct or MyGenericStruct<bool>.

The number of those types is orders of magnitude smaller then the overall number of types inside of the TypeEngine, and they can be heavily reused across all the projects and modules, which is currently not the case. (TODO: Add data from measurements.)

Currently, the TypeEngine optimized in #6613 reuses the TypeSourceInfos of shareable TypeInfos and not the TypeInfos themselves. This is sub-optimal because shareable types that cannot be changed in between garbage collections are still assigned a source_id and are being garbage collected. This table shows the content of the shareable_types hash map. It is evident that types like !, (), bool, etc are stored unnecessarily many times.

Nevers              :       35
String slices       :        8
String arrays       :       11
Unsigned integers   :      182
Enums               :      287
Structs             :      276
Booleans            :       26
Units               :       79
Tuples              :       12
B256s               :       47
Contracts           :        1
Error recoveries    :        2
Arrays              :       25
Raw pointers        :       31
Raw slices          :       21
Trait types         :        4

Those types should "live forever" and never be garbage collected.

Also, the shareable types like Option<u64> should be shareable across modules, means for different source_ids we should still point to a single shared Option<u64> instance. Note that this instance should be GCed if the definition of Option gets changed.

Distribution of TypeInfos across source_ids

TypeInfos that should get garbage collected should be distributed towards the leafs of project and module dependency tree. This means, the types should have assigned, if possible of course, source_ids of the files that are likely to be changed, and those are always the files in the projects that developers actually work on (the leafs), and not the files in the dependencies, especially the standard library dependencies.

E.g., currently, if we have fn my_function<A>() -> Option<A> in the code we are editing (a leaf!), the non-shareable Option<A> type will get assigned the source_id of the original Option declaration, the one from the standard library. Which means it will never be GCed, although it should be GCed whenever we GC the current module the programmer is changing.

E.g., currently, all of the TypeInfo::Unknown types do not have source_id assigned, and not all of them get replace()ed in the engine. Out of ~41.000 inserted Unknowns, some ~4.500 remain unreplaced, and, not having source_id assigned, can never be GCed. This produces a constant small "memory leak" within the TypeEngine.

Time-performant garbage collection

TypeEngine should, ideally, offer an O(1) removal of files during the GC.

E.g., before the optimization done in #6613, the ConcurrentSlab use to hold ~510.000 elements. All those elements needed to be traversed during GC, to remove just a handful of types. After the optimization, the slab contains ~230.000 elements which speeds up the traversal, but sill has an O(n) complexity.

Architectural proposal

TODO: Write proposal.

@ironcev ironcev added compiler General compiler. Should eventually become more specific as the issue is triaged compiler: frontend Everything to do with type checking, control flow analysis, and everything between parsing and IRgen labels Oct 3, 2024
@ironcev ironcev self-assigned this Oct 3, 2024
@tritao
Copy link
Contributor

tritao commented Oct 4, 2024

  1. A GC-friendly TypeEngine should never garbage-collect shareable types. The number of those types is orders of magnitude smaller then the overall number of types inside of the TypeEngine, and they can be heavily reused across all the projects and modules. Shareable types should "live forever" and never be GCed, and thus, never be assigned to any particular source_id. Examples of shareable types are all built-in types like, e.g., u64, but also types like, e.g., Option<u64> or MyStruct or MyGenericStruct<bool>.

The idea of shareable types is interesting and not something I really considered before outside the builtin core types.
Could you maybe expand on this concept of a shareable type? Would they be std or just core types maybe?
And would the main difference be they don't contain a source id? I am wondering if that could impact some diagnostic messages which might be pointing to some core types.

Also I get a bit unsure when you say that they should never be GC'd. First lets say we consider all std types to be shareable and thus not considered for GC. But what would happen if lets say, we change something in the std, and recompile a project that has been cached (and thus referencing the old std cached type ids). In that case, we would need to GC the old std types out of the type engine right?

3. A GC-friendly TypeEngine should, ideally, offer an O(1) removal of files during the GC.

A potential idea me and @JoshuaBatty previously discussed before would be the possibility of having separate per-module engines, an id would have a reserved space to contain the engine id. This would allow a O(1) removal of the engine data when the module is collected.

@ironcev
Copy link
Member Author

ironcev commented Oct 4, 2024

Thanks for the early feedback and excellent questions @tritao! This issue description indeed requires more elaboration including the proper definitions and more concrete examples of measurements that justify proposed design. This first issue description is primarily created so that I can link to it in several TODOs left in the code that call for future changes in the TypeEngine. I'll elaborate more on all points including examples as well as explaining concepts in detail. After your question, I see that the description in the first point is pretty misleading. Also, it is important to add how it all works in sync with the DeclEngine during GC. And for the third point, how a solution could look like without having separate engines. Thanks a lot for the feedback and questions and stay tuned 😄

tritao added a commit that referenced this issue Nov 6, 2024
## Description

This PR implements a major rewrite of the `TypeEngine`. The rewrite:
- significantly improves overall compilation performance. The
compilation time of the real-world [Spark Orderbook
workspace](https://github.com/compolabs/orderbook-contract) got
**reduced from 8.99 to 5.94 seconds**, and the memory consumption from
**3.32 to 1.78 GB**.
- provides robust API for inserting types, related to the usage of the
`source_id` (see: #5991). The new API forbids not providing `source_id`s
or providing semantically questionable or non-optimal ones.
- introduces simpler, much less verbose, API for inserting types.

The PR also removes the obsolete and unused `TypeInfo::Storage`.

The PR **does not address** the following:
- The `TypeEngine`'s "garbage-collection-(un)friendlines" (see: #6603).
- The handling of the `TypeInfo::Custom` and `TypeInfo::TraitType` types
within the `TypeEngine` (see: #6601).
- The number of interactions with the `TypeEngine`. E.g., removing
unnecessary insertions after `resolve()`ing or `monomorphyze()`ing types
will be done as a part of `DeclEngine` optimization.

Closes #5991.

## Shareable types

The PR formalizes the notion of a _shareable type_ within the
`TypeEngine` (strictly speaking, a shareable `TypeInfo`). A shareable
type is a type that is both:
- _unchangeable_: it, or any of its parts, cannot be replaced during the
unification or monomorphization.
- and _undistinguishable by annotations_: it doesn't carry any
additional information (annotations) that could differ it from another
instance of `TypeInfo` that would, purely from the type perspective, be
a same type.

E.g., `u64` or `MyStruct<u64, bool>` are unchangeable while `Numeric`,
`Unknown` or `MyStruct<T1, T1>` are changeable.

E.g., in this example, `a` and `b` have the same type, `[u64; 2]` but
those two types differ in spans assigned to the "u64"s and the "2"s, and
are treated as different within the `TypeEngine`, and thus as
non-shareable.

```
let a: [u64; 2] = [1, 1];
let b: [u64; 2] = [2, 2];
```

Shareability of a type is crucial for reusing the `TypeSourceInfo`
instances within the type engine. Shareable types can be given different
`TypeId`s without the need to newly allocate a `TypeSourceInfo`.

## Performance improvements

The cummulative effect of the performance improvements on the
compilataion of the real-world [Spark Orderbook
workspace](https://github.com/compolabs/orderbook-contract) is given
below. The compilation means the frontend compilation, up to the IR
generation (`forc check`).

**Compilation time**
```
Before:
  Time (mean ± σ):      8.995 s ±  0.297 s    [User: 7.126 s, System: 1.289 s]
  Range (min … max):    8.675 s …  9.262 s    3 runs

After:
  Time (mean ± σ):      5.945 s ±  0.517 s    [User: 4.749 s, System: 0.768 s]
  Range (min … max):    5.349 s …  6.280 s    3 runs
```

**Memory consumption**
```
---------------------------------------------------------
                total(B)   useful-heap(B)   extra-heap(B)
---------------------------------------------------------
Before:    3,316,786,808    3,237,317,383      79,469,425
After:     1,784,467,376    1,743,772,406      40,694,970
```

### Applied optimizations:

- **Replacement of expensive `insert` calls with compile time constants
for built-in types.** Built-in types like `!`, `bool`, `()`, `u8`, etc.
are inserted into the engine at its creation at predefined slots within
the `slab`. The `id_of_<built-in-type>` methods just return those
predefined `TypeId`s, effectively being compiled down to constants. The
calls like `type_engine.insert(engines, TypeInfo::Boolean, None)` are
replaced with maximally optimized and non-verbose
`type_engine.id_of_bool()`.

- **Elimination of extensive creation of `TypeSourceInfo`s for
`TypeInfo::Unknown/Numeric`s.** `Unknown` and `Numeric` are inserted
into the engine ~50.000 times. Each insert used to create a new instance
of `TypeSourceInfo` with the `source_id` set to `None`. The optimization
replaces those ~50.000 instances with two predefined singleton
instances, one for `Unknown` + `None` and one for `Numeric` + `None`.
(Note that when implementing #6603, we will want to bind also `Unknown`s
and `Numeric`s to `source_id`s different then `None`, but we will still
want to reuse the `TypeInfo` instances.)

- **Elimination of extensive temporary heap-allocations during hash
calculation.** The custom hasher obtained by `make_hasher` required a
`TypeSourceInfo` to calculate the hash and it was called every time the
`insert` was called, ~530.000 times. Getting the `TypeSourceInfo`
originally required cloning the `TypeInfo`, which depending on the
concrete `TypeInfo` instance could cause heap allocations, and also
heap-allocating that copy within an `Arc`. Hash was calculated
regardless of the possibility for the type to be stored in the hash map
of reusable types. The optimization removed the hashing step if the type
is not shareable, removed the cloning of the `TypeInfo` and introduced a
custom `compute_hash_without_heap_allocation` method that produced the
same hash as the `make_hasher` but without unnecessary temporary
heap-allocations of `TypeSourceInfo`s.

- **Replacement of `TypeSourceInfo`s within the hash map with
`Arc<TypeSourceInfo>`s.** The hash map unnecessarily required making
copies of reused `TypeSourceInfo`s.

- **Introducing the concept of a _shareable type_ and rolling it out for
all types.** Previously, the engine checked only for changeability of
types by using the `TypeInfo::is_changeable` method. The implementation
of that method was extremely simplified, e.g. treating all structs and
enums with generic arguments as changeable even if they were fully
monomorphized. This resulted in over-bloating the engine with type
instances that were actually unchangeable, but considered to be
changeable. Also, strictly seen, the unchangeability (during unification
and monomorphization) is not the only necessary criteria for reuse (or
sharing) a type within the engine. Another important aspect is, as
explained above, the _differentiability by annotations_. The PR
introduces the notion of a _shareable type_ which is both unchangeable
and not differentiable by annotations. The optimization takes advantage
of such types by storing them only once per `source_id`.

- **Elimination of extensive unnecessary inserts of new types during
`replace()` calls.** When `replace()`ing types during unifications, a
new `TypeSourceInfo` instance was created for every replacement, ~46.000
times. This meant a heap-allocation of the `TypeSourceInfo` (16 bytes)
and the `Arc`-contained `TypeInfo` (232 bytes), even if the replaced
type was shareable and already available in the type engine. The
optimization now reuses an already existing shareable type if available.

## Robustness related to `source_id`s

The issues we had with properly providing the `source_id` in the
`insert` method are explained in #5991. This PR removes the `source_id`
from the new public API and calculates it internally within the engine,
based on the type being inserted. This makes inserting of types both
more robust and less verbose and eliminates the possibility of providing
a semantically wrong `source_id`.

Note that the calculation of an optimal `source_id` done within the
engine fully corresponds to the "calculations" we currently have at call
sites.

E.g., when inserting enums, previously we always had to write
`type_engine.insert(engines, TypeInfo::Enum(decl_id),
enum_decl.span.source_id())`. Fetching the `source_id` from the enum
declaration is now done within the `insert_enum` method:
`type_engine.insert_enum(engines, decl_id)`.

Note that for certain types we will want to change the current behavior
and either provide a `source_id` or pick a more suitable one. E.g, even
when inserting `Unknown`s, we will want to have `source_id`s, if
possible. This will be done in #6603, but again, together with providing
a robust API that will be difficult to misuse.

## Simplicity

As already mentioned in some of the examples above, the new
`id_of_<type>()` and `insert_<type>` methods provide much simpler and
less verbose API to use. Introducing those methods remove a lot of
boilerplate code from the callers. Also, the individual `insert_<type>`
methods are additionally optimized for inserting the particular type.

The common `insert` method is intended to be used only in cases where
the inserted `TypeInfo` is not known at the call site.

Here are some examples of the new API versus the existing one:
|Before|After|
|------|-----|
| `type_engine.insert(engines, TypeInfo::Tuple(vec![]), None)` |
`type_engine.id_of_unit()` |
| `type_engine.insert(engines, TypeInfo::Unknown, None)` |
`type_engine.new_unknown()` |
| `type_engine.insert(engines,
TypeInfo::UnsignedInteger(IntegerBits::SixtyFour), None)` |
`type_engine.id_of_u64()` |

For more complex types, like, e.g., `TypeInfo::Tuple`, the difference in
inserting is even more prominent, as can be seen from the diffs of
numerous code lines deleted in this PR.

## Checklist

- [x] I have linked to any relevant issues.
- [x] I have commented my code, particularly in hard-to-understand
areas.
- [ ] I have updated the documentation where relevant (API docs, the
reference, and the Sway book).
- [ ] If my change requires substantial documentation changes, I have
[requested support from the DevRel
team](https://github.com/FuelLabs/devrel-requests/issues/new/choose)
- [x] I have added tests that prove my fix is effective or that my
feature works.
- [ ] I have added (or requested a maintainer to add) the necessary
`Breaking*` or `New Feature` labels where relevant.
- [x] I have done my best to ensure that my PR adheres to [the Fuel Labs
Code Review
Standards](https://github.com/FuelLabs/rfcs/blob/master/text/code-standards/external-contributors.md).
- [x] I have requested a review from the relevant team or maintainers.

---------

Co-authored-by: Joshua Batty <joshpbatty@gmail.com>
Co-authored-by: João Matos <joao@tritao.eu>
@IGI-111 IGI-111 added the team:compiler Compiler Team label Feb 10, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler: frontend Everything to do with type checking, control flow analysis, and everything between parsing and IRgen compiler General compiler. Should eventually become more specific as the issue is triaged team:compiler Compiler Team
Projects
None yet
Development

No branches or pull requests

3 participants