Rewrite `TypeEngine` for performance, robustness, and simplicity #6613

ironcev · 2024-10-06T17:55:44Z

Description

This PR implements a major rewrite of the TypeEngine. The rewrite:

significantly improves overall compilation performance. The compilation time of the real-world Spark Orderbook workspace got reduced from 8.99 to 5.94 seconds, and the memory consumption from 3.32 to 1.78 GB.
provides robust API for inserting types, related to the usage of the source_id (see: Remove error-prone source_id from TypeEngine::insert() and make inserting API more robust and less verbose #5991). The new API forbids not providing source_ids or providing semantically questionable or non-optimal ones.
introduces simpler, much less verbose, API for inserting types.

The PR also removes the obsolete and unused TypeInfo::Storage.

The PR does not address the following:

The TypeEngine's "garbage-collection-(un)friendlines" (see: Optimize TypeEngine for garbage collection #6603).
The handling of the TypeInfo::Custom and TypeInfo::TraitType types within the TypeEngine (see: Improve handling of TypeInfo::Custom and TypeInfo::TraitType within the TypeEngine #6601).
The number of interactions with the TypeEngine. E.g., removing unnecessary insertions after resolve()ing or monomorphyze()ing types will be done as a part of DeclEngine optimization.

Closes #5991.

Shareable types

The PR formalizes the notion of a shareable type within the TypeEngine (strictly speaking, a shareable TypeInfo). A shareable type is a type that is both:

unchangeable: it, or any of its parts, cannot be replaced during the unification or monomorphization.
and undistinguishable by annotations: it doesn't carry any additional information (annotations) that could differ it from another instance of TypeInfo that would, purely from the type perspective, be a same type.

E.g., u64 or MyStruct<u64, bool> are unchangeable while Numeric, Unknown or MyStruct<T1, T1> are changeable.

E.g., in this example, a and b have the same type, [u64; 2] but those two types differ in spans assigned to the "u64"s and the "2"s, and are treated as different within the TypeEngine, and thus as non-shareable.

let a: [u64; 2] = [1, 1];
let b: [u64; 2] = [2, 2];

Shareability of a type is crucial for reusing the TypeSourceInfo instances within the type engine. Shareable types can be given different TypeIds without the need to newly allocate a TypeSourceInfo.

Performance improvements

The cummulative effect of the performance improvements on the compilataion of the real-world Spark Orderbook workspace is given below. The compilation means the frontend compilation, up to the IR generation (forc check).

Compilation time

Before:
  Time (mean ± σ):      8.995 s ±  0.297 s    [User: 7.126 s, System: 1.289 s]
  Range (min … max):    8.675 s …  9.262 s    3 runs

After:
  Time (mean ± σ):      5.945 s ±  0.517 s    [User: 4.749 s, System: 0.768 s]
  Range (min … max):    5.349 s …  6.280 s    3 runs

Memory consumption

---------------------------------------------------------
                total(B)   useful-heap(B)   extra-heap(B)
---------------------------------------------------------
Before:    3,316,786,808    3,237,317,383      79,469,425
After:     1,784,467,376    1,743,772,406      40,694,970

Applied optimizations:

Replacement of expensive insert calls with compile time constants for built-in types. Built-in types like !, bool, (), u8, etc. are inserted into the engine at its creation at predefined slots within the slab. The id_of_<built-in-type> methods just return those predefined TypeIds, effectively being compiled down to constants. The calls like type_engine.insert(engines, TypeInfo::Boolean, None) are replaced with maximally optimized and non-verbose type_engine.id_of_bool().
Elimination of extensive creation of TypeSourceInfos for TypeInfo::Unknown/Numerics. Unknown and Numeric are inserted into the engine ~50.000 times. Each insert used to create a new instance of TypeSourceInfo with the source_id set to None. The optimization replaces those ~50.000 instances with two predefined singleton instances, one for Unknown + None and one for Numeric + None. (Note that when implementing Optimize TypeEngine for garbage collection #6603, we will want to bind also Unknowns and Numerics to source_ids different then None, but we will still want to reuse the TypeInfo instances.)
Elimination of extensive temporary heap-allocations during hash calculation. The custom hasher obtained by make_hasher required a TypeSourceInfo to calculate the hash and it was called every time the insert was called, ~530.000 times. Getting the TypeSourceInfo originally required cloning the TypeInfo, which depending on the concrete TypeInfo instance could cause heap allocations, and also heap-allocating that copy within an Arc. Hash was calculated regardless of the possibility for the type to be stored in the hash map of reusable types. The optimization removed the hashing step if the type is not shareable, removed the cloning of the TypeInfo and introduced a custom compute_hash_without_heap_allocation method that produced the same hash as the make_hasher but without unnecessary temporary heap-allocations of TypeSourceInfos.
Replacement of TypeSourceInfos within the hash map with Arc<TypeSourceInfo>s. The hash map unnecessarily required making copies of reused TypeSourceInfos.
Introducing the concept of a shareable type and rolling it out for all types. Previously, the engine checked only for changeability of types by using the TypeInfo::is_changeable method. The implementation of that method was extremely simplified, e.g. treating all structs and enums with generic arguments as changeable even if they were fully monomorphized. This resulted in over-bloating the engine with type instances that were actually unchangeable, but considered to be changeable. Also, strictly seen, the unchangeability (during unification and monomorphization) is not the only necessary criteria for reuse (or sharing) a type within the engine. Another important aspect is, as explained above, the differentiability by annotations. The PR introduces the notion of a shareable type which is both unchangeable and not differentiable by annotations. The optimization takes advantage of such types by storing them only once per source_id.
Elimination of extensive unnecessary inserts of new types during replace() calls. When replace()ing types during unifications, a new TypeSourceInfo instance was created for every replacement, ~46.000 times. This meant a heap-allocation of the TypeSourceInfo (16 bytes) and the Arc-contained TypeInfo (232 bytes), even if the replaced type was shareable and already available in the type engine. The optimization now reuses an already existing shareable type if available.

Robustness related to `source_id`s

The issues we had with properly providing the source_id in the insert method are explained in #5991. This PR removes the source_id from the new public API and calculates it internally within the engine, based on the type being inserted. This makes inserting of types both more robust and less verbose and eliminates the possibility of providing a semantically wrong source_id.

Note that the calculation of an optimal source_id done within the engine fully corresponds to the "calculations" we currently have at call sites.

E.g., when inserting enums, previously we always had to write type_engine.insert(engines, TypeInfo::Enum(decl_id), enum_decl.span.source_id()). Fetching the source_id from the enum declaration is now done within the insert_enum method: type_engine.insert_enum(engines, decl_id).

Note that for certain types we will want to change the current behavior and either provide a source_id or pick a more suitable one. E.g, even when inserting Unknowns, we will want to have source_ids, if possible. This will be done in #6603, but again, together with providing a robust API that will be difficult to misuse.

Simplicity

As already mentioned in some of the examples above, the new id_of_<type>() and insert_<type> methods provide much simpler and less verbose API to use. Introducing those methods remove a lot of boilerplate code from the callers. Also, the individual insert_<type> methods are additionally optimized for inserting the particular type.

The common insert method is intended to be used only in cases where the inserted TypeInfo is not known at the call site.

Here are some examples of the new API versus the existing one:

Before	After
`type_engine.insert(engines, TypeInfo::Tuple(vec![]), None)`	`type_engine.id_of_unit()`
`type_engine.insert(engines, TypeInfo::Unknown, None)`	`type_engine.new_unknown()`
`type_engine.insert(engines, TypeInfo::UnsignedInteger(IntegerBits::SixtyFour), None)`	`type_engine.id_of_u64()`

For more complex types, like, e.g., TypeInfo::Tuple, the difference in inserting is even more prominent, as can be seen from the diffs of numerous code lines deleted in this PR.

Checklist

I have linked to any relevant issues.
I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation where relevant (API docs, the reference, and the Sway book).
- If my change requires substantial documentation changes, I have requested support from the DevRel team
I have added tests that prove my fix is effective or that my feature works.
I have added (or requested a maintainer to add) the necessary Breaking* or New Feature labels where relevant.
I have done my best to ensure that my PR adheres to the Fuel Labs Code Review Standards.
I have requested a review from the relevant team or maintainers.

codspeed-hq · 2024-10-06T18:11:16Z

CodSpeed Performance Report

Merging #6613 will improve performances by 13.56%

_{Comparing ironcev/rewrite-type-engine (193e06f) with master (16318d3)}

Summary

⚡ 1 improvements
✅ 21 untouched benchmarks

Benchmarks breakdown

	Benchmark	`master`	`ironcev/rewrite-type-engine`	Change
⚡	`compile`	6.1 s	5.4 s	+13.56%

ironcev · 2024-10-06T18:38:46Z

EDIT: The transient bug whose symptoms are described below is now fixed. The issue was in an already existing (long existing 😉) underlying bug in the PartialEqWithEngines implementation for the TypeInfo::Tuple, which is now fixed as a part of this PR.

While working on the improvements I've noticed an E2E test suddenly failing with this error message:

 7 │+  --> sway-lib-core/src/slice.sw:7:24
 8 │+   |
 9 │+ 5 |   impl<T> &__slice[T] {
10 │+ 6 |       pub fn ptr(self) -> raw_ptr {
11 │+ 7 |           let (ptr, _) = asm(s: self) {
12 │+   |  ________________________^
13 │+ 8 | |             s: (raw_ptr, u64)
14 │+ 9 | |         };
15 │+   | |_________^ Mismatched types.
16 │+expected: ()
17 │+found:    (pointer, u64).
18 │+help: Variable declaration's type annotation does not match up with the assigned expression's type.
19 │+10 |           ptr
20 │+11 |       }

Note that the wrongly reported compile error is in core library, means it should affect almost all the tests but it didn't. Indeed, re-running that test didn't reproduce the error.

I've kept re-running the E2E suit on another machine for two days straight (!!) before the error got provoked again, this time in another file:

   --> sway-lib-std/src/math.sw:316:28
    |
314 |           }
315 |   
316 |           let (a, b, c, d) = asm(r1: self) {
    |  ____________________________^
317 | |             r1: (u64, u64, u64, u64)
318 | |         };
    | |_________^ Mismatched types.
expected: ()
found:    (u64, u64, u64, u64).
help: Variable declaration's type annotation does not match up with the assigned expression's type.
319 |           if a != 0 {
320 |               return a.log2().as_u256() + 0xc0u256;

Up to now, I couldn't find a logical error in code, and chasing a non-reproducible issue is difficult indeed. But the issue is for sure still there. After opening this draft PR, the forc-unit-tests failed because of the same reason. It is green now, but the cargo-run-e2e-test-release failed on one test I am sure for the same reason.

It is interesting (and for sure good!), that the tests are failing on the build machine, but when run locally it takes literally tens of thousand of compilations to get the error.

EDIT: All tests are green now, but that was to expect. I'll keep the PR in draft until I am sure that the issue is actually fixed.

sway-core/src/semantic_analysis/ast_node/declaration/impl_trait.rs

Co-authored-by: Joshua Batty <joshpbatty@gmail.com>

sway-core/src/type_system/ast_elements/type_argument.rs

Co-authored-by: João Matos <joao@tritao.eu>

IGI-111

Love the new API.

ironcev · 2024-11-04T15:16:03Z

All the GC issues reported in #6665, are now fixed on this branch. Similarly to the transient bug explained above, the root causes for GC related issues were already existing and the rewrite of the TypeEngine made them occur more frequently.

This PR now fixes, or at least mitigate to an extent, those GC issues which were possible to fix by changing only the TypeEngine internals.

However, GC still has issues whose solving requires restructurings outside of the TypeEngine. E.g. fixing #6687 requires changes in the DeclEngine as well.

The question is now how to proceed with this PR?

My thinking is to merge it as it is now, although knowing that not all of the GC issues are solved. The reasoning behind that is the following:

this version already has more stable GC then the one on the master.
When running GC tests on our whole E2E test suite (both should_pass and should_fail test) only one test fails in this PR (should_pass/stdlib/option_eq/src/main.sw), while numerous tests fail on master.
the option_eq test fails on master as well, proving that this is not a new error caused by the changes in this branch.
I've manually tried LSP from this PR on several representative projects end never experienced a crash. (Of course, should_pass/stdlib/option_eq project will cause LSP crash.)
this PR is already vast in scope and brings significant changes. Extending it to rewrite the DeclEngine and address other GC related topics would be a huge scope-explosion. We originally planned to tackle the DeclEngine and "GC-friendliness" in separate PRs.
keeping this PR on the top of other changes already proved to be a high effort. The PR touches the most prominent files on the frontend and is in constant collision with other changes done on the frontend.

@sdankel @JoshuaBatty @tritao @IGI-111 What do you think about this proposal? Can you also please try the local version of the LSP on your projects to see if you can experience any crashes?

tritao · 2024-11-04T15:20:23Z

Makes sense to me to get this one merged ASAP.

Cargo.toml

ironcev added 2 commits October 6, 2024 13:14

Rewrite TypeEngine for performance, robustness, and simplicity

31225ba

Merge branch 'master' into ironcev/rewrite-type-engine

fd89805

ironcev mentioned this pull request Oct 6, 2024

Optimize TypeEngine for garbage collection #6603

Open

Fix typos

9833d21

ironcev self-assigned this Oct 6, 2024

ironcev added compiler General compiler. Should eventually become more specific as the issue is triaged compiler: frontend Everything to do with type checking, control flow analysis, and everything between parsing and IRgen labels Oct 6, 2024

JoshuaBatty reviewed Oct 7, 2024

View reviewed changes

sway-core/src/semantic_analysis/ast_node/declaration/impl_trait.rs Outdated Show resolved Hide resolved

Fix a typo

0b2c75e

Co-authored-by: Joshua Batty <joshpbatty@gmail.com>

tritao reviewed Oct 7, 2024

View reviewed changes

sway-core/src/type_system/ast_elements/type_argument.rs Outdated Show resolved Hide resolved

ironcev and others added 10 commits October 7, 2024 13:23

Fix a typo

f911823

Co-authored-by: João Matos <joao@tritao.eu>

Merge branch 'master' into ironcev/rewrite-type-engine

fd00f48

Merge branch 'master' into ironcev/rewrite-type-engine

b886335

Use paste macro to smiplify type_engine_shareable_built_in_types

da7c1e8

Merge branch 'master' into ironcev/rewrite-type-engine

544ec48

Merge branch 'master' into ironcev/rewrite-type-engine

a761573

Merge branch 'master' into ironcev/rewrite-type-engine

712d596

Merge branch 'master' into ironcev/rewrite-type-engine

d8a267c

Roll out new TypeEngine methods to merged changes

f04726a

Fix issue in PartialEqWithEngines for `TypeInfo::Tuple

6535e1a

ironcev marked this pull request as ready for review October 22, 2024 14:21

ironcev requested review from a team as code owners October 22, 2024 14:21

ironcev requested review from tritao and JoshuaBatty October 22, 2024 14:21

IGI-111 previously approved these changes Oct 22, 2024

View reviewed changes

IGI-111 requested a review from a team October 22, 2024 15:19

ironcev added 16 commits October 27, 2024 16:52

Merge branch 'master' into ironcev/rewrite-type-engine

bcc4edd

Fix fmt issues

c186bd5

Add TypeEngine::get_type_source_id

f1e0dcc

Replacing types with shareable built-in types

89d9ed8

Adjust LSP garbage collection tests

8270b6a

Document how to try out the local LSP version

e7e7534

Merge branch 'master' into ironcev/rewrite-type-engine

d40b884

Fix a typo

69de266

Fix fmt issues

ae9ee86

Remove new_unknown_generic_self_with_name

2c2879c

Fix fmt issues

5ff3e9c

Remove dbg!s left in server_state in Sway features

89ecb88

Replace te().insert call with te().new_placeholder

438351e

Add module_might_outlive_type

29e309f

Merge branch 'master' into ironcev/rewrite-type-engine

70d7164

Fix clippy and fmt issues

f181281

tritao previously approved these changes Nov 4, 2024

View reviewed changes

ironcev marked this pull request as ready for review November 4, 2024 15:23

JoshuaBatty reviewed Nov 5, 2024

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

ironcev added 2 commits November 5, 2024 18:37

Merge branch 'master' into ironcev/rewrite-type-engine

23d9012

Remove patch version from crate versions

193e06f

ironcev dismissed tritao’s stale review via 193e06f November 5, 2024 17:44

ironcev requested review from tritao and JoshuaBatty November 5, 2024 17:44

tritao approved these changes Nov 5, 2024

View reviewed changes

sdankel approved these changes Nov 6, 2024

View reviewed changes

tritao merged commit 73e1a7a into master Nov 6, 2024
40 checks passed

tritao deleted the ironcev/rewrite-type-engine branch November 6, 2024 09:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rewrite `TypeEngine` for performance, robustness, and simplicity #6613

Rewrite `TypeEngine` for performance, robustness, and simplicity #6613

ironcev commented Oct 6, 2024 •

edited

Loading

codspeed-hq bot commented Oct 6, 2024 •

edited

Loading

ironcev commented Oct 6, 2024 •

edited

Loading

IGI-111 left a comment

ironcev commented Nov 4, 2024

tritao commented Nov 4, 2024

Rewrite TypeEngine for performance, robustness, and simplicity #6613

Rewrite TypeEngine for performance, robustness, and simplicity #6613

Conversation

ironcev commented Oct 6, 2024 • edited Loading

Description

Shareable types

Performance improvements

Applied optimizations:

Robustness related to source_ids

Simplicity

Checklist

codspeed-hq bot commented Oct 6, 2024 • edited Loading

CodSpeed Performance Report

Merging #6613 will improve performances by 13.56%

Summary

Benchmarks breakdown

ironcev commented Oct 6, 2024 • edited Loading

IGI-111 left a comment

Choose a reason for hiding this comment

ironcev commented Nov 4, 2024

tritao commented Nov 4, 2024

Rewrite `TypeEngine` for performance, robustness, and simplicity #6613

Rewrite `TypeEngine` for performance, robustness, and simplicity #6613

ironcev commented Oct 6, 2024 •

edited

Loading

Robustness related to `source_id`s

codspeed-hq bot commented Oct 6, 2024 •

edited

Loading

ironcev commented Oct 6, 2024 •

edited

Loading