Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite TypeEngine for performance, robustness, and simplicity #6613

Merged
merged 38 commits into from
Nov 6, 2024

Conversation

ironcev
Copy link
Member

@ironcev ironcev commented Oct 6, 2024

Description

This PR implements a major rewrite of the TypeEngine. The rewrite:

The PR also removes the obsolete and unused TypeInfo::Storage.

The PR does not address the following:

Closes #5991.

Shareable types

The PR formalizes the notion of a shareable type within the TypeEngine (strictly speaking, a shareable TypeInfo). A shareable type is a type that is both:

  • unchangeable: it, or any of its parts, cannot be replaced during the unification or monomorphization.
  • and undistinguishable by annotations: it doesn't carry any additional information (annotations) that could differ it from another instance of TypeInfo that would, purely from the type perspective, be a same type.

E.g., u64 or MyStruct<u64, bool> are unchangeable while Numeric, Unknown or MyStruct<T1, T1> are changeable.

E.g., in this example, a and b have the same type, [u64; 2] but those two types differ in spans assigned to the "u64"s and the "2"s, and are treated as different within the TypeEngine, and thus as non-shareable.

let a: [u64; 2] = [1, 1];
let b: [u64; 2] = [2, 2];

Shareability of a type is crucial for reusing the TypeSourceInfo instances within the type engine. Shareable types can be given different TypeIds without the need to newly allocate a TypeSourceInfo.

Performance improvements

The cummulative effect of the performance improvements on the compilataion of the real-world Spark Orderbook workspace is given below. The compilation means the frontend compilation, up to the IR generation (forc check).

Compilation time

Before:
  Time (mean ± σ):      8.995 s ±  0.297 s    [User: 7.126 s, System: 1.289 s]
  Range (min … max):    8.675 s …  9.262 s    3 runs

After:
  Time (mean ± σ):      5.945 s ±  0.517 s    [User: 4.749 s, System: 0.768 s]
  Range (min … max):    5.349 s …  6.280 s    3 runs

Memory consumption

---------------------------------------------------------
                total(B)   useful-heap(B)   extra-heap(B)
---------------------------------------------------------
Before:    3,316,786,808    3,237,317,383      79,469,425
After:     1,784,467,376    1,743,772,406      40,694,970

Applied optimizations:

  • Replacement of expensive insert calls with compile time constants for built-in types. Built-in types like !, bool, (), u8, etc. are inserted into the engine at its creation at predefined slots within the slab. The id_of_<built-in-type> methods just return those predefined TypeIds, effectively being compiled down to constants. The calls like type_engine.insert(engines, TypeInfo::Boolean, None) are replaced with maximally optimized and non-verbose type_engine.id_of_bool().

  • Elimination of extensive creation of TypeSourceInfos for TypeInfo::Unknown/Numerics. Unknown and Numeric are inserted into the engine ~50.000 times. Each insert used to create a new instance of TypeSourceInfo with the source_id set to None. The optimization replaces those ~50.000 instances with two predefined singleton instances, one for Unknown + None and one for Numeric + None. (Note that when implementing Optimize TypeEngine for garbage collection #6603, we will want to bind also Unknowns and Numerics to source_ids different then None, but we will still want to reuse the TypeInfo instances.)

  • Elimination of extensive temporary heap-allocations during hash calculation. The custom hasher obtained by make_hasher required a TypeSourceInfo to calculate the hash and it was called every time the insert was called, ~530.000 times. Getting the TypeSourceInfo originally required cloning the TypeInfo, which depending on the concrete TypeInfo instance could cause heap allocations, and also heap-allocating that copy within an Arc. Hash was calculated regardless of the possibility for the type to be stored in the hash map of reusable types. The optimization removed the hashing step if the type is not shareable, removed the cloning of the TypeInfo and introduced a custom compute_hash_without_heap_allocation method that produced the same hash as the make_hasher but without unnecessary temporary heap-allocations of TypeSourceInfos.

  • Replacement of TypeSourceInfos within the hash map with Arc<TypeSourceInfo>s. The hash map unnecessarily required making copies of reused TypeSourceInfos.

  • Introducing the concept of a shareable type and rolling it out for all types. Previously, the engine checked only for changeability of types by using the TypeInfo::is_changeable method. The implementation of that method was extremely simplified, e.g. treating all structs and enums with generic arguments as changeable even if they were fully monomorphized. This resulted in over-bloating the engine with type instances that were actually unchangeable, but considered to be changeable. Also, strictly seen, the unchangeability (during unification and monomorphization) is not the only necessary criteria for reuse (or sharing) a type within the engine. Another important aspect is, as explained above, the differentiability by annotations. The PR introduces the notion of a shareable type which is both unchangeable and not differentiable by annotations. The optimization takes advantage of such types by storing them only once per source_id.

  • Elimination of extensive unnecessary inserts of new types during replace() calls. When replace()ing types during unifications, a new TypeSourceInfo instance was created for every replacement, ~46.000 times. This meant a heap-allocation of the TypeSourceInfo (16 bytes) and the Arc-contained TypeInfo (232 bytes), even if the replaced type was shareable and already available in the type engine. The optimization now reuses an already existing shareable type if available.

Robustness related to source_ids

The issues we had with properly providing the source_id in the insert method are explained in #5991. This PR removes the source_id from the new public API and calculates it internally within the engine, based on the type being inserted. This makes inserting of types both more robust and less verbose and eliminates the possibility of providing a semantically wrong source_id.

Note that the calculation of an optimal source_id done within the engine fully corresponds to the "calculations" we currently have at call sites.

E.g., when inserting enums, previously we always had to write type_engine.insert(engines, TypeInfo::Enum(decl_id), enum_decl.span.source_id()). Fetching the source_id from the enum declaration is now done within the insert_enum method: type_engine.insert_enum(engines, decl_id).

Note that for certain types we will want to change the current behavior and either provide a source_id or pick a more suitable one. E.g, even when inserting Unknowns, we will want to have source_ids, if possible. This will be done in #6603, but again, together with providing a robust API that will be difficult to misuse.

Simplicity

As already mentioned in some of the examples above, the new id_of_<type>() and insert_<type> methods provide much simpler and less verbose API to use. Introducing those methods remove a lot of boilerplate code from the callers. Also, the individual insert_<type> methods are additionally optimized for inserting the particular type.

The common insert method is intended to be used only in cases where the inserted TypeInfo is not known at the call site.

Here are some examples of the new API versus the existing one:

Before After
type_engine.insert(engines, TypeInfo::Tuple(vec![]), None) type_engine.id_of_unit()
type_engine.insert(engines, TypeInfo::Unknown, None) type_engine.new_unknown()
type_engine.insert(engines, TypeInfo::UnsignedInteger(IntegerBits::SixtyFour), None) type_engine.id_of_u64()

For more complex types, like, e.g., TypeInfo::Tuple, the difference in inserting is even more prominent, as can be seen from the diffs of numerous code lines deleted in this PR.

Checklist

  • I have linked to any relevant issues.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation where relevant (API docs, the reference, and the Sway book).
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added (or requested a maintainer to add) the necessary Breaking* or New Feature labels where relevant.
  • I have done my best to ensure that my PR adheres to the Fuel Labs Code Review Standards.
  • I have requested a review from the relevant team or maintainers.

Copy link

codspeed-hq bot commented Oct 6, 2024

CodSpeed Performance Report

Merging #6613 will improve performances by 13.56%

Comparing ironcev/rewrite-type-engine (193e06f) with master (16318d3)

Summary

⚡ 1 improvements
✅ 21 untouched benchmarks

Benchmarks breakdown

Benchmark master ironcev/rewrite-type-engine Change
compile 6.1 s 5.4 s +13.56%

@ironcev ironcev self-assigned this Oct 6, 2024
@ironcev ironcev added compiler General compiler. Should eventually become more specific as the issue is triaged compiler: frontend Everything to do with type checking, control flow analysis, and everything between parsing and IRgen labels Oct 6, 2024
@ironcev
Copy link
Member Author

ironcev commented Oct 6, 2024

EDIT: The transient bug whose symptoms are described below is now fixed. The issue was in an already existing (long existing 😉) underlying bug in the PartialEqWithEngines implementation for the TypeInfo::Tuple, which is now fixed as a part of this PR.


While working on the improvements I've noticed an E2E test suddenly failing with this error message:

 7 │+  --> sway-lib-core/src/slice.sw:7:24
 8 │+   |
 9 │+ 5 |   impl<T> &__slice[T] {
10 │+ 6 |       pub fn ptr(self) -> raw_ptr {
11 │+ 7 |           let (ptr, _) = asm(s: self) {
12 │+   |  ________________________^
13 │+ 8 | |             s: (raw_ptr, u64)
14 │+ 9 | |         };
15 │+   | |_________^ Mismatched types.
16 │+expected: ()
17 │+found:    (pointer, u64).
18 │+help: Variable declaration's type annotation does not match up with the assigned expression's type.
19 │+10 |           ptr
20 │+11 |       }

Note that the wrongly reported compile error is in core library, means it should affect almost all the tests but it didn't. Indeed, re-running that test didn't reproduce the error.

I've kept re-running the E2E suit on another machine for two days straight (!!) before the error got provoked again, this time in another file:

   --> sway-lib-std/src/math.sw:316:28
    |
314 |           }
315 |   
316 |           let (a, b, c, d) = asm(r1: self) {
    |  ____________________________^
317 | |             r1: (u64, u64, u64, u64)
318 | |         };
    | |_________^ Mismatched types.
expected: ()
found:    (u64, u64, u64, u64).
help: Variable declaration's type annotation does not match up with the assigned expression's type.
319 |           if a != 0 {
320 |               return a.log2().as_u256() + 0xc0u256;

Up to now, I couldn't find a logical error in code, and chasing a non-reproducible issue is difficult indeed. But the issue is for sure still there. After opening this draft PR, the forc-unit-tests failed because of the same reason. It is green now, but the cargo-run-e2e-test-release failed on one test I am sure for the same reason.

It is interesting (and for sure good!), that the tests are failing on the build machine, but when run locally it takes literally tens of thousand of compilations to get the error.

EDIT: All tests are green now, but that was to expect. I'll keep the PR in draft until I am sure that the issue is actually fixed.

Co-authored-by: Joshua Batty <joshpbatty@gmail.com>
@ironcev ironcev marked this pull request as ready for review October 22, 2024 14:21
@ironcev ironcev requested review from a team as code owners October 22, 2024 14:21
IGI-111
IGI-111 previously approved these changes Oct 22, 2024
Copy link
Contributor

@IGI-111 IGI-111 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the new API. :shipit:

@IGI-111 IGI-111 requested a review from a team October 22, 2024 15:19
@ironcev
Copy link
Member Author

ironcev commented Nov 4, 2024

All the GC issues reported in #6665, are now fixed on this branch. Similarly to the transient bug explained above, the root causes for GC related issues were already existing and the rewrite of the TypeEngine made them occur more frequently.

This PR now fixes, or at least mitigate to an extent, those GC issues which were possible to fix by changing only the TypeEngine internals.

However, GC still has issues whose solving requires restructurings outside of the TypeEngine. E.g. fixing #6687 requires changes in the DeclEngine as well.

The question is now how to proceed with this PR?

My thinking is to merge it as it is now, although knowing that not all of the GC issues are solved. The reasoning behind that is the following:

  • this version already has more stable GC then the one on the master.
    When running GC tests on our whole E2E test suite (both should_pass and should_fail test) only one test fails in this PR (should_pass/stdlib/option_eq/src/main.sw), while numerous tests fail on master.

  • the option_eq test fails on master as well, proving that this is not a new error caused by the changes in this branch.

  • I've manually tried LSP from this PR on several representative projects end never experienced a crash. (Of course, should_pass/stdlib/option_eq project will cause LSP crash.)

  • this PR is already vast in scope and brings significant changes. Extending it to rewrite the DeclEngine and address other GC related topics would be a huge scope-explosion. We originally planned to tackle the DeclEngine and "GC-friendliness" in separate PRs.

  • keeping this PR on the top of other changes already proved to be a high effort. The PR touches the most prominent files on the frontend and is in constant collision with other changes done on the frontend.

@sdankel @JoshuaBatty @tritao @IGI-111 What do you think about this proposal? Can you also please try the local version of the LSP on your projects to see if you can experience any crashes?

@tritao
Copy link
Contributor

tritao commented Nov 4, 2024

Makes sense to me to get this one merged ASAP.

tritao
tritao previously approved these changes Nov 4, 2024
@ironcev ironcev marked this pull request as ready for review November 4, 2024 15:23
Cargo.toml Outdated Show resolved Hide resolved
@tritao tritao merged commit 73e1a7a into master Nov 6, 2024
40 checks passed
@tritao tritao deleted the ironcev/rewrite-type-engine branch November 6, 2024 09:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler: frontend Everything to do with type checking, control flow analysis, and everything between parsing and IRgen compiler General compiler. Should eventually become more specific as the issue is triaged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove error-prone source_id from TypeEngine::insert() and make inserting API more robust and less verbose
6 participants