Rewrite `TypeEngine` for performance, robustness, and simplicity #6613
CodSpeed Performance Report: Merging #6613 will improve performances by 13.56%.
EDIT: The transient bug whose symptoms are described below is now fixed. The issue was in an already existing (long existing 😉) underlying bug in the
While working on the improvements, I noticed an E2E test suddenly failing with this error message:
Note that the wrongly reported compile error is in
I kept re-running the E2E suite on another machine for two days straight (!!) before the error was provoked again, this time in another file:
Up to now, I couldn't find a logical error in the code, and chasing a non-reproducible issue is difficult indeed. But the issue is for sure still there. After opening this draft PR, the
It is interesting (and for sure good!) that the tests are failing on the build machine, but when run locally it takes literally tens of thousands of compilations to get the error.
EDIT: All tests are green now, but that was to be expected. I'll keep the PR in draft until I am sure that the issue is actually fixed.
Co-authored-by: Joshua Batty <joshpbatty@gmail.com>
Co-authored-by: João Matos <joao@tritao.eu>
Love the new API.
All the GC issues reported in #6665 are now fixed on this branch. Similarly to the transient bug explained above, the root causes for GC related issues were already existing and the rewrite of the
This PR now fixes, or at least mitigates to an extent, those GC issues which were possible to fix by changing only the
However, GC still has issues whose solving requires restructurings outside of the
The question is now how to proceed with this PR? My thinking is to merge it as it is now, although knowing that not all of the GC issues are solved. The reasoning behind that is the following:
@sdankel @JoshuaBatty @tritao @IGI-111 What do you think about this proposal? Can you also please try the local version of the LSP on your projects to see if you can experience any crashes?
Makes sense to me to get this one merged ASAP.
Description
This PR implements a major rewrite of the `TypeEngine`. The rewrite:
- `source_id` (see: Remove error-prone `source_id` from `TypeEngine::insert()` and make inserting API more robust and less verbose #5991). The new API forbids not providing `source_id`s or providing semantically questionable or non-optimal ones.

The PR also removes the obsolete and unused `TypeInfo::Storage`.

The PR does not address the following:
- `TypeEngine`'s "garbage-collection-(un)friendliness" (see: Optimize `TypeEngine` for garbage collection #6603).
- `TypeInfo::Custom` and `TypeInfo::TraitType` types within the `TypeEngine` (see: Improve handling of `TypeInfo::Custom` and `TypeInfo::TraitType` within the `TypeEngine` #6601).
- `TypeEngine`. E.g., removing unnecessary insertions after `resolve()`ing or `monomorphyze()`ing types will be done as a part of `DeclEngine` optimization.

Closes #5991.
Shareable types
The PR formalizes the notion of a shareable type within the `TypeEngine` (strictly speaking, a shareable `TypeInfo`). A shareable type is a type that is both:
- unchangeable during unification and monomorphization, and
- not differentiable by annotations from another `TypeInfo` that would, purely from the type perspective, be the same type.

E.g., `u64` or `MyStruct<u64, bool>` are unchangeable, while `Numeric`, `Unknown`, or `MyStruct<T1, T1>` are changeable.

E.g., if two variables `a` and `b` are both declared with the type `[u64; 2]`, they have the same type, but the two types differ in the spans assigned to the "u64"s and the "2"s. They are treated as different within the `TypeEngine`, and thus as non-shareable.

Shareability of a type is crucial for reusing the `TypeSourceInfo` instances within the type engine. Shareable types can be given different `TypeId`s without the need to newly allocate a `TypeSourceInfo`.
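The two criteria and the reuse of `TypeSourceInfo` instances can be illustrated with a small, self-contained Rust sketch. It is not the compiler's implementation: `ToyTypeEngine`, the reduced `TypeInfo` enum, the `span_start` field standing in for span annotations, and the `(TypeInfo, Option<SourceId>)` map key are all assumptions made for illustration. Unlike the real engine, the toy also clones the `TypeInfo` to build the map key.

```rust
use std::collections::HashMap;
use std::sync::Arc;

type TypeId = usize;
type SourceId = u32;

// Hypothetical, heavily reduced stand-in for the compiler's `TypeInfo`.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum TypeInfo {
    Unknown,
    Numeric,
    U64,
    Bool,
    // A generic parameter such as `T1`.
    Generic(String),
    // `name<type_args...>`, e.g. `MyStruct<u64, bool>`.
    Struct { name: String, type_args: Vec<TypeInfo> },
    // `[elem; len]`; `span_start` models the span annotation carried by the type.
    Array { elem: Box<TypeInfo>, len: usize, span_start: usize },
}

impl TypeInfo {
    // Changeable types can be replaced during unification or monomorphization.
    fn is_changeable(&self) -> bool {
        match self {
            TypeInfo::Unknown | TypeInfo::Numeric | TypeInfo::Generic(_) => true,
            TypeInfo::U64 | TypeInfo::Bool => false,
            TypeInfo::Struct { type_args, .. } => type_args.iter().any(|t| t.is_changeable()),
            TypeInfo::Array { elem, .. } => elem.is_changeable(),
        }
    }

    // Shareable = unchangeable and not differentiable by annotations.
    fn is_shareable(&self) -> bool {
        match self {
            // In this toy model arrays always carry spans, so they are never shared.
            TypeInfo::Array { .. } => false,
            TypeInfo::Struct { type_args, .. } => type_args.iter().all(|t| t.is_shareable()),
            other => !other.is_changeable(),
        }
    }
}

#[allow(dead_code)]
#[derive(Debug)]
struct TypeSourceInfo {
    type_info: TypeInfo,
    source_id: Option<SourceId>,
}

#[derive(Default)]
struct ToyTypeEngine {
    // Every `TypeId` is an index into this slab.
    slab: Vec<Arc<TypeSourceInfo>>,
    // Shareable types are stored only once per `source_id`.
    shareable: HashMap<(TypeInfo, Option<SourceId>), Arc<TypeSourceInfo>>,
}

impl ToyTypeEngine {
    fn insert(&mut self, type_info: TypeInfo, source_id: Option<SourceId>) -> TypeId {
        let tsi = if type_info.is_shareable() {
            // Reuse the existing allocation if this type was already inserted.
            let key = (type_info.clone(), source_id);
            let shared = self
                .shareable
                .entry(key)
                .or_insert_with(|| Arc::new(TypeSourceInfo { type_info, source_id }));
            Arc::clone(shared)
        } else {
            // Non-shareable types always get their own `TypeSourceInfo`.
            Arc::new(TypeSourceInfo { type_info, source_id })
        };
        self.slab.push(tsi);
        self.slab.len() - 1
    }
}

fn main() {
    let mut engine = ToyTypeEngine::default();
    let a = engine.insert(TypeInfo::U64, None);
    let b = engine.insert(TypeInfo::U64, None);
    // Different `TypeId`s, one shared `TypeSourceInfo` allocation.
    assert_ne!(a, b);
    assert!(Arc::ptr_eq(&engine.slab[a], &engine.slab[b]));

    let u1 = engine.insert(TypeInfo::Unknown, None);
    let u2 = engine.insert(TypeInfo::Unknown, None);
    // `Unknown` is changeable, so each insert allocates anew in this toy.
    assert!(!Arc::ptr_eq(&engine.slab[u1], &engine.slab[u2]));
}
```

In this sketch every insert still yields a fresh `TypeId`, mirroring the statement that shareable types can be given different `TypeId`s without a new `TypeSourceInfo` allocation.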
Performance improvements
The cumulative effect of the performance improvements on the compilation of the real-world Spark Orderbook workspace is given below. Compilation here means the frontend compilation, up to the IR generation (`forc check`).
Compilation time
Memory consumption
Applied optimizations:
- Replacement of expensive `insert` calls with compile time constants for built-in types. Built-in types like `!`, `bool`, `()`, `u8`, etc. are inserted into the engine at its creation at predefined slots within the `slab`. The `id_of_<built-in-type>` methods just return those predefined `TypeId`s, effectively being compiled down to constants. Calls like `type_engine.insert(engines, TypeInfo::Boolean, None)` are replaced with the maximally optimized and non-verbose `type_engine.id_of_bool()`.
- Elimination of extensive creation of `TypeSourceInfo`s for `TypeInfo::Unknown/Numeric`s. `Unknown` and `Numeric` are inserted into the engine ~50,000 times. Each insert used to create a new instance of `TypeSourceInfo` with the `source_id` set to `None`. The optimization replaces those ~50,000 instances with two predefined singleton instances, one for `Unknown` + `None` and one for `Numeric` + `None`. (Note that when implementing Optimize `TypeEngine` for garbage collection #6603, we will also want to bind `Unknown`s and `Numeric`s to `source_id`s different than `None`, but we will still want to reuse the `TypeInfo` instances.)
- Elimination of extensive temporary heap-allocations during hash calculation. The custom hasher obtained by `make_hasher` required a `TypeSourceInfo` to calculate the hash, and it was called every time `insert` was called, ~530,000 times. Getting the `TypeSourceInfo` originally required cloning the `TypeInfo`, which, depending on the concrete `TypeInfo` instance, could cause heap allocations, and also heap-allocating that copy within an `Arc`. The hash was calculated regardless of whether the type could be stored in the hash map of reusable types. The optimization removes the hashing step if the type is not shareable, removes the cloning of the `TypeInfo`, and introduces a custom `compute_hash_without_heap_allocation` method that produces the same hash as `make_hasher` but without unnecessary temporary heap-allocations of `TypeSourceInfo`s.
- Replacement of `TypeSourceInfo`s within the hash map with `Arc<TypeSourceInfo>`s. The hash map unnecessarily required making copies of reused `TypeSourceInfo`s.
- Introducing the concept of a shareable type and rolling it out for all types. Previously, the engine checked only for changeability of types by using the `TypeInfo::is_changeable` method. The implementation of that method was extremely simplified, e.g., treating all structs and enums with generic arguments as changeable even if they were fully monomorphized. This resulted in over-bloating the engine with type instances that were actually unchangeable, but considered to be changeable. Also, strictly seen, unchangeability (during unification and monomorphization) is not the only necessary criterion for reusing (or sharing) a type within the engine. Another important aspect is, as explained above, the differentiability by annotations. The PR introduces the notion of a shareable type, which is both unchangeable and not differentiable by annotations. The optimization takes advantage of such types by storing them only once per `source_id`.
- Elimination of extensive unnecessary inserts of new types during `replace()` calls. When `replace()`ing types during unifications, a new `TypeSourceInfo` instance was created for every replacement, ~46,000 times. This meant a heap-allocation of the `TypeSourceInfo` (16 bytes) and the `Arc`-contained `TypeInfo` (232 bytes), even if the replaced type was shareable and already available in the type engine. The optimization now reuses an already existing shareable type if available.

Robustness related to `source_id`s
The issues we had with properly providing the `source_id` in the `insert` method are explained in #5991. This PR removes the `source_id` from the new public API and calculates it internally within the engine, based on the type being inserted. This makes inserting types both more robust and less verbose, and eliminates the possibility of providing a semantically wrong `source_id`.
Note that the calculation of an optimal `source_id` done within the engine fully corresponds to the "calculations" we currently have at call sites.
E.g., when inserting enums, previously we always had to write `type_engine.insert(engines, TypeInfo::Enum(decl_id), enum_decl.span.source_id())`. Fetching the `source_id` from the enum declaration is now done within the `insert_enum` method: `type_engine.insert_enum(engines, decl_id)`.
Note that for certain types we will want to change the current behavior and either provide a `source_id` or pick a more suitable one. E.g., even when inserting `Unknown`s, we will want to have `source_id`s, if possible. This will be done in #6603, but again, together with providing a robust API that will be difficult to misuse.
Simplicity
As already mentioned in some of the examples above, the new `id_of_<type>()` and `insert_<type>` methods provide a much simpler and less verbose API. Introducing those methods removes a lot of boilerplate code from the callers. Also, the individual `insert_<type>` methods are additionally optimized for inserting the particular type.
The common `insert` method is intended to be used only in cases where the inserted `TypeInfo` is not known at the call site.
Here are some examples of the new API versus the existing one:
| Existing API | New API |
|---|---|
| `type_engine.insert(engines, TypeInfo::Tuple(vec![]), None)` | `type_engine.id_of_unit()` |
| `type_engine.insert(engines, TypeInfo::Unknown, None)` | `type_engine.new_unknown()` |
| `type_engine.insert(engines, TypeInfo::UnsignedInteger(IntegerBits::SixtyFour), None)` | `type_engine.id_of_u64()` |
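To make the shape of the new API more concrete, here is a minimal Rust sketch of the two ideas behind it: built-in types living at predefined slab slots, so that `id_of_<type>()` only returns a constant, and a type-specific insert method that derives the `source_id` itself. Everything in it is a simplified assumption rather than the actual sway-core code: `ToyTypeEngine`, the reduced `TypeInfo`, the `EnumDecl` struct, the slot constants, and passing a `decls` slice where the real methods take `engines`.

```rust
type SourceId = u32;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct TypeId(usize);

// Hypothetical, reduced stand-in for the compiler's `TypeInfo`.
#[derive(Clone, Debug, PartialEq)]
enum TypeInfo {
    Unit,
    Bool,
    U64,
    Unknown,
    // Holds the index of the enum declaration.
    Enum(usize),
}

// Hypothetical declaration record; only the `source_id` matters here.
struct EnumDecl {
    source_id: Option<SourceId>,
}

struct ToyTypeEngine {
    slab: Vec<(TypeInfo, Option<SourceId>)>,
}

impl ToyTypeEngine {
    // Predefined slots; the order must match the inserts in `new`.
    const UNIT: usize = 0;
    const BOOL: usize = 1;
    const U64: usize = 2;

    fn new() -> Self {
        let slab = vec![
            (TypeInfo::Unit, None),
            (TypeInfo::Bool, None),
            (TypeInfo::U64, None),
        ];
        Self { slab }
    }

    // No lookup and no insert: the id is known up front.
    fn id_of_unit(&self) -> TypeId { TypeId(Self::UNIT) }
    fn id_of_bool(&self) -> TypeId { TypeId(Self::BOOL) }
    fn id_of_u64(&self) -> TypeId { TypeId(Self::U64) }

    // Each unknown needs its own slot because it can be replaced later.
    fn new_unknown(&mut self) -> TypeId {
        self.push(TypeInfo::Unknown, None)
    }

    // The caller passes only the declaration id; the engine picks the `source_id`.
    fn insert_enum(&mut self, decls: &[EnumDecl], decl_id: usize) -> TypeId {
        let source_id = decls[decl_id].source_id;
        self.push(TypeInfo::Enum(decl_id), source_id)
    }

    fn get(&self, id: TypeId) -> &(TypeInfo, Option<SourceId>) {
        &self.slab[id.0]
    }

    fn push(&mut self, type_info: TypeInfo, source_id: Option<SourceId>) -> TypeId {
        self.slab.push((type_info, source_id));
        TypeId(self.slab.len() - 1)
    }
}

fn main() {
    let mut type_engine = ToyTypeEngine::new();
    assert_eq!(type_engine.get(type_engine.id_of_unit()).0, TypeInfo::Unit);
    assert_eq!(type_engine.get(type_engine.id_of_u64()).0, TypeInfo::U64);

    let first = type_engine.new_unknown();
    let second = type_engine.new_unknown();
    assert_ne!(first, second);

    let decls = vec![EnumDecl { source_id: Some(7) }];
    let enum_id = type_engine.insert_enum(&decls, 0);
    assert_eq!(type_engine.get(enum_id).1, Some(7));
}
```

The call sites never mention a `source_id`, which is the property the new API is after: there is nothing left for the caller to get wrong.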
For more complex types, like, e.g., `TypeInfo::Tuple`, the difference in inserting is even more prominent, as can be seen from the diffs of numerous code lines deleted in this PR.
Checklist
- `Breaking*` or `New Feature` labels where relevant.