diff --git a/docs/proposals/004-initialization.md b/docs/proposals/004-initialization.md new file mode 100644 index 0000000000..4605cd0309 --- /dev/null +++ b/docs/proposals/004-initialization.md @@ -0,0 +1,408 @@ +SP #004: Initialization +================= + +This proposal documents the desired behavior of initialization related language semantics, including default constructor, initialization list and variable initialization. + +Status +------ + +Status: Design Approved, implementation in-progress. + +Implemtation: N/A + +Author: Yong He + +Reviewer: Theresa Foley, Kai Zhang + +Background +---------- + +Slang has introduced several different syntax around initialization to provide syntactic compatibility with HLSL/C++. As the language evolve, there are many corners where +the semantics around initialization are not well-defined, and causing confusion or leading to surprising behaviors. + +This proposal attempts to provide a design on where we want the language to be in terms of how initialization is handled in all different places. + +Related Work +------------ + +C++ has many different ways and syntax to initialize an object: through explicit constructor calls, initialization list, or implicitly in a member/variable declaration. +A variable in C++ can also be in an uninitialized state after its declaration. HLSL inherits most of these behvior from C++ by allowing variables to be uninitialized. + +On the other hand, languages like C# and Swift has a set of well defined rules to ensure every variable is initialized after its declaration. +C++ allows using the initilization list syntax to initialize an object. The semantics of initialization lists depends on whether or not explicit constructors +are defined on the type. + +Proposed Approach +----------------- + +In this section, we document all concepts and rules related to initialization, constructors and initialization lists. + +### Default Initializable type + +A type is considered "default-initializable" if it provides a constructor that can take 0 arguments, so that it can be constructed with `T()`. + +### Variable Initialization + +Generally, a variable is considered uninitialized at its declaration site without an explicit value expression. +For example, +```csharp +struct MyType { int x ; } + +void foo() +{ + MyType t; // t is uninitialized. + var t1 : MyType; // same in modern syntax, t1 is uninitialized. +} +``` + +However, the Slang language has been allowing implicit initialization of variables whose types are default initializable types. +For example, +```csharp +struct MyType1 { + int x; + __init() { x = 0; } +} +void foo() { + MyType t1; // `t1` is initialized with a call to `__init`. +} +``` + +We would like to move away from this legacy behavior towards a consistent semantics of never implicitly initializing a variable. +To maintain backward compatibility, we will keep the legacy behavior, but remove the implicit initialization when the variable is defined +in modern syntax: +```csharp +void foo() { + var t1: MyType; // `t1` will no longer be initialized. +} +``` +We will also remove the default initilaization semantics for traditional syntax in modern Slang modules that comes with an explicit `module` declaration. + +Trying to use a variable without initializing it first is an error. +For backward compatibility, we will introduce a compiler option to turn this error into a warning, but we may deprecate this option in the future. + +### Generic Type Parameter + +A generic type parameter is not considered default-initializable by-default. As a result, the following code should leave `t` in an uninitialized state: +```csharp +void foo() +{ + T t; // `t` is uninitialized at declaration. +} +``` + +### Synthesis of constructors for member initialization + +If a type already defines any explicit constructors, do not synthesize any constructors for initializer list call. An intializer list expression +for the type must exactly match one of the explicitly defined constructors. + +If the type doesn't provide any explicit constructors, the compiler need to synthesize the constructors for the calls that that the intializer +lists translate into, so that an initializer list expression can be used to initialize a variable of the type. + +For each type, we will synthesize one constructor at the same visibility of the type itself: + +The signature for the synthesized initializer for type `V struct T` is: +```csharp +V T.__init(member0: typeof(member0) = default(member0), member1 : typeof(member1) = default(member1), ...) +``` +where `V` is a visibility modifier, `(member0, member1, ... memberN)` is the set of members that has visibility `V`, and `default(member0)` +is the value defined by the initialization expression in `member0` if it exist, or the default value of `member0`'s type. +If `member0`'s type is not default initializable and the the member doesn't provide an initial value, then the parameter will not have a default value. + +The synthesized constructor will be marked as `[Synthesized]` by the compiler, so the call site can inject additional compatibility logic when calling a synthesized constructor. + +The body of the constructor will initialize each member with the value comming from the corresponding constructor argument if such argument exists, +otherwise the member will be initialized to its default value either defined by the init expr of the member, or the default value of the type if the +type is default-initializable. If the member type is not default-initializable and a default value isn't provided on the member, then such the constructor +synthesis will fail and the constructor will not be added to the type. Failure to synthesis a constructor is not an error, and an error will appear +if the user is trying to initialize a value of the type in question assuming such a constructor exist. + +Note that if every member of a struct contains a default expression, the synthesized `__init` method can be called with 0 arguments, however, this will not cause a variable declaration to be implicitly initialized. Implicit initialization is a backward compatibility feature that only work for user-defined `__init()` methods. + +### Single argument constructor call + +Call to a constructor with a single argument is always treated as a syntactic sugar of type cast: +```csharp +int x = int(1.0f); // is treated as (int) 1.0f; +MyType y = MyType(arg); // is treated as (MyType)arg; +MyType x = MyType(y); // equivalent to `x = y`. +``` + +The compiler will attempt to resolve all type casts using type coercion rules, if that failed, will fall back to resolve it as a constructor call. + +### Initialization List + +Slang allows initialization of a variable by assigning it with an initialization list. +Generally, Slang will always try to resolve initialization list coercion as if it is an explicit constructor invocation. +For example, given: +```csharp +S obj = {1,2}; +``` +Slang will try to convert the code into: +```csharp +S obj = S(1,2); +``` + +Following the same logic, an empty initializer list will translate into a default-initialization: +```csharp +S obj = {}; +// equivalent to: +S obj = S(); +``` + +Note that initializer list of a single argument does not translate into a type cast, unlike the constructor call syntax. Initializing with a single element in the intializer list always translates directly into a constructor call. For example: +```csharp +void test() +{ + MyType t = {1}; + // translates to direct constructor call: + // MyType t = MyType.__init(1); + // which is NOT the same as: + // MyType t = MyType(t) + // or: + // MyType t = (MyType)t; +} +``` + +If the above code passes type check, then it will be used as the way to initialize `obj`. + +If the above code does not pass type check, and if there is only one constructor for`MyType` that is synthesized as described in the previous section (and therefore marked as `[Synthesized]`, Slang continues to check if `S` meets the standard of a "legacy C-style struct` type. +A type is a "legacy C-Style struct" if all of the following conditions are met: +- It is a user-defined struct type or a basic scalar, vector or matrix type, e.g. `int`, `float4x4`. +- It does not contain any explicit constructors defined by the user. +- All its members have the same visibility as the type itself. +- All its members are legacy C-Style structs or arrays of legacy C-style structs. +Note that C-Style structs are allowed to have member default values. +In such case, we perform a legacy "read data" style consumption of the initializer list to synthesize the arguments to call the constructor, so that the following behavior is valid: + +```csharp +struct Inner { int x; int y; }; +struct Outer { Inner i; Inner j; } + +// Initializes `o` into `{ Inner{1,2}, Inner{3,0} }`, by synthesizing the +// arguments to call `Outer.__init(Inner(1,2), Inner(3, 0))`. +Outer o = {1, 2, 3}; +``` + +If the type is not a legacy C-Style struct, Slang should produce an error. + +### Legacy HLSL syntax to cast from 0 + +HLSL allows a legacy syntax to cast from literal `0` to a struct type, for example: +```hlsl +MyStruct s { int x; } +void test() +{ + MyStruct s = (MyStruct)0; +} +``` + +Slang treats this as equivalent to a empty-initialization: +```csharp +MyStruct s = (MyStruct)0; +// is equivalent to +MyStruct s = {}; +``` + +Examples +------------------- +```csharp + +// Assume everything below is public unless explicitly declared. + +struct Empty +{ + // compiler synthesizes: + // __init(); +} +void test() +{ + Empty s0 = {}; // Works, `s` is considered initialized via ctor call. + Empty s1; // `s1` is considered uninitialized. +} + +struct CLike +{ + int x; int y; + // compiler synthesizes: + // __init(int x, int y); +} +void test1() +{ + CLike c0; // `c0` is uninitialized. + + // case 1: initialized with synthesized ctor call using legacy logic to form arguments, + // and `c1` is now `{0,0}`. + // (we will refer to this scenario as "initialized with legacy logic" for + // the rest of the examples): + CLike c1 = {}; + + // case 2: initialized with legacy initializaer list logic, `c1` is now `{1,0}`: + CLike c2 = {1}; + + // case 3: initilaized with ctor call `CLike(1,2)`, `c3` is now `{1,2}`: + CLike c3 = {1, 2}; +} + +struct ExplicitCtor +{ + int x; + int y; + __init(int x) {...} + // compiler does not synthesize any ctors. +} +void test2() +{ + ExplicitCtor e0; // `e0` is uninitialized. + ExplicitCtor e1 = {1}; // calls `__init`. + ExplicitCtor e2 = {1, 2}; // error, no ctor matches initializer list. +} + +struct DefaultMember { + int x = 0; + int y = 1; + // compiler synthesizes: + // __init(int x = 0, int y = 1); +} +void test3() +{ + DefaultMember m; // `m` is uninitialized. + DefaultMember m1 = {}; // calls `__init()`, initialized to `{0,1}`. + DefaultMember m2 = {1}; // calls `__init(1)`, initialized to `{1,1}`. + DefaultMember m3 = {1,2}; // calls `__init(1,2)`, initialized to `{1,2}`. +} + +struct PartialInit { + // warning: not all members are initialized. + // members should either be all-uninitialized or all-initialized with + // default expr. + int x; + int y = 1; + // compiler synthesizes: + // __init(int x, int y = 1); +} +void test4() +{ + PartialInit i; // `i` is not initialized. + PartialInit i1 = {2}; // calls `__init`, result is `{2,1}`. + PartialInit i2 = {2, 3}; // calls `__init`, result is {2, 3} +} + +struct PartialInit2 { + int x = 1; + int y; // warning: not all members are initialized. + // compiler synthesizes: + // __init(int x, int y); +} +void test5() +{ + PartialInit2 j; // `j` is not initialized. + PartialInit2 j1 = {2}; // error, no ctor match. + PartialInit2 j2 = {2, 3}; // calls `__init`, result is {2, 3} +} + +public struct Visibility1 +{ + internal int x; + public int y = 0; + // the compiler does not synthesize any ctor. + // the compiler will try to synthesize: + // public __init(int y); + // but then it will find that `x` cannot be initialized. + // so this synthesis will fail and no ctor will be added + // to the type. +} +void test6() +{ + Visibility1 t = {0, 0}; // error, no matching ctor + Visibility1 t1 = {}; // error, no matching ctor + Visibility1 t2 = {1}; // error, no matching ctor +} + +public struct Visibility2 +{ + // Visibility2 type contains members of different visibility, + // which disqualifies it from being considered as C-style struct. + // Therefore we will not attempt the legacy fallback logic for + // initializer-list syntax. + internal int x = 1; + public int y = 0; + // compiler synthesizes: + // public __init(int y = 0); +} +void test7() +{ + Visibility2 t = {0, 0}; // error, no matching ctor. + Visibility2 t1 = {}; // OK, initialized to {1,0} via ctor call. + Visibility2 t2 = {1}; // OK, initialized to {1,1} via ctor call. +} + +internal struct Visibility3 +{ + // Visibility3 type is considered as C-style struct. + // Because all members have the same visibility as the type. + // Therefore we will attempt the legacy fallback logic for + // initializer-list syntax. + // Note that c-style structs can still have init exprs on members. + internal int x; + internal int y = 2; + // compiler synthesizes: + // internal __init(int x, int y = 2); +} +internal void test8() +{ + Visibility3 t = {0, 0}; // OK, initialized to {0,0} via ctor call. + Visibility3 t1 = {1}; // OK, initialized to {1,2} via ctor call. + Visibility3 t2 = {}; // OK, initialized to {0, 2} via legacy logic. +} + +internal struct Visibility4 +{ + // Visibility4 type is considered as C-style struct. + // And we still synthesize a ctor for member initialization. + // Because Visiblity4 has no public members, the synthesized + // ctor will take 0 arguments. + internal int x = 1; + internal int y = 2; + // compiler synthesizes: + // internal __init(int x = 1, int y = 2); +} +internal void test9() +{ + Visibility4 t = {0, 0}; // OK, initialized to {0,0} via ctor call. + Visibility4 t1 = {3}; // OK, initialized to {3,2} via ctor call. + Visibility4 t2 = {}; // OK, initialized to {1,2} via ctor call. +} +``` + +### Zero Initialization + +The Slang compiler supported an option to force zero-initialization of all local variables. +This is currently implemented by adding `IDefaultInitializable` conformance to all user +defined types. With the direction we are heading, we should remove this option in the future. +For now we can continue to provide this functionality but through an IR rewrite pass instead +of changing the frontend semantics. + +When users specifies `-zero-initialize`, we should still use the same front-end logic for +all the checking. After lowering to IR, we should insert a `store` after all `IRVar : T` to +initialize them to `defaultConstruct(T)`. + + +Q&A +----------- + +### Should global static and groupshared variables be default initialized? + +Similar to local variables, all declarations are not default initialized at its declaration site. +In particular, it is difficult to efficiently initialized global variables safely and correctly in a general way on platforms such as Vulkan, +so implicit initialization for these variables can come with serious performance consequences. + +### Should `out` parameters be default initialized? + +Following the same philosphy of not initializing any declarations, `out` parameters are also not default-initialized. + +Alternatives Considered +----------------------- + +One important decision point is whether or not Slang should allow variables to be left in uninitialized state after its declaration as it is allowed in C++. In contrast, C# forces everything to be default initialized at its declaration site, which come at the cost of incurring the burden to developers to come up with a way to define the default value for each type. +Our opinion is we want to allow things as uninitialized, and to have the compiler validation checks to inform +the developer something is wrong if they try to use a variable in uninitialized state. We believe it is desirable to tell the developer what's wrong instead of using a heavyweight mechanism to ensure everything is initialized at declaration sites, which can have non-trivial performance consequences for GPU programs, especially when the variable is declared in groupshared memory. \ No newline at end of file