Reduce overhead and indirection in enum and namespace code generation #54244

Open
4 of 5 tasks
dead-claudia opened this issue May 14, 2023 · 7 comments
Labels
Awaiting More Feedback This means we'd like to hear from more people who would be helped by this feature Suggestion An idea for TypeScript

Comments

@dead-claudia

dead-claudia commented May 14, 2023

Suggestion

🔍 Search Terms

enum emit

✅ Viability Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript/JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. library functionality, non-ECMAScript syntax with JavaScript output, new syntax sugar for JS, etc.)
  • This feature would agree with the rest of TypeScript's Design Goals.

⭐ Suggestion

Note: this is a partial dupe of #27604 but has a few key differences:

  1. It proposes a non-breaking change instead.
  2. It extends the change to namespaces as well.
  3. It in theory isn't tree-shakeable, but can still be optimized out by optimizers.

Suppose we have this code:

namespace Foo {
    export function bar() { return 1 }
}

enum Foo {
    One,
    Two,
    Three,
}

Currently, this results in the following emit:

"use strict";
var Foo;
(function (Foo) {
    function bar() { return 1; }
    Foo.bar = bar;
})(Foo || (Foo = {}));
(function (Foo) {
    Foo[Foo["One"] = 0] = "One";
    Foo[Foo["Two"] = 1] = "Two";
    Foo[Foo["Three"] = 2] = "Three";
})(Foo || (Foo = {}));

A much better emit would be this (comments explaining each bit):

"use strict";
// Forward declare the namespace
// In modules and functions, this should just be `var Foo = {}`
var Foo = Foo || {};

// `namespace Foo {`
// Deduplicate like you already do with `let`/`const` when targeting ES5
function bar() { return 1; }
Foo.bar = bar;
// `}`

// `enum Foo {`
// `  One,`
Foo.One = 0;
Foo[0] = "One";
// `  Two,`
Foo.Two = 1;
Foo[1] = "Two";
// `  Three,`
Foo.Three = 2;
Foo[2] = "Three";
// `}`

This would carry the same observable semantics it currently does (complete with Object.prototype observability).

You may be able to go one step further and just build the object literal directly. This of course is observable (Object.prototype setters would not be invoked), but would result in ideal code for most cases. You'd only be able to do this for cases like modules, though.

"use strict";
// Hoist `bar`
function bar() { return 1; }
var Foo = {
// `namespace Foo {`
    bar: bar,
// `}`

// `enum Foo {`
// `One,`
    One: 0,
    0: "One",
// `Two,`
    Two: 1,
    1: "Two",
// `Three,`
    Three: 2,
    2: "Three",
// `}`
}
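
To make the `Object.prototype` observability difference concrete, here is a minimal sketch. The setter installed on `Object.prototype` and the names `calls`, `assigned`, and `literal` are made up for illustration; no real code is assumed to do this.

```typescript
// Hypothetical setup: something has installed a setter for "One" on Object.prototype.
const calls: unknown[] = [];
Object.defineProperty(Object.prototype, "One", {
    configurable: true,
    set(value: unknown) {
        calls.push(value);
    },
});

// Assignment-style emit: a plain assignment walks the prototype chain,
// finds the setter, and calls it instead of creating an own property.
const assigned: Record<string, unknown> = {};
assigned["One"] = 0;

// Literal-style emit: the property is defined directly on the new object,
// so the inherited setter is never consulted.
const literal: Record<string, unknown> = { One: 0 };

const assignedHasOwn = Object.prototype.hasOwnProperty.call(assigned, "One");
const literalHasOwn = Object.prototype.hasOwnProperty.call(literal, "One");

// Clean up so nothing else is affected by the demo setter.
Reflect.deleteProperty(Object.prototype, "One");
```

The assignment form triggers the setter once and leaves `assigned` without an own `One` property, while the literal form bypasses the setter entirely. That is the only behavioral gap between the two proposed emits that this sketch is meant to show.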

I do want to note that while this can technically be larger for a few enums, that will be the exception, and it'll almost certainly be a wash after compression.

📃 Motivating Example

This reduces namespace and enum code gen overhead significantly and also makes the output much easier to optimize, both for engines (on startup) and for optimizer tools.

Currently, Terser has two issues around TypeScript enum generation, and this would serve to fix both.

The first issue linked, terser/terser#1064, features this code:

enum FooEnum {
    ONE,
    TWO,
    THREE
}

console.log(FooEnum.ONE, FooEnum[0])

The current emit when targeting modules is this:

var FooEnum;
(function (FooEnum) {
    FooEnum[FooEnum["ONE"] = 0] = "ONE";
    FooEnum[FooEnum["TWO"] = 1] = "TWO";
    FooEnum[FooEnum["THREE"] = 2] = "THREE";
})(FooEnum || (FooEnum = {}));
console.log(FooEnum.ONE, FooEnum[0]);

Terser, with --module --mangle --compress passes=2, minifies it to this (whitespace added for clarity):

var E;
!function(E){
    E[E.ONE=0]="ONE",
    E[E.TWO=1]="TWO",
    E[E.THREE=2]="THREE"
}(E||(E={})),
console.log(E.ONE,E[0]);

My proposed emit would be this:

var FooEnum = {};
FooEnum.ONE = 0;
FooEnum[0] = "ONE";
FooEnum.TWO = 1;
FooEnum[1] = "TWO";
FooEnum.THREE = 2;
FooEnum[2] = "THREE";

console.log(FooEnum.ONE, FooEnum[0])

Terser with the same settings compresses it to just this:

var o=0,l="ONE";console.log(o,l);

Adding one more pass (--compress passes=3) allows it to complete the enum inlining:

console.log(0,"ONE");

It's worth noting that Terser performs only one pass by default. This should probably be called out in whatever blog post announces the change, for visibility.
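
As a sanity check on the claim that the flat emit keeps the same runtime object, the sketch below compares a real enum against a hand-written copy of the proposed assignments. `Proposed` is a stand-in name for illustration, not compiler output.

```typescript
enum FooEnum {
    ONE,
    TWO,
    THREE,
}

// Hand-written copy of the proposed flat emit (an assumption, not real tsc output).
const Proposed: Record<string | number, string | number> = {};
Proposed["ONE"] = 0;
Proposed[0] = "ONE";
Proposed["TWO"] = 1;
Proposed[1] = "TWO";
Proposed["THREE"] = 2;
Proposed[2] = "THREE";

// JSON.stringify walks own enumerable keys in property order,
// so equal strings mean equal keys, values, and key order.
const enumShape = JSON.stringify(FooEnum);
const proposedShape = JSON.stringify(Proposed);
const sameShape = enumShape === proposedShape;
```

This only checks the resulting object, not side effects during construction such as the `Object.prototype` setter case mentioned earlier.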

💻 Use Cases

Quite frankly, the current emit is extremely difficult to optimize. Not only is it bloated, but it's also difficult for minifiers to detect - the motivating example elaborates on this further.

It'd also boost startup speed, even absent the minifier optimizations, since it's only setting properties, not also going through the ceremony of IIFEs plus Enum || (Enum = {}).

Also, consider this code:

export enum Foo {
    One,
    Two,
    Three,
}

The code it generates is this:

export var Foo;
(function (Foo) {
    Foo[Foo["One"] = 0] = "One";
    Foo[Foo["Two"] = 1] = "Two";
    Foo[Foo["Three"] = 2] = "Three";
})(Foo || (Foo = {}));

Terser, with --module --mangle --compress passes=2, minifies it to this (whitespace added for clarity):

export var Foo;
!function(o){
    o[o.One=0]="One",
    o[o.Two=1]="Two",
    o[o.Three=2]="Three"
}(Foo||(Foo={}));

My proposed emit would be this:

// Proposed
export var Foo = {};
Foo.One = 0;
Foo[0] = "One";
Foo.Two = 1;
Foo[1] = "Two";
Foo.Three = 2;
Foo[2] = "Three";

// Ideal
export var Foo = {
    One: 0,
    0: "One",
    Two: 1,
    1: "Two",
    Three: 2,
    2: "Three",
};

Once terser/terser#1389 gets resolved (it's about the export specifically), Terser should minify the proposed one to this:

export var Foo={One:0,0:"One",Two:1,1:"Two",Three:2,2:"Three"};

The "ideal" code would just minify to this without the complicated multi-pass processing.

esbuild --minify is not as sophisticated, but the benefits are still apparent. Here are the current and new emits after running them through it, side by side, to show the difference:

// Current
export var Foo;(function(e){e[e.One=0]="One",e[e.Two=1]="Two",e[e.Three=2]="Three"})(Foo||(Foo={}));
// Proposed
var e={};e.One=0,e[0]="One",e.Two=1,e[1]="Two",e.Three=2,e[2]="Three";export{e as Foo};
// Ideal
export var Foo={One:0,0:"One",Two:1,1:"Two",Three:2,2:"Three"};
@jcalz
Contributor

jcalz commented May 14, 2023

This doesn’t support merging, right? So it would be a breaking change.

Given how weird the current emit is, I think any proposal to change it would ideally acknowledge why the current version is the way it is (or else we presume that it was made weird for no reason) and then argue that the proposed changes would serve those purposes equally well or better… or why those purposes don’t need to be served anymore.

@fatcerberus

If what you ultimately want is for the enum values to be inlined, why not just skip a step and let TS do that for you via const enum?

@dead-claudia
Author

This doesn’t support merging, right? So it would be a breaking change.

@jcalz You'd need to hoist the "get or initialize with empty object" logic (to a var Foo = Foo || {} in global script contexts). If the target is a module or the enum is defined inside a function, that step could be reduced to var Foo = {}: the hoisted binding is still undefined when the Foo in Foo || {} is read, because that read happens before the initializing assignment completes.

I've updated the initial comment to clear this up a bit. I tried accounting for declaration merging (and did address it within single files), but I missed the case of global enums and (more importantly) namespaces defined across multiple separately-compiled scripts, so thanks for the callout.

And yes, I'm acutely aware of why it is the way it is right now. Just trying to find ways to make it easier for optimizers (both in engines and in CLI tools) to make sense of it without compromising functionality.
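
The hoisting argument above can be sketched inside function scopes, which behave like module scope for this purpose. The function names are made up for illustration, and `Foo` is typed as `any` only so the self-referencing initializer type-checks.

```typescript
// First declaration in a scope: the hoisted binding is still undefined
// when `Foo || {}` is evaluated, so the result is always a fresh object.
function firstDeclaration(): boolean {
    var seenBeforeInit: unknown = Foo; // reads the hoisted, uninitialized binding
    var Foo: any = Foo || {};
    return seenBeforeInit === undefined && typeof Foo === "object";
}

// Repeated declarations in the same scope: the second `Foo || {}` sees the
// object created by the first, so members from both blocks end up merged.
function mergedDeclarations(): string[] {
    var Foo: any = Foo || {};
    Foo.bar = 1;
    var Foo: any = Foo || {};
    Foo.One = 0;
    return Object.keys(Foo);
}

const firstSeesUndefined = firstDeclaration();
const mergedKeys = mergedDeclarations();
```

In other words, `Foo || {}` only matters when an earlier script or an earlier block may already have created the object; for the first declaration in a module or function it is equivalent to `{}`.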

@dead-claudia
Author

@fatcerberus Unfortunately, const enums aren't always possible to use. Notably, anyone using TS-transpiling tools like @babel/preset-typescript, esbuild, or https://swc.rs/ can't - they instead have to use isolatedModules: true just to ensure runtime safety, and this means cross-file const enum references just aren't possible.

And this choice isn't always made by the individual developer directly - it's sometimes made by a framework (like Vite) on their behalf.

@Josh-Cena
Contributor

Josh-Cena commented May 14, 2023

Both enums and namespaces create their own scope which is broken in the proposed emit. For example, in your version the bar function can be called anywhere in the module. You would also not be able to deal with this:

enum Foo {
  ONE,
  TWO = ONE,
}

...Unless you emit every enum name as a var, but that's just too many scope pollutions.
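
A small sketch of the scoping point, assuming two blocks of one merged namespace (the names `Config` and `helper` are hypothetical): each block currently gets its own function scope, so both can declare the same local name.

```typescript
namespace Config {
    const helper: number = 1; // local to this block only
    export const first: number = helper;
}

namespace Config {
    const helper: number = 2; // a different `helper`; no clash with the block above
    export const second: number = helper;
}

const firstValue = Config.first;
const secondValue = Config.second;
```

A flat emit would place both `helper` declarations in the same scope, so it would need to rename one of them or keep some wrapper around each block.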

@dead-claudia
Author

dead-claudia commented May 15, 2023

@Josh-Cena Have you actually looked at the current emit of that? If you look closely, the existing emit doesn't even reference Foo.ONE anywhere, nor does it declare any variable ONE. Also, it's still declared with var.
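
For reference, here is what that enum looks like at runtime. This sketch only inspects the resulting object; it does not reproduce the emitted text.

```typescript
enum Foo {
    ONE,
    TWO = ONE,
}

// Both members carry the same numeric value.
const one: number = Foo.ONE;
const two: number = Foo.TWO;

// The reverse mapping for 0 is written twice, so the last member wins.
const reverse: string = Foo[0];

// Only one numeric key exists, followed by the two member names.
const keys: string[] = Object.keys(Foo);
```

Since the value of `TWO` is known at compile time, a flat emit can assign the constant directly and has no need for a separate `ONE` variable.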

RyanCavanaugh added the Suggestion and Awaiting More Feedback labels on May 15, 2023
@remcohaszing

If the namespace assignment is moved outside of the IIFE invocation, it becomes just a regular IIFE, which is already simpler for minifiers to optimize.

For example, the following code:

namespace Top {
  export function fn1(): void {}
}

namespace Top {
  export namespace Nested {
    export function fn2(): void {}
  }
}

namespace Top.Nested {
  export function fn3(): void {}
}

enum Enum {
  a,
  b
}

Compiles to:

"use strict";
var Top;
(function (Top) {
    function fn1() { }
    Top.fn1 = fn1;
})(Top || (Top = {}));
(function (Top) {
    let Nested;
    (function (Nested) {
        function fn2() { }
        Nested.fn2 = fn2;
    })(Nested = Top.Nested || (Top.Nested = {}));
})(Top || (Top = {}));
(function (Top) {
    var Nested;
    (function (Nested) {
        function fn3() { }
        Nested.fn3 = fn3;
    })(Nested = Top.Nested || (Top.Nested = {}));
})(Top || (Top = {}));
var Enum;
(function (Enum) {
    Enum[Enum["a"] = 0] = "a";
    Enum[Enum["b"] = 1] = "b";
})(Enum || (Enum = {}));

which after minification and some formatting becomes:

let n
let t
;(function (n) {
  n.fn1 = function () {}
})(n || (n = {}))
;(function (n) {
  let t
  !(function (n) {
    n.fn2 = function () {}
  })((t = n.Nested || (n.Nested = {})))
})(n || (n = {}))
;(function (n) {
  ;(n.Nested || (n.Nested = {})).fn3 = function () {}
})(n || (n = {}))
;(function (n) {
  n[(n.a = 0)] = 'a'
  n[(n.b = 1)] = 'b'
})(t || (t = {}))

Instead, it could be compiled to this:

"use strict";
var Top = Top || {};
Top.Nested ||= {};
(function () {
    function fn1() { }
    Top.fn1 = fn1;
})();
(function () {
    var Nested = Top.Nested;
    (function () {
        function fn2() { }
        Nested.fn2 = fn2;
    })();
})();
(function () {
    var Nested = Top.Nested;
    (function () {
        function fn3() { }
        Nested.fn3 = fn3;
    })();
})();
var Enum = Enum || {};
(function () {
  Enum[Enum["a"] = 0] = "a";
  Enum[Enum["b"] = 1] = "b";
})();

which after minification and some manual formatting becomes:

var Top = Top || {}
Top.Nested ||= {}
Top.fn1 = function () {}
Top.Nested.fn2 = function () {}
Top.Nested.fn3 = function () {}
var Enum = Enum || {}
Enum[0] = 'a'
Enum.a = 0
Enum[1] = 'b'
Enum.b = 1

I understand that TypeScript has to handle module and enum augmentations. But TypeScript also has the concept of module mode vs script mode. In module mode, it could do further optimizations, such as turning the enum into an object literal.
