Skip to content
This repository has been archived by the owner on Jan 25, 2022. It is now read-only.

Could this be done with "private symbols" instead? #115

Closed
dead-claudia opened this issue Jul 29, 2018 · 40 comments
Closed

Could this be done with "private symbols" instead? #115

dead-claudia opened this issue Jul 29, 2018 · 40 comments

Comments

@dead-claudia
Copy link

dead-claudia commented Jul 29, 2018

Edit: This proposal lives here now. I'm leaving the original proposal below for archival purposes, but you should refer to here for the latest proposal text. Also, if you find a typo, have questions, or anything similar that's too specific to be here, feel free to file an issue there.

Original proposal *Yes, I know I'm proposing a variant of [something that's been proposed before](https://esdiscuss.org/topic/proposal-about-private-symbol).*

The basic idea here is that instead of explicitly creating weak maps, using symbols you can't enumerate. It's also mostly polyfillable (fully if leaks are allowed), and it requires few changes to the spec and runtimes to support. Something like this might work:

  • Symbol.private(desc?) - Create a local private symbol with the optional given description.
  • Symbol.isPrivate(sym) - Check if a symbol is private.
  • Private symbol reads and writes ignore proxies, delegating to their target instead.
  • Object.getOwnPropertySymbols() and friends ignore private symbols. There is no way to enumerate them without storing them in your own data structure.
    • The [[OwnPropertyKeys]] invariant would be modified to mandate that the returned list must not include any private symbols, and ownKeys would be modified accordingly.
    • Private symbols' descriptors' [[Enumerable]] field is just ignored.
  • Private symbols are otherwise like any other symbol. Whether a symbol is private is more or less like a flag, and you can still query for their existence normally.
  • As a follow-up, I propose that most uses of internal slots in the spec should be modified to use well-known private symbols instead, so they would no longer require membranes.

There are several perks to this:

  1. It is mostly polyfillable, even up to and including proxy support.
  2. This avoids the need for membranes in most cases.
  3. Support for things like decorators, private methods, static private fields, private object literal fields, etc. naturally fall out of the grid without requiring any special support.
  4. Support for private expando properties are supported, useful for things like stateful mixins.
  5. Privacy control is as simple as "do you export it". If you want to restrict a symbol to specific types, you have to do it yourself.
  6. It largely reuses the same pipeline engines already use to optimize symbol property accesses, so it'll be fast from the gate.

And of course, there are cons:

  1. If you attempt to read or write a private symbol on a field that shouldn't have it, nobody is stopping you. (This is a direct result of supporting private expando properties.)
  2. You don't easily get "private", "protected", or "friend". You only get "public", "hard private", or "soft private".
  3. The syntax still uses that blasted dynamic lookup syntax. For private method calls, it looks much weirder than the current proposal.
    • This is part of why I created the follow-on proposal below. It still uses this behind the scenes, but it's to this as async/await is to promises - it takes most of the grief out of the common case, while still letting you dive deep when you need to.
Examples

Here's the counter example from the private methods proposal, adapted to use private symbols.

const _x = Symbol.private("x")
const _xValue = Symbol.private("xValue")
const _render = Symbol.private("render")

class Counter extends HTMLElement {
    [_xValue] = 0

    onclick = () => {
        this[_x]++
    }

    get [_x]() { return this[_xValue] }
    set [_x](value) {
        this[_xValue] = value
        window.requestAnimationFrame(() => this[_render]())
    }

    connectedCallback() {
        this[_render]()
    }

    [_render]() {
        this.textContent = this[_x].toString()
    }
}
window.customElements.define("num-counter", Counter)

If you want to emulate the existing class field proposal, you can create a wrapper to check the object appropriately:

// Wrapper function
function newPrivateSet(name, ...fields) {
    const symbols = fields.map(key => Symbol.private(key))
    const symbolTable = Object.create(null)

    for (let i = 0; i < fields.length; i++) {
        symbolTable[fields[i]] = symbols[i]
    }

    Object.seal(symbolTable)

    class Handler {
        constructor(target) {
            this.target = target
        }

        maybeResolve(key) {
            const sym = symbolTable[key]

            if (sym == null) {
                throw new TypeError(`\`${key}\` does not exist in ${name}!`)
            }

            return sym in this.target ? sym : undefined
        }

        resolve(key) {
            const sym = this.maybeResolve(key)

            if (sym == null) {
                throw new TypeError(`\`this\` is not a ${name}!`)
            }

            return sym
        }

        has(_, key) {
            return this.maybeResolve(key) != null
        }

        get(_, key, receiver) {
            return Reflect.get(this.target, this.resolve(key), receiver)
        }

        set(_, key, value, receiver) {
            return Reflect.set(
                this.target, this.resolve(key), value, receiver
            )
        }

        getOwnPropertyDescriptor(_, key) {
            const sym = this.maybeResolve(key)
            let desc

            if (sym != null) {
                desc = Reflect.getOwnPropertyDescriptor(this.target, sym)
                if (desc != null) desc.configurable = false
            }

            return desc
        }

        defineProperty(_, key, desc) {
            const sym = this.maybeResolve(key)

            return sym != null && !desc.configurable && desc.writable &&
                Reflect.defineProperty(this.target, sym, desc)
        }
    }

    return {
        data: inst => new Proxy(symbolTable, new Handler(inst)),
        makeParent: Super => class Parent extends Super {
            constructor(...args) {
                super(...args)
                for (const sym of symbols) this[sym] = undefined
            }
        },
    }
}

// Example
const {data, makeParent} = newPrivateSet("Point", "x", "y")

class Point extends makeParent() {
    constructor(x, y) {
        data(this).x = x
        data(this).y = y
    }

    get x() { return data(this).x }
    get y() { return data(this).y }

    toString() {
        return `Point(${data(this).x}, ${data(this).y})`
    }
}

// What the class fields proposal would have you do currently:
class Point {
    #x, #y

    constructor(x, y) {
        this.#x = x
        this.#y = y
    }

    get x() { return this.#x }
    get y() { return this.#y }

    toString() {
        return `Point(${this.#x}, ${this.#y})`
    }
}

// What the abstract refs proposal would have you do currently:
const data = new WeakMap()

class Point {
    constructor(x, y) {
        this::data = {x, y}
    }

    get x() { return this::data.x }
    get y() { return this::data.y }

    toString() {
        return `Point(${this::data.x}, ${this::data.y})`
    }
}
A follow-on proposal for syntax sugar

A follow-on proposal could add syntax sugar for private symbols, detailed in the code snippet below. Consider it to this as async/await was to promises. There are perks to making it pure syntax sugar:

  1. Decorator semantics are incredibly obvious. There's nothing to design for - literally nothing.
  2. It's more optimizable, and engines could choose to implement it in a way that doesn't actually involve private symbols. This is similar to how async/await doesn't require an engine to even use promises in the middle - it's all just callbacks and microtasks internally.
  3. In the more arcane scenarios, like a method being passed by reference or with inherited fields through proxies, there's nothing extra to spec. You just need to test that engines do the right thing in each of those cases (and that they don't optimize away more than they should).
// Normal
const _x = Symbol.private("x")
const _xValue = Symbol.private("xValue")
const _render = Symbol.private("render")

class Counter extends HTMLElement {
    [_xValue] = 0

    onclick = () => {
        this[_x]++
    }

    get [_x]() { return this[_xValue] }
    set [_x](value) {
        this[_xValue] = value
        window.requestAnimationFrame(() => this[_render]())
    }

    connectedCallback() {
        this[_render]()
    }

    [_render]() {
        this.textContent = this[_x].toString()
    }
}
window.customElements.define("num-counter", Counter)

const _x = Symbol.private("x")
const _y = Symbol.private("y")

class Point {
    constructor(x, y) {
        this[_x] = x
        this[_y] = y
    }

    get x() { return this[_x] }
    get y() { return this[_y] }

    toString() {
        return `Point(${this[_x]}, ${this[_y]})`
    }
}

// Sugar
class Counter extends HTMLElement {
    #xValue = 0

    onclick = () => {
        this.#x++
    }

    get #x() { return this.#xValue }
    set #x(value) {
        this.#xValue = value
        window.requestAnimationFrame(() => this.#render())
    }

    connectedCallback() {
        this.#render()
    }

    #render() {
        this.textContent = this.#x.toString()
    }
}
window.customElements.define("num-counter", Counter)

class Point {
    #x, #y

    constructor(x, y) {
        this.#x = x
        this.#y = y
    }

    get x() { return this.#x }
    get y() { return this.#y }

    toString() {
        return `Point(${this.#x}, ${this.#y})`
    }
}

// Transpiled sugar, unoptimized
// Note: `_vars` aren't actually exposed here.
const _Object$defineProperty$ = Object.defineProperty

function _lazySet$(inst, key, value) {
    _Object$defineProperty$(inst, key, {
        configurable: true, enumerable: true, writable: true, value,
    })
}

function _checkFactory$(type, name, tag) {
    return inst => {
        if (tag in inst) return inst
        throw new TypeError(`\`${type}\` is not an instance of \`${name}\`!`)
    }
}

const _brand$Counter$ = Symbol.private("Counter tag")
const _staticBrand$Counter$ = Symbol.private("Counter static tag")
const _sym$Counter$x$ = Symbol.private("Counter.#x")
const _sym$Counter$xValue$ = Symbol.private("Counter.#xValue")
const _sym$Counter$render$ = Symbol.private("Counter.#render")
const _check$Counter$this$ = _checkFactory$("Counter", "this", _brand$Counter$)

class Counter extends HTMLElement {
    static get [_staticBrand$Counter$]() {}

    constructor(...args) {
        super(...args)
        this[_brand$Counter$] = undefined
        this[_sym$Counter$xValue$] = 0
    }

    onclick = () => {
        _check$Counter$this$(this)[_sym$Counter$x$]++
    }

    // Yes, `this` doesn't have to be an instance of the class in getters,
    // setters, and methods. The brand checks just need to exist when you use
    // other private slots.
    get [_sym$Counter$x$]() {
        return _check$Counter$this$(this)[_sym$Counter$xValue$]
    }
    set [_sym$Counter$x$](value) {
        _check$Counter$this$(this)[_sym$Counter$xValue$] = value
        window.requestAnimationFrame(() =>
            _check$Counter$this$(this)[_sym$Counter$render$]()
        )
    }

    connectedCallback() {
        _check$Counter$this$(this)[_sym$Counter$render$]()
    }

    [_sym$Counter$render$]() {
        this.textContent = _check$Counter$this$(this)[_sym$Counter$x$].toString()
    }
}
window.customElements.define("num-counter", Counter)

const _brand$Point$ = Symbol.private("Point tag")
const _staticBrand$Point$ = Symbol.private("Point static tag")
const _sym$Point$x$ = Symbol.private("Point.#x")
const _sym$Point$y$ = Symbol.private("Point.#y")
const _check$Point$this$ = _checkFactory$("Point", "this", _brand$Point$)

class Point {
    static get [_staticBrand$Point$]() {}

    constructor(x, y) {
        this[_brand$Point$] = undefined
        _check$Point$this$(this)[_sym$Point$x$] = x
        _check$Point$this$(this)[_sym$Point$y$] = y
    }

    get x() { return _check$Point$this$(this)[_sym$Point$x$] }
    get y() { return _check$Point$this$(this)[_sym$Point$y$] }

    toString() {
        return `Point(${
            _check$Point$this$(this)[_sym$Point$x$]
        }, ${
            _check$Point$this$(this)[_sym$Point$y$]
        })`
    }
}

// Transpiled sugar, optimized
// Note: `_vars` aren't actually exposed here.
const _Object$defineProperty$ = Object.defineProperty

function _lazySet$(inst, key, value) {
    _Object$defineProperty$(inst, key, {
        configurable: true, enumerable: true, writable: true, value,
    })
}

function _checkFactory$(type, name, tag) {
    return inst => {
        if (tag in inst) return inst
        throw new TypeError(`\`${type}\` is not an instance of \`${name}\`!`)
    }
}

const _brand$Counter$ = Symbol.private("Counter tag")
const _sym$Counter$xValue$ = Symbol.private("Counter.#x")
const _check$Counter$this$ = _checkFactory$("Counter", "this", _brand$Counter$)

function _genGet$Counter$x(_this) {
    return _this[_sym$Counter$xValue$]
}

function _genSet$Counter$x(_this, value) {
    _this[_sym$Counter$xValue$] = value
    window.requestAnimationFrame(() => _genMethod$Counter$render(_this))
}

function _genMethod$Counter$render(_this) {
    _this.textContent = _genGet$Counter$x(_this).toString()
}

class Counter extends HTMLElement {
    constructor(...args) {
        super(...args)
        this[_brand$Counter$] = undefined
        this[_sym$Counter$xValue$] = 0
    }

    onclick = () => {
        _check$Counter$this$(this)
        _genSet$Counter$x(this, _genGet$Counter$x(this) + 1)
    }

    connectedCallback() {
        _genMethod$Counter$render(_check$Counter$this$(this))
    }
}
window.customElements.define("num-counter", Counter)

const _brand$Point$ = Symbol.private("Point tag")
const _sym$Point$x$ = Symbol.private("Point.#x")
const _sym$Point$y$ = Symbol.private("Point.#y")
const _check$Point$this$ = _checkFactory$("Point", "this", _brand$Point$)

class Point {
    constructor(x, y) {
        this[_brand$Point$] = undefined
        this[_sym$Point$x$] = x
        this[_sym$Point$y$] = y
    }

    get x() { return _check$Point$this$(this)[_sym$Point$x$] }
    get y() { return _check$Point$this$(this)[_sym$Point$y$] }

    toString() {
        _check$Point$this$(this)
        return `Point(${this[_sym$Point$x$]}, ${this[_sym$Point$y$]})`
    }
}

Note that with this sugar, you can't combine both this.#xValue and this[xValue] - you can only use one or the other, since the names are generated uniquely per-name, per-scope. If you need to expose them to subclasses or friend classes, you should use normal private symbols and export them instead. Additionally, you can't access the private symbols' descriptors using this sugar - you also need to use normal private symbols if you wish to do that.

There are a few optimizations you can make to the transpiler output, as I demonstrated above:

  • Instead of checking multiple times in a single code path, you can wait until the first observable access in each one and compress them all.
  • If a getter, setter, or method is private and never accessed directly (only possible with methods), you can factor them out as functions rather than reifying them as actual properties. Also, within these getters, setters, and methods, you don't need to check at all before accessing private properties, since you couldn't get there in the first place without such a check.
  • You can omit tags and fields that aren't used. They're not observable, so there's no need to keep them unless you need to check them.
And, a partial polyfill.

There are two caveats with this polyfill:

  1. This technically leaks. I can't avoid it without transpiling every dynamic non-literal property access private symbols could touch, because I can't use symbols as weak map keys.
  2. This will throw a TypeError any time ownKeys is called on a frozen object with a private symbol. Avoiding this is extremely non-trivial because the invariant for that method needs modified to exclude private keys from the list it checks against, but because the spec doesn't call ownKeys implicitly from syntax exposing symbols returned from it, I could just overwrite all the methods that delegate to it. Of course, this would double or even triple the size of the polyfill below, so I ignored it.
// Note: this technically leaks because I can't use symbols as weak map keys
(function (global) {
    "use strict"

    if (typeof Symbol.private === "function") return

    const getOwnPropertySymbols = Object.getOwnPropertySymbols
    const ownKeys = Reflect.ownKeys
    const OldProxy = Proxy
    const OldSymbol = Symbol
    const data = new Set()
    const markPrivate = data.add.bind(data)
    const isPrivate = data.has.bind(data)
    const call = Function.call.bind(Function.call)
    const objectCreate = Object.create
    const arrayFrom = Array.from
    const arrayConcat = Array.prototype.concat.bind([])
    const hooks = objectCreate(null)
    const lengthRef = objectCreate(null)

    function removePrivateKeys(array) {
        let count = 0
        for (let i = 0; i < array.length; i++) {
            if (!isPrivate(array[i])) array[count++] = array[i]
        }
        array.length = count
        return array
    }

    /** @this {keys} */
    function filterKey(_, i) {
        const key = this[i]

        if (isPrivate(key)) {
            throw new TypeError(
                "`ownKeys` on proxy: trap returned a private symbol as a key"
            )
        }

        return key
    }

    function filterProxy(handler) {
        function forward(method) {
            return function () {
                const body = handler[method]
                if (body != null) return call(body, handler, ...arguments)
                return hooks[method](...arguments)
            }
        }

        function filter(method) {
            return function (target, key) {
                if (!isPrivate(key)) {
                    const body = handler[method]
                    if (body != null) return call(body, handler, ...arguments)
                }
                return hooks[method](...arguments)
            }
        }

        return {
            apply: forward("apply"),
            construct: forward("construct"),
            isExtensible: forward("isExtensible"),
            preventExtensions: forward("preventExtensions"),
            getPrototypeOf: forward("getPrototypeOf"),
            setPrototypeOf: forward("setPrototypeOf"),
            defineProperty: filter("defineProperty"),
            deleteProperty: filter("deleteProperty"),
            get: filter("get"),
            getOwnPropertyDescriptor: filter("getOwnPropertyDescriptor"),
            has: filter("has"),
            set: filter("set"),

            ownKeys(target) {
                let body = handler.ownKeys

                if (body == null) return ownKeys(target)
                const keys = call(body, handler, target)

                lengthRef.length = keys.length
                return arrayFrom(lengthRef, filterKey, keys)
            },
        }
    }

    function defineMethods(root, methods) {
        for (const key of Object.keys(methods)) {
            Object.defineProperty(root, key, {
                configurable: true, enumerable: false, writable: true,
                value: methods[key],
            })
        }
    }

    defineMethods(Symbol, {
        private(name) {
            const sym = OldSymbol(name)
            markPrivate(sym)
            return sym
        },

        isPrivate(sym) {
            if (typeof sym === "symbol") return isPrivate(symbol)
            throw new TypeError("`sym` is not a symbol!")
        },
    })

    defineMethods(Object, {
        getOwnPropertySymbols(object) {
            return removePrivateKeys(getOwnPropertySymbols(object))
        },
    })

    defineMethods(Reflect, {
        ownKeys(target) {
            return removePrivateKeys(ownKeys(target))
        },
    })

    for (const key of Object.getOwnPropertyNames(Reflect)) {
        hooks[key] = Reflect[key]
    }

    global.Proxy = class Proxy {
        constructor(target, handler) {
            return new OldProxy(target, filterProxy(handler))
        }

        static revocable(target, handler) {
            return OldProxy.revocable(target, filterProxy(handler))
        }
    }
})(this)

I know this seems a lot like what was proposed before, but I do feel it's sufficiently different, and it does address most of the various pitfalls in the other past "private symbol" proposals:

  • Semantically, it operates more like weak maps than it does a standard property. In fact, converting from weak maps to private symbols is 100% equivalent assuming 1. the WeakMap methods are unmodified, and 2. proxies aren't involved (which change the this value).
  • It does forward through proxies, but it critically does not allow proxy hooks to observe their existence. This avoids the issue of a "set" on one side not necessarily reflecting a "get" on the other side of a membrane.
@bakkot
Copy link
Contributor

bakkot commented Jul 29, 2018

I might have more to say later, but a quick note while I'm thinking of it:

Most uses of internal slots in the spec would be modified to use well-known private symbols instead

It does forward through proxies and the prototype chain

These two statements aren't compatible: Map.prototype.has.call({__proto__: new Map}, 'a') throws.

@dead-claudia
Copy link
Author

@bakkot

Most uses of internal slots in the spec would be modified to use well-known private symbols instead

Read this statement a little more closely. 😉 (Emphasis mine just now)

I'm proposing to change that behavior as well, just as a follow-on. I'll edit to clarify.

@bathos
Copy link

bathos commented Jul 30, 2018

As someone who has tried to convert existing codebases that are using WeakMap privacy to use the stage 3 private fields proposal instead, I’d like to share that

  • ultimately, I wasn’t able to — the existing proposal cannot express everything that WeakMap can
  • this proposal would address every problem I encountered

I am confused about prototype involvement here for reasons related to what bakkot said. However forwarding through proxies (across the board, even for intrinsic and host APIs) would be a godsend.

One concern about the proposed API: global.Symbol.private() has the same forgeability issues as global.WeakMap, WeakMap.prototype, etc. I would prefer to see a syntactic solution for private symbol creation, so that one can be certain the key is really a private symbol. A natural choice, imo, would be a pseudo-call like const myKey = private();

@dead-claudia
Copy link
Author

@bathos

  1. It's otherwise a normal symbol property, so the same prototype involvement applies. I removed the clause, so it's a little clearer.
  2. I know it's forgeable, but in my experience, forging just isn't a major concern. The potential for forgery only exists when creating the fields (before/during class initialization, for example), not when creating object instances or executing methods. Plus, the syntactic sugar I suggested be built on top of this wouldn't be forgeable - it'd be like async/await in that regard (you can't modify the promise being returned).

@bathos
Copy link

bathos commented Jul 30, 2018

@isiahmeadows that seems reasonable to me. I understand where you’re coming from with inherited properties now. I had imagined this as bypassing ordinary lookup.

A minor thing to note is that existing slots work cross-realm. Redefining intrinsic/host slots in terms of private symbols does not prevent this from remaining true, but it is a case where user-created private symbols would differ in capability (as Symbol.for defeats the point of privacy ... and just about everything else symbols offer).

@dead-claudia
Copy link
Author

@bathos I considered a private equivalent for Symbol.for and initially had it prior to filing, but I wasn't 100% sure how useful it'd be outside polyfills, especially since they're supposed to be "private". But yes, the intent for the builtins was to use cross-realm private symbols.

@littledan
Copy link
Member

@bathos Can you say more about the issues you ran into with the Stage 3 proposal?

@bathos
Copy link

bathos commented Jul 30, 2018

@littledan, I’ll see if I can provide a better summary later if it helps, but for now here’s some related posts on esdiscuss where I described the issues I ran into when attempting conversion:

https://esdiscuss.org/topic/proposal-object-members#content-30
https://esdiscuss.org/topic/proposal-object-members#content-33
https://esdiscuss.org/topic/proposal-object-members#content-35

the tl;dr is that today’s scope-based hard privacy has served me well when dealing with APIs composed of interconnected objects because I had full control over visibility — the issues with WeakMap chiefly concerned ergonomics rather than functionality. When attempting conversion, some things that were easy with WeakMaps became strange and verbose, and some things weren’t possible at all. Eventually I gave up and returned to WeakMaps.

@tvcutsem
Copy link

tvcutsem commented Aug 4, 2018

@isiahmeadows I like the general direction of this proposal a lot, but the tunnelling-through-proxies behavior has me worried that private symbols can't be used for reliable brands-checks, diminishing the value of "hard private" fields.

Consider:

// Alice writes:
let brand = Symbol.private()

export function createBrandedObject(v) {
  return Object.freeze({ [brand]: true, x: v, m() { return this.x * 2 } });
}

export function useBrandedObject(obj) {
  if (obj[brand]) { // brands-check
    // assuming obj is a branded object, Alice would never expect this to fail
    assert(obj.x * 2 === obj.m())
  }
  throw new TypeError("illegal argument")
}

// Eve writes:
let target = alice.createBrandedObject(24)
let proxy = new Proxy(target, {
  get(t, name, r) {
    return (name === "m") ? ( () => 0 ): Reflect.get(t, name, r);
  }
})
alice.useBrandedObject(proxy)
// Eve fools Alice into thinking proxy is a branded object
// the assertion fails (42 === 0), violating Alice's assumptions/invariants
// in the general case, an attacker can make use of Alice's confusion

attn @erights

@bathos
Copy link

bathos commented Aug 4, 2018

@tvcutsem that’s a good point. As much as I wish Proxies proxied brandedness — it’d sure make them more useful to everybody but Alice — those assurances would be lost. This is also an issue for using standard prototype chain property lookup as proposed above.

How would people feel about a more conservative variation where:

  • OrdinaryGet does not recurse if the key is a private symbol;
  • proxy behavior is not adjusted other than what is necessary to prevent observation of private keys in keeping with changes to Reflect.ownKeys, Object.getOwnPropertySymbols, etc; thus, Proxy [[Get]] etc would bail for OrdinaryGet if the key is a private symbol

Speaking for myself, even with these adjustments, this behavior would still address all problems I ran into with private fields. It would match the existing semantics of slots rather than altering them, I think.

@bathos
Copy link

bathos commented Aug 5, 2018

For the sake of illustrating @tvcutsem’s point less abstractly, I’d like to show an example of how this would matter for an existing intrinsic operation, RegExp.prototype.exec:

21.2.5.2 RegExp.prototype.exec(string)

  1. Let R be the this value.
  2. If Type(R) is not Object, throw a TypeError exception.
  3. If R does not have a [[RegExpMatcher]] internal slot, throw a TypeError exception.
  4. Let S be ? ToString(string).
  5. Return ? RegExpBuiltinExec(R, S).

21.2.5.2.2 Runtime Semantics: RegExpBuiltinExec(R, S)

1. Assert: R is an initialized RegExp instance.
[...]
9. Let matcher be R.[[RegExpMatcher]].

If slots were realized as private-symbol-keyed properties that could be inherited and proxied, the invariants here could indeed be violated:

const p = Object.create(/foo/);

p.exec({
  toString() {
    Reflect.setPrototypeOf(p, null);
    return '';
  }
})

Anywhere a slot-existence check as in step 3 of exec is potentially followed by calls to ECMAScript code would present this issue if the slots were inherited (due to mutable [[Prototype]]) or proxied (due to revocable proxies / getPrototypeOf trap).

@dead-claudia
Copy link
Author

@bathos For a concrete example, that would be restructured as follows:

  1. R would become a handle for the regexp's private state object instead, with appropriate prose revised to use that terminology instead.
  2. The [[RegExpMatcher]] object would be moved to that private state object.
  3. Regular expressions would gain a @@internalRegExpState private symbol property to hold that object, where that symbol would not be exposed to ECMAScript code.

That would fix it while still permitting the assertion to hold.

@dead-claudia
Copy link
Author

I left that part implicit within the proposal when I specified proposing using those instead of internal slots. In general, mutable state would have to be referenced like that, but that's more of a practical concern than something purely theoretical.

@bathos
Copy link

bathos commented Aug 5, 2018

That makes sense I think — though it likely means making a lot of changes throughout both the ES spec and other specs that define or reference slots, like HTML. I don’t know if that is seen as problematic or not, but it’s a big touch.

@dead-claudia
Copy link
Author

@bathos It would result in a rather big diff due to the existing pervasive use of internal slots, but the change behaviorally would be minimal and it'd affect almost nobody. The main differences is that fewer TypeErrors would be thrown: the fields would be accessible through proxies and (optionally) prototypes, but that's it. Most of the changes itself are largely algorithmic.

The WebIDL spec would need updated as well as a few places within the HTML spec, but most other specs would be largely unaffected. I'm not TC39, but I don't believe the change is that problematic.

@bathos
Copy link

bathos commented Aug 5, 2018

One additional thought re: tvcutsem’s demonstration + the private state object solution:

A scenario which the private state object doesn’t address is user code that relies on slot-accessing brand checks as a way of testing whether a value is safe for subsequent method invocation. For the category of values for which such checks presently return negative that would now return positive, the invariant that this status will always remain true will be gone; and because this subset of values cannot be distinguished from the existing positive set, such checks would become unreliable in this regard categorically (at least as a technicality).

Contrived, but for example:

const { apply } = Reflect;
const { exec } = RegExp.prototype;
const { get } = Reflect.getOwnPropertyDescriptor(RegExp.prototype, 'global');

function getPatternExecutor(value) {
  try {
    if (apply(get, value, []) !== undefined) return s => apply(exec, value, [ s ]);
  } catch {}
  throw new TypeError(`must be a valid RegExp instance`);
}

Presently, if the s => apply function gets returned successfully, there is a perfect guarantee that its value is and will always be a valid receiver for exec. The private state object doesn’t help here since, unlike in internal methods, ECMAScript code couldn’t get a handle on that private value or make use of it.

Does this matter? It would seem to me that the answer is "probably not", but I’m wary of assuming a change like that is safe too quickly and wanted to provide another example that falls outside of intrinsic/host cases as food for thought.

@tvcutsem
Copy link

tvcutsem commented Aug 6, 2018

@bathos Your scenario above (user code using a brands-check to test whether a value is safe for subsequent method invocation) is a subset of the kinds of implicit assertions/invariants that I wanted to highlight in my (contrived) example.

Does it matter? I empathize with your "probably not" answer: developers generally won't want to break existing code and most uses of proxies will be 'well-behaved'. But when reasoning from a security/defensive programming standpoint, we can't assume all developers will be cooperative. All that matters is that this is a potential exploit for an attacker to use, and what makes it dangerous is that private symbols are advertised to introduce "hard private" state, which proxies can't violate directly (they can't "steal" the private state) but can indirectly (by making an object that passes a brands-check behave in arbitrary ways). @erights can probably say more from his experience in building secure sandboxes why/how this matters in practice.

Here's a thought to fix this: rather than tunnelling private symbol access through proxies, or letting [[Get]] bail for private symbol access on proxies, as you suggested earlier, we could let proxy objects have their own private state. In other words, if sym is a private symbol and proxy is a Proxy object, then proxy[sym] could lookup or update the private symbol on the proxy object itself.

This in effect makes private symbols even more similar to WeakMaps, because a WeakMap also treats a proxy object as a separate key.

Assuming this semantics and revisiting my earlier example, in order for Eve to let the check if (obj[brand]) succeed, she would need to explicitly 'copy' the brand private symbol from the target to the proxy, which she can't, because Alice hasn't exposed the brand symbol. However, in other scenarios where the creator of the proxy object also has access to the private symbol, it would be perfectly legit for the proxy creator to explicitly copy the private symbols onto the proxy.

@dead-claudia
Copy link
Author

dead-claudia commented Aug 6, 2018 via email

@zenparsing
Copy link
Member

It has been helpful for me to separate branding from encapsulation. As @tvcutsem points out, "private symbols" would not be appropriate for branding, although they would be perfectly adequate for encapsulation.

In all of our attempts to model "private state" we've conflated these two concepts. What would the outcome be like if we separated them? WeakSet would provide the branding mechanism and private symbols would provide encapsulation:

// The "branded Point" class

const pointBrand = new WeakSet();
const $x = Symbol.private('x');
const $y = Symbol.private('y');

function brandcheck(obj) {
  if (!pointBrand.has(this)) throw new TypeError();
}

class Point {
  constructor(x, y) {
    this[$x] = x;
    this[$y] = y;
    pointBrand.add(this);
  }

  toString() {
    brandcheck(this);
    return `<${ this[$x] }:${ this[$y] }`;
  }

  add(other) {
    brandcheck(this);
    brandcheck(other);
    return new Point(this[$x] + other[$x], this[$y] + other[$y]);
  }
}

Requiring explicit brand checks at the start of method bodies clears up one of the issues I've always had with the current private fields proposal: it's very easy to accidentally run code before accessing a field and throwing a TypeError. We really want the brand check to happen at the beginning of execution.

@tvcutsem What are your thoughts?

@shannon
Copy link

shannon commented Aug 8, 2018

@zenparsing Thank you. I have been wondering why everyone keeps mentioning brand checking in the context of private fields and I haven't understood why they would be tied so closely together. I would have expected that it could be very easily handled with a decorator if that's something desired.

@littledan
Copy link
Member

cc @erights does this proposal meet the invariants that you have been maintaining for what property access means?

@ljharb
Copy link
Member

ljharb commented Aug 8, 2018

Part of the value of the current stage 3 proposal is that you get brand checks "for free". I don't see what value we get out of removing them from the proposal.

@bathos
Copy link

bathos commented Aug 8, 2018

@ljharb You get them for free only if and when the access happens. In practice, esp. when implementing host API stuff or things meant to follow its patterns closely, the check needs to be immediate and unconditional or else the details of the implementation (not the private state) may become observable.

(This was one of the reasons I returned to WeakMaps — even in cases where there were no functionality limitations preventing me from using #private instance properties, I ended up with the same amount of boilerplate, so it seemed best to just use one tool across the board.)

While private symbols don’t remove that boilerplate either*, I think @zenparsing’s distinction is pretty compelling and the example reads nicely.

* Metaprogramming solutions here are possible but get really hairy, assuming preservation of the class’s lineage is important and you can’t just decorate-by-way-of-subclassing. You have to replace the original constructor with a function constructor and copy the original prototype/static properties in order to "insert" the wm.add(instance, state) "in" the constructor presently if you want to abstract this stuff away.

@ljharb
Copy link
Member

ljharb commented Aug 8, 2018

@bathos sure, you'd need a this.#foo; line at the top of your methods, but that's a lot easier than the resulting weak collection alternative.

@bathos
Copy link

bathos commented Aug 8, 2018

At least in the host examples, to obtain consistent messaging, it’d be

try {
  this.#foo;
} catch {
  throw new TypeError('Illegal invocation');
}

I don’t think that’s better than the weak collection approach, which can be served by a helper function as in zenparsing’s example.

@shannon
Copy link

shannon commented Aug 8, 2018

@ljharb That wouldn't be very explicit and also not very free. You now have to maintain your private variable name in all of your methods. It would be much better code to have something that explicitly brand checks rather than arbitrary field names littered throughout your codebase.

@littledan
Copy link
Member

Let's step back a bit. My understanding of the history here is that @waldemarhorwat proposed that undeclared private field access throw as part of an effort to give private fields better, more reliable behavior than ordinary properties, given our opportunity in this proposal to make a more restricted mechanism, and as part of a general encouragement of static shape.

This decision follows our general invariant that private fields and methods are just like public ones, except that some things "don't work", either with runtime errors or early errors. These semantics could help avoid real bugs! And we've already heard positive feedback in this repository on the change.

The ability to do brand checks may be nice, but I think the semantic stands justified on its own without that usage pattern.

@bathos
Copy link

bathos commented Aug 8, 2018

@littledan I’m curious if you got a chance to check out the posts I linked to earlier. The class-declaration-scoping of the private property proposal was the primary thing that prevented me from switching over. Do you have any suggestions there? Is it the position of the current proposal that such use cases do not need to be served and WeakMap for private state should continue to be the solution there?

@littledan
Copy link
Member

littledan commented Aug 8, 2018

@bathos Apologies for my delay here. My hope is that decorators will provide a useful mechanism for friendship while maintaining strong encapsulation. See https://github.com/tc39/proposal-decorators/blob/master/friend.js for some examples of how this can be done. Would this sort of mechanism work for you?

@bathos
Copy link

bathos commented Aug 9, 2018

@littledan it seems like it might. I’m gonna have to dig in more to understand this though (like, what is the type of the "key" value that decorator observes in PrivateName?)

@littledan
Copy link
Member

I'll be interested to hear your thoughts! The type of a PrivateName key is 'object'.

@Igmat
Copy link

Igmat commented Sep 28, 2018

Part of the value of the current stage 3 proposal is that you get brand checks "for free". I don't see what value we get out of removing them from the proposal.

@ljharb but brand-check is another feature and I don't see ANY reason for combining it with encapsulation if it breaks metaprogramming scenarios, because creating of brand-check mechanism as helper function/decorator/3rd-party package/part of standard library is so easy, that trying to address it with ES syntax change is a HUGE overkill, especially if it requires ANY kind of trade-offs (and current proposal does).

As conclusion, you don't get brand-checks for free, it costs you losing some metaprogramming capabilities.

@rdking
Copy link

rdking commented Oct 2, 2018

@Igmat Truth is, any fully encapsulating private implementation will automatically have "brand-checking" as both a natural side effect and an implementation detail.

@Igmat
Copy link

Igmat commented Oct 2, 2018

@rdking, only if brand-checking is part of your definition of private.
If it's not, we could have fully encapsulated private without brand-checking using for example Symbol.private proposal when privates silently fall through proxies, so couldn't be exposed by outside code.

@rdking
Copy link

rdking commented Oct 3, 2018

@Igmat Brand-checking is just checking for knowledge of something that can't otherwise be known. The symbolic name of a private field is such a brand. Simply accessing a private field is therefore a brand check. It doesn't matter if it's the current proposal, mine, or Symbol.private. Just accessing such a private field constitutes a brand check.

@dead-claudia
Copy link
Author

dead-claudia commented Oct 3, 2018

@rdking @Igmat

There's two things that are normally considered "brand checking":

  1. Testing whether an object has a particular brand.
  2. Asserting whether an object has a particular brand.

The second necessarily requires the first, but the first doesn't obligate the second.

As far as I understand (I'm not part of TC39 itself, or I'd have that "Member" badge in the upper right of every comment I make), people within the committee don't all agree on it being okay to not assert the lack of a brand.

As for dynamically defined private symbols vs statically defined private fields, here's the main difference:

  • Private symbols give you brand checking through property existence, but if you want brand assertions, you have to do it explicitly.
  • Private fields give you brand assertions through explicitly-tagged objects, but if you want brand checks, you have to roll your own try/catch around something that triggers a brand assertion.

Each one can give rise to the other, but there are pros and cons to each:

  • Dynamically defined private symbols are much easier to manipulate at runtime and are much more flexible, but they are slightly more cumbersome to define and much harder to optimize (it has most of the same issues as existing "public" symbols).
    • An implementation might choose to in certain circumstances optimize private symbols to map directly to offsets, resolved during class initialization. If they're retained in only one scope and are used by only one class or object literal type, this is 100% safe.
  • Statically defined private fields are much easier to optimize at runtime (absent inheritance, you can set them to static offsets) and are a bit easier to define, but they are much more rigid and cumbersome to work with in dynamic contexts.
    • An implementation might choose to define private field maps as a public "property" of sorts, so 1. it remains safe through type map change, and 2. subclasses can still add their own without interfering with the parent's private fields. This is not unlike using private symbols, but it's just an internal optimization + workaround.

Keep in mind, the two abstractions are equivalent mathematically, just they have very different strengths in practice. And on top of that, the implementation of each would be closer than what you might expect, thanks to the various edge cases.

Details on how implementations would overlap - hidden for brevity
  • An optimized implementation of private symbols might choose to instead map them directly to offsets, resolved during class/type initialization and tracked on the type map by a weak list of private symbols. These offsets would be used for compressing object storage for free.

    • Adding private symbol properties would just append the private symbol to the list.
    • Removing private symbol properties would remove the private symbol from the list and remove the associated descriptor from the object.
    • Retrieving a property via private symbol would consist of 1. searching the weak list for the symbol and 2. retrieving the symbol's descriptor via index.
    • Private symbol collection would involve updating (unmarked) parent type maps to remove the symbol and (unmarked) parent objects to remove the value at the symbol's old index within the type map.
    • Private symbols added when no previous symbol could be deleted can be safely optimized to direct indexed access via the type map.
    • If all private symbols on an object are configurable/writable, you could cut the size in half.
    • Prototype private symbol-keyed methods don't need fully reified if the private symbol is only set on that one prototype and the symbol and method are never read for any reason other than calling (foo[method](...args) or Reflect.apply(foo[method], null, args)). The code object still needs to exist, but the function itself doesn't.
  • A baseline implementation of private fields might choose to define a per-object list of private field maps, so 1. it remains safe through type map change, and 2. subclasses can still add their own without interfering with the parent's private fields.

    • This would be similar to a private symbol pointing to an object with all the fields.
    • This would be the simplest way to implement this proposal while keeping it fast enough to compete with properties.

Even though they expose different APIs, they would likely work more or less the same way internally:

  • Private symbols and private fields would map directly to static offsets within objects.
  • Each of those properties' offsets are calculated during class/type initialization.
  • The associated data would be stored on each object using each of those offsets.
  • Private symbol-based methods and private field-based methods would work mostly the same. Just the first is more reliant on implementation than the second

@littledan
Copy link
Member

We discussed @zenparsing 's private symbols proposal at the September 2018 TC39 meeting, and concluded that we are not adopting that approach and sticking with the proposal in this repository.

@Igmat
Copy link

Igmat commented Oct 10, 2018

@littledan I've checked your presentation and it has some falsy statements about that everybody agreed with limitations of this proposal.

@rdking
Copy link

rdking commented Oct 10, 2018

@Igmat I think they've trying to make it clear that "everybody" is limited to the current proposal contributors and members of the TC39. It doesn't matter that their reasoning contains logical contradictions as long as that collection of people is willing to accept them.

@littledan
Copy link
Member

@lgmat Are you referring to "Everyone I've talked to sees # as reasonable when I explain it" (from this slide)? I guess the phrasing was a bit rough, but I meant my experiences meeting developers in conferences, etc when there's a chance to discuss a topic in person. I didn't mean that everyone in the world who understands this proposal likes the # syntax; this is clearly not true.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

10 participants