New well-known Symbol: Symbol.typeofTag

Hi,

I would like to propose the addition of a new well-known Symbol: Symbol.typeofTag.
This serves as a continuation of the discussion originally brought forward by Randy Buchholz here.

Motivation

Most recently with BigInt we've seen changes to the behavior of the typeof operator. Looking at the past 5 or so years of standardization efforts, we've seen a total of two new primitive types (symbol and bigint), and based on this history it is not unlikely that additional primitive Data Types and Values will be standardized and implemented in the future.

We've already seen tools like Babel transform typeof UnaryExpressions into calls to a special helper function, _typeof, that checks if the value is a polyfilled Symbol and makes sure to return "symbol" rather than "object" for it. This helper may be extended in the future to also handle bigint and other potential future primitive values. This is a user-land, syntax-transformation-powered way to work around a limitation: there's no way to extend the runtime semantics of the typeof operator. It can be avoided entirely with a Well-Known symbol such as Symbol.typeofTag, which polyfill writers may rely on for ensuring compatibility and test262 compliance.
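To make the workaround concrete, here is a minimal sketch of a helper in the spirit of _typeof. It is not Babel's actual source: PolyfilledSymbol and typeofHelper are hypothetical names, and the polyfill is reduced to a function that returns ordinary objects, which is all a user-land Symbol shim can produce.

```javascript
// Hypothetical polyfill-style Symbol: without native symbols, a shim can
// only hand out ordinary objects, so `typeof` reports "object" for them.
function PolyfilledSymbol(description) {
  const sym = Object.create(PolyfilledSymbol.prototype);
  sym.description = String(description);
  return sym;
}

// Simplified helper (an assumption, not Babel's real implementation):
// report "symbol" for values created by the given Symbol constructor,
// and defer to the native operator for everything else.
function typeofHelper(value, SymbolCtor) {
  const isPolyfilledSymbol =
    value !== null &&
    typeof value === "object" &&
    value.constructor === SymbolCtor &&
    value !== SymbolCtor.prototype;
  return isPolyfilledSymbol ? "symbol" : typeof value;
}

const sym = PolyfilledSymbol("demo");
console.log(typeof sym);                          // "object"
console.log(typeofHelper(sym, PolyfilledSymbol)); // "symbol"
console.log(typeofHelper(42, PolyfilledSymbol));  // "number"
```

Every typeof in the compiled output has to go through a call like this, which is the cost the proposal is trying to remove.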

Looking at the specification as it is today, the InstanceOfOperator (12.10.4) uses the Well-Known symbol Symbol.hasInstance to determine if a constructor object recognizes an object as one of the constructor's instances. This allows for tapping into the behavior of instanceof. Because of this, I would argue that it makes just as much sense to extend the runtime semantics of the typeof operator (12.5.5) to check for a handler for the Well-Known symbol Symbol.typeofTag and call it if it exists, and otherwise fall back to Table 37. This would be backwards-compatible and wouldn't "break the web".
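For comparison, this is how the existing Symbol.hasInstance hook lets user code tap into instanceof today. The EvenNumber object is a made-up example; only Symbol.hasInstance itself is part of the language.

```javascript
// An object that defines its own notion of "instance" via the
// well-known symbol consulted by the instanceof operator.
const EvenNumber = {
  [Symbol.hasInstance](value) {
    return typeof value === "number" && value % 2 === 0;
  },
};

console.log(2 instanceof EvenNumber); // true
console.log(3 instanceof EvenNumber); // false
```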

Proposed Changes

Specifically, I suggest that the runtime semantics of the typeof operator (12.5.5) would change to the following:

1. Let val be the result of evaluating UnaryExpression.
2. If Type(val) is Reference, then
    a. If IsUnresolvableReference(val) is true, return "undefined".
3. Set val to ? GetValue(val).
4. Let tag be ? Get(val, @@typeofTag).
5. If tag is not undefined, then
    a. Return ! ToString(tag).
6. Return a String according to Table 37.
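The steps above can be approximated in user-land as a rough sketch. Symbol.typeofTag does not exist in any engine, so the sketch uses a registry symbol as a hypothetical stand-in, and proposedTypeof is an illustrative name. It also skips null and undefined before the property lookup (an assumption of this sketch, since a property read on them would throw) and uses String() where the steps use ToString.

```javascript
// Hypothetical stand-in for the proposed well-known symbol.
const typeofTag = Symbol.for("Symbol.typeofTag");

function proposedTypeof(val) {
  // Steps 1-3 (reference resolution) happen before a function is called,
  // so they are not modelled here.
  if (val !== undefined && val !== null) {
    // Step 4: look up the tag on the value (walks the prototype chain).
    const tag = val[typeofTag];
    // Step 5: a defined tag wins over the built-in table.
    if (tag !== undefined) return String(tag);
  }
  // Step 6: fall back to the existing typeof table.
  return typeof val;
}

// Made-up class standing in for a shimmed future primitive.
class Decimal {
  get [typeofTag]() {
    return "decimal";
  }
}

console.log(proposedTypeof(new Decimal())); // "decimal"
console.log(proposedTypeof("hi"));          // "string"
console.log(proposedTypeof(null));          // "object"
```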

Table 1 containing the Well-Known Symbols (6.1.5.1) would be extended with a new row:

| Specification Name | [[Description]] | Value and Purpose |
| @@typeofTag | "Symbol.typeofTag" | A String valued property that is accessed with the typeof operator. |

Is there a use case that's not solvable via instanceof?

Giving typeof a well-known symbol hook will have a very real performance regression, and there will be security issues tied to it.

Symbol was a bit of a special case, because it is both a new primitive type and it has its own property namespace in objects. So, typeof sym checks are common. BigInt doesn't have this same property namespace (obj[1n] === obj[1]), so I don't think it'll be as common.
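A short illustration of the namespace point: BigInt keys are converted to the same string keys as numbers, while symbol keys live in their own namespace.

```javascript
const obj = {};
obj[1] = "number key";

// 1n is converted to the property key "1", the same key as the number 1.
console.log(obj[1n] === obj[1]); // true
console.log(Object.keys(obj));   // [ '1' ]

// A symbol key never collides with a string key, even with the same description.
const key = Symbol("1");
obj[key] = "symbol key";
console.log(obj[key] === obj[1]);                      // false
console.log(Object.getOwnPropertySymbols(obj).length); // 1
```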

Just as a heads up, I don't expect this to be popular. I recall reading, multiple times both in the notes and on es-discuss, that TC39 members (especially @ljharb, IIRC) were highly skeptical of this, specifically wanting typeof to remain unforgeable.

Also, allowing it to be a string would be a baaaaddddd idea, since it'd restrict the committee's ability to further extend the list of known primitive types (like typeof 1n === "bigint") in the future.

To give a concrete example of the perf regression @jridgewell mentioned, typeof x === "number" would have to change from IsNumber(x) to IsNumber(x) || IsNonNullish(x) && Equals(Get(x, @@typeofTag), "number"), a very non-trivial check.

@jridgewell In terms of use cases I did outline at least one in the proposal: being able to properly shim the current and future potential semantics of the typeof operator for purposes of polyfilling. As described in the proposal, I've made the observation that tools like Babel apply syntax transformations ahead of time to rewrite usage of typeof into a helper function to work around not being able to alter the runtime semantics such as is otherwise possible with other well-known symbols. It is simply about providing a hook into the typeof operator such as has been done for other operators, including instanceof.

I'm interested in learning more about the security concerns you are having? Can you elaborate?

@isiahmeadows the only additions to the current runtime semantics of typeof that I'm suggesting would be steps 4 and 5 in my proposed changes to section 12.5.5. Naturally there's a performance cost in the extra call to the Get abstract operation, but based on past decisions, it's an additional cost we've decided to pay for other operators such as instanceof with Symbol.hasInstance.

I do understand your concern about allowing it to be a string in terms of restricting the committee from revisiting the semantics in the future.

What I'm hoping to achieve is to make the language better suited for shimming hypothetical, not-yet-proposed future data types (as bigint was not that long ago) in a spec-conforming way, to improve backwards compatibility down the line and reduce the future need for compile-time magic such as what Babel already does today, as in my example.

I think the performance regressions would be much too severe for the proposed benefits, as mentioned above. typeof operations are much more common than instanceof operations, in my experience, and are also much cheaper.

The babel transform shows this as already possible, though admittedly clunky.

I'm interested in learning more about the security concerns you are having? Can you elaborate?

For instance, checking if a value is an object (and can have special toString behavior) before putting it into innerHTML (Trusted Types). Or, checking if it's a primitive before putting it into a Map or Object (Map polyfills).
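A sketch of how such a guard could be defeated if typeof were hookable. Everything here is illustrative: typeofTag is a hypothetical stand-in for the proposed symbol, proposedTypeof simulates the proposed semantics, and isPrimitiveString is a made-up guard.

```javascript
// Hypothetical stand-in for the proposed well-known symbol.
const typeofTag = Symbol.for("Symbol.typeofTag");

// Simulation of the proposed semantics: a defined tag overrides the table.
function proposedTypeof(val) {
  if (val !== undefined && val !== null) {
    const tag = val[typeofTag];
    if (tag !== undefined) return String(tag);
  }
  return typeof val;
}

// A guard that is meant to let only primitive strings through.
function isPrimitiveString(value, typeofImpl) {
  return typeofImpl(value) === "string";
}

// An object that claims to be a string but carries custom toString behavior.
const forged = {
  [typeofTag]: "string",
  toString() {
    return "<img src=x onerror=steal()>";
  },
};

const nativeTypeof = (v) => typeof v;
console.log(isPrimitiveString(forged, nativeTypeof));   // false
console.log(isPrimitiveString(forged, proposedTypeof)); // true
console.log(isPrimitiveString("safe", proposedTypeof)); // true
```

With today's typeof the forged object is rejected; under the simulated semantics it passes a check that was written on the assumption that typeof cannot be forged.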

instanceof was already a very slow operator, so making it slower wasn't much of a loss. typeof is currently very fast, so any regression will be noticeable. It's also much more common, meaning the extra checks could add up.

Currently, such an IsNumber operation, like most typeof value === "..." operations, doesn't require a single memory load in SpiderMonkey (it's all just inspecting the raw pointer itself - they use NaN boxing) and only a single one in V8. They all hardcode this. A Get operation could require over a dozen depending on how deep the prototype chain is and if there's a proxy target. And this isn't even accounting for the fact the cached type info for the object may very well be megamorphic and so it bails out.
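As a rough model of why a Get can be expensive, the sketch below counts how many objects a failed property lookup has to visit. It is only an illustration of prototype-chain depth: countLookupSteps is a made-up helper, and real engines use inline caches and hidden classes rather than a naive walk like this.

```javascript
// Count how many objects are visited before a key is found or the
// prototype chain ends. Primitives are boxed with Object() first.
function countLookupSteps(value, key) {
  let steps = 0;
  for (let o = Object(value); o !== null; o = Object.getPrototypeOf(o)) {
    steps++;
    if (Object.prototype.hasOwnProperty.call(o, key)) break;
  }
  return steps;
}

class A {}
class B extends A {}
class C extends B {}

const missing = Symbol("missing");
console.log(countLookupSteps(Object.create(null), missing)); // 1
console.log(countLookupSteps([], missing));                  // 3
console.log(countLookupSteps(new C(), missing));             // 5
```

In the common case the tag would be absent, so every hooked typeof on an object would pay for the full walk unless the engine can cache the miss.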

Contrast this with instanceof, where Symbol.hasInstance is called on the type, not the object. This is much easier to cache and optimize, since all the polymorphism is normally in the object of object instanceof Type, while it's Type[Symbol.hasInstance](object) that's being looked up and called. So no megamorphic method lookups here and no performance cliff.
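A small demonstration that the hook lives on the right-hand operand: for ordinary classes, object instanceof Type gives the same answer as calling Type[Symbol.hasInstance](object) directly, whatever the left-hand value is. Shape and Circle are made-up classes.

```javascript
class Shape {}
class Circle extends Shape {}

// The left-hand values vary widely, but the method lookup is always on Shape.
const values = [new Circle(), new Shape(), {}, 42];
const results = values.map((v) => [
  v instanceof Shape,
  Shape[Symbol.hasInstance](v),
]);

console.log(results);
// [ [ true, true ], [ true, true ], [ false, false ], [ false, false ] ]
```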

Also, instanceof was technically unreliable anyways - Array.isArray(Object.create(Array.prototype)) returns false, while Object.create(Array.prototype) instanceof Array returns true. In fact, that's why the Array.isArray function came into fruition in the first place in ES5, to provide a way to unforgeably check if something was an array.
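The difference is easy to reproduce:

```javascript
// An ordinary object that merely inherits from Array.prototype.
const fake = Object.create(Array.prototype);

console.log(fake instanceof Array); // true  (only checks the prototype chain)
console.log(Array.isArray(fake));   // false (checks for a real array object)
console.log(Array.isArray([]));     // true
```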

Thanks for your input,

It would seem that the added performance cost, coupled with concerns that this would restrict future changes to the typeof operator for web compat reasons, is what's holding back this proposal.

Both are very valid arguments I think.

Here's what I'm trying to achieve: to make it possible to pass as close to 100% of the test262 conformance tests as possible with an implementation of the ECMAScript 2020 language spec written in ECMAScript itself, in order to provide very standards-compliant polyfills. In trying to do so, I've identified only a few limitations so far: the Global Symbol Registry that is shared across realms being one, and the semantics of typeof being another.

While I realize that this is a very specific, niche use-case, ECMAScript is unique in that it is embedded in an ever-evolving web platform where at least I think I'm seeing multiple standards bodies move toward opening up more internals for extendability and, let's call it "polyfillability". We're seeing this movement with things like the work on CSS Houdini and the work on Custom Elements (specifically ElementInternals, for example seen here and here).

I believe it is important to consider whether ECMAScript as a language has the low-level building blocks necessary to ensure backwards compatibility when new functionality comes to the language (of which well-known symbols are a very good example), and the typeof operator is one such primitive that is, by design, unforgeable and not extensible.

I understand why, based on your arguments, even though it is unfortunate. A substantial group of Web developers are using Babel, either directly or indirectly, and a large part of this group is already paying a far higher price for their typeof usage due to the aforementioned _typeof helper, and is thus not reaping the benefits of any engine-level optimizations of typeof.

You can create such global symbols with Symbol.for, albeit not completely transparently. Most polyfills for new symbols use this.

You can trivially avoid this by using @babel/preset-env and serving one bundle to IE and a separate one to other browsers. (You can do this via some very basic UA sniffing to detect IE.) And in practice, the _typeof helper is only necessary when you're doing a bare typeof value, when you're explicitly doing typeof value === "symbol", typeof value === "bigint", or similar, or when you're doing typeof value === "object" (where it has to exclude those other exceptions). It doesn't appear when you do typeof value === "number" or typeof value === "string", for instance, so you're not paying for this for all primitive types, or even most.

I'm referring to the Global Symbol Registry not being polyfillable. If Symbol.for is present in the engine, so is the Global Symbol Registry. But without that, there's no way to make well-known symbols or symbols created with Symbol.for referentially equal across realms. All I'm saying here is that because of that, a symbol polyfill will never be able to pass all test262 tests.
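For readers less familiar with the registry, this is the behavior being discussed, shown within a single realm. The key "app.typeofTag" is an arbitrary example.

```javascript
// Symbol.for looks the key up in the global registry, creating an entry
// on first use, so repeated calls return the very same symbol.
const a = Symbol.for("app.typeofTag");
const b = Symbol.for("app.typeofTag");
console.log(a === b);          // true
console.log(Symbol.keyFor(a)); // "app.typeofTag"

// Symbol() always creates a fresh, unregistered symbol.
const local = Symbol("app.typeofTag");
console.log(local === a);          // false
console.log(Symbol.keyFor(local)); // undefined
```

A native engine gives the same identity guarantee across realms; a user-land shim has no shared place to keep the registry, which is the limitation described above.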

Now, there's nothing we can do about that, and that's OK! This proposal is concerned with gearing the platform to be able to provide even better polyfills in the future for future APIs.

Symbols really pave the way to that. With them, we can tap into internal runtime semantics and even share values across realms. But one of the few primitives we can't tap into is typeof, hence the proposal.

I'm well aware of this, and sincerely hope Web developers at large are too, though I'll add that I believe a lot of teams treat legacy browsers as their baseline and unfortunately don't do differential bundling and serving. It doesn't help either that a lot of teams rely on tools that don't improve on this at all, and don't help push and educate on best practices.

But that's another discussion 🙂

I'm only mentioning babel to point to a well-known workaround in the ecosystem that requires a build step and exists only to support users using a Symbol polyfill, which definitely feels like a hack, but a hack with no alternative since the engine doesn't allow for it. Which is why I'm making this proposal.

Now, I can see why this probably won't fly due to performance concerns or concerns about compat issues in the future if a popular site for who-knows-what reason starts to manipulate this.

For the record, BigDecimal, currently a stage 0 proposal, may eventually be standardized and could presumably become a new primitive typeof value that cannot be shimmed and may - potentially - hold back adoption because of it. That is what this proposal is trying to address.

It didn’t hold back adoption of symbols, and likely won’t for BigInt or BigDecimal or Records and Tuples.

It's hard to say to what degree, if any, this concern of "polyfillability" has held back BigInt so far, but I think it's fair to argue that one strong reason why symbol enjoys wide use is that popular tooling and libraries have gotten around to extending their type checks with an additional test for polyfilled symbols, even to the point of syntax transformation, such as with Babel as I mentioned.

As with anything, we may think it's an acceptable limitation and that workarounds in user-land are the way to go. I would hope we could find a way to balance extensibility with constraints that meet the performance and compatibility concerns. But the concerns raised so far indicate resistance to this proposal, so I don't see that happening, unfortunately.