Standardizing a JS IR (instead of JS0 & JSSugar).

There has been some hype recently around the JS0 & JSSugar proposal. In short, it suggests that browsers fork the current version of JavaScript to create JS0, which would generally avoid adding new features, and that JSSugar be developed as a new, complementary standard implemented by (compliant) transpilers. You would then need a transpiler to access the full JavaScript language (JS0 plus JSSugar).

I’d like to propose an alternative, which instead standardizes an IR (intermediate representation) that can efficiently represent a subset of JavaScript.

The IR would contain similar information to what you would find in an AST. However, ASTs tend to be big, and JSON isn’t great for serializing large trees, so we would need a binary file-format, similar to Wasm.

The IR would only include the minimum amount of information required to run the program and generate exceptions.

The browser wouldn’t need to recreate the source, as any lexical information (line numbers, column numbers, token strings, etc.) is taken from the source language. The IR would include a source URL, so the browser would have everything it needs to pull up the source file and highlight the error (without any sourcemaps).

A lot of the lexical information could be omitted from the IR. For example, if we store the token lengths, we don’t need the token strings, and we could use an increment-line-number instruction, and omit the line numbers as well.
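To make that concrete, here is a rough sketch of how a stream of token lengths plus an increment-line-number instruction could be turned back into a line and column. The opcode names (`NEXT_LINE`, `SKIP`, `TOKEN`) and the `locateToken` helper are made up for illustration, and the `SKIP` instruction for whitespace is an extra assumption of mine, not part of any spec.

```javascript
// Hypothetical opcodes; nothing here comes from a real specification.
const NEXT_LINE = 0; // move to the start of the next source line
const SKIP = 1;      // advance the column over whitespace or comments
const TOKEN = 2;     // a token of the given length

// Walk the stream and recover the position of the token with the given index.
function locateToken(stream, tokenIndex) {
  let line = 1;
  let column = 1;
  let seen = 0;
  for (const [op, n = 0] of stream) {
    if (op === NEXT_LINE) {
      line += 1;
      column = 1;
    } else if (op === SKIP) {
      column += n;
    } else if (op === TOKEN) {
      if (seen === tokenIndex) return { line, column, length: n };
      seen += 1;
      column += n;
    }
  }
  return null; // no such token
}

// Lexical stream for the two-line source:
//   let x = 1;
//   x.y();
const lexicalStream = [
  [TOKEN, 3], [SKIP, 1], [TOKEN, 1], [SKIP, 1], [TOKEN, 1], [SKIP, 1], [TOKEN, 1], [TOKEN, 1],
  [NEXT_LINE],
  [TOKEN, 1], [TOKEN, 1], [TOKEN, 1], [TOKEN, 1], [TOKEN, 1], [TOKEN, 1],
];

// Token 7 is the `y` in `x.y()`: line 2, column 3, length 1.
const position = locateToken(lexicalStream, 7);
```

No line numbers or token strings are stored anywhere; the engine only needs the stream and the source URL to highlight the right span.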

We can entirely omit lexical information for tokens that will never be the target of an exception, whether generally (due to JavaScript semantics) or in a specific instance (based on static analysis of the source language by its compiler).

There are also general optimizations that would apply here, such as using relative offsets to keep the numbers low, then LEB128 encoding them.
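As a minimal sketch of that optimization: the helper names below (`encodeULEB128`, `encodeOffsets`, `decodeOffsets`) are my own, and the code assumes small, ascending offsets that fit comfortably in 31 bits.

```javascript
// Encode a non-negative integer as unsigned LEB128:
// 7 bits per byte, low bits first, high bit set on every byte except the last.
function encodeULEB128(value) {
  const bytes = [];
  do {
    let byte = value & 0x7f;
    value >>>= 7;
    if (value !== 0) byte |= 0x80; // continuation bit
    bytes.push(byte);
  } while (value !== 0);
  return bytes;
}

// Store ascending offsets as deltas from the previous offset, each LEB128-encoded.
function encodeOffsets(offsets) {
  const out = [];
  let previous = 0;
  for (const offset of offsets) {
    out.push(...encodeULEB128(offset - previous));
    previous = offset;
  }
  return out;
}

// Reverse the process: decode each delta and accumulate it.
function decodeOffsets(bytes) {
  const offsets = [];
  let previous = 0;
  let value = 0;
  let shift = 0;
  for (const byte of bytes) {
    value |= (byte & 0x7f) << shift;
    if (byte & 0x80) {
      shift += 7;
      continue;
    }
    previous += value;
    offsets.push(previous);
    value = 0;
    shift = 0;
  }
  return offsets;
}

const tokenOffsets = [1000, 1004, 1010, 1300];
const encodedOffsets = encodeOffsets(tokenOffsets); // 6 bytes
```

Encoding each absolute offset separately would take 2 bytes apiece (8 bytes in total), while the deltas 1000, 4, 6 and 290 fit in 6.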

The IR would not support every JavaScript feature, instead aiming at a subset that makes sense as a target for transpilers. In future, it could add features that only make sense for the IR (without corresponding JavaScript syntax), for example, statically typed functions.

An IR would provide a much nicer target for languages that (currently) transpile to JavaScript, regardless of which source language they start from. They would not require parsing (again), and could be permitted to supply extra information that allows the compiled IR to perform better.

If we assume Google are correct (Vanilla JS is passé, and only really used by “hobbyists”), and we imagine that transpilers had a decent IR they could target, then web development would naturally migrate to using JSIR transpilers and languages. The issues JS0 & JSSugar attempt to address would be resolved, and we’d have much wider scope for improving the Web over time (filling in the gap between JavaScript and WebAssembly).

If you hadn't seen it before: tc39/proposal-binary-ast on GitHub (Binary AST proposal for ECMAScript)

Thank you, @aclaymore. No, I wasn't aware of that. I'll read through it this evening. Much appreciated.

From my perspective, binaryAST (or the JS IR you propose) and js0/jssugar are not alternatives but rather have a layered relationship, where binaryAST/IR should be based on js0. In fact, I feel that the lack of js0 might be one of the reasons why binaryAST has struggled to progress (vendors don't want to be forced to constantly update the binaryAST standard as the language evolves).

JS0 is a bad proposal in part because it says "all the cruft so far should be part of the core forever". I much prefer the idea of taking the time to figure out what a good minimal API surface is from which all other parts of JS can be easily built up.

I've repeatedly experienced bad extensions to the language from the perspective of transpilability. The first big example was class syntax. It specifically established support for the extends Builtin syntax which can never be transpiled correctly.
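A quick illustration of the problem, using one classic ES5-style lowering (the `LegacyStack` function is only a stand-in for what a transpiler might emit, not the output of any particular tool):

```javascript
// Native class syntax: instances are real arrays.
class Stack extends Array {}

const nativeStack = new Stack();
nativeStack[0] = 'a';
// nativeStack.length is 1 and Array.isArray(nativeStack) is true

// ES5-style lowering: Array ignores the `this` it is given and returns a
// fresh array, so the instance stays an ordinary object.
function LegacyStack() {
  Array.apply(this, arguments);
}
LegacyStack.prototype = Object.create(Array.prototype, {
  constructor: { value: LegacyStack, writable: true, configurable: true },
});

const legacyStack = new LegacyStack();
legacyStack[0] = 'a';
// legacyStack.length is still 0 (inherited from Array.prototype)
// and Array.isArray(legacyStack) is false
```

The instance inherits the array methods but not the special `length` behavior, and nothing in ES5 lets you add it back.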

This meant that everyone (but particularly library authors) needed to either wait 10 years to use the full feature set, or consider extending builtins one of "JS: the bad parts" forever.

I would still like to see something like JS0 happen because I would see it as a declaration that the committee intends never to do something that damaging to the continuity of the language again.

(ESM was also a huge break in continuity, and thus something I would consider as 100% needing to be part of any minimal subset of JS)

[The] binaryAST ... should be based on js0. In fact, I feel that the lack of js0 might be one of the reasons why binaryAST has struggled to progress.

@hax - I agree. The JavaScript Binary AST Proposal is not ambitious enough, and the benefits are not great enough. Something closer to what we're discussing would be a little more technical, but would do a lot more for vendors and the Web.

JST files would be much smaller than source files (making minification obsolete), without requiring a decompression step. They would also parse much faster, and everything would be in the right order (for example, bindings before code, so issues like the TDZ wouldn't exist). You would also be able to replace Base64 payloads with true binary data, which is 25% smaller than the equivalent Base64 string.
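The 25% figure is easy to check. Here is a small sketch (the 3000-byte payload is arbitrary, and `btoa` is assumed to be available, as it is in browsers and recent Node versions):

```javascript
// An arbitrary 3000-byte binary payload.
const payload = new Uint8Array(3000);
for (let i = 0; i < payload.length; i++) payload[i] = i % 256;

// Base64 turns every 3 bytes into 4 characters.
const base64Text = btoa(String.fromCharCode(...payload));

const rawSize = payload.length;        // 3000
const base64Size = base64Text.length;  // 4000
const saving = 1 - rawSize / base64Size; // 0.25
```

Put the other way round, Base64 inflates binary data by a third, so embedding the raw bytes saves a quarter of the encoded size.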

We would get to remove a ton of Bad Parts without breaking the Web, and could add good parts that don't belong in the JavaScript that browsers need to validate on the fly, like types.

Think about how much time TypeScript spends every day establishing the types of JavaScript values, and how much effort V8 makes to try to reestablish the same information on the fly. It's crazy to throw that information away. Even a JS transpiler (with no type annotations) can confidently establish a decent percentage of types, or at least know that a value belongs to some simple set (Array or undefined, for example). Anything bound with const has a fixed type, just for starters.

JS0 is a bad proposal in part because it says "all the cruft so far should be part of the core forever". I much prefer the idea of taking the time to figure out what a good minimal API surface is from which all other parts of JS can be easily built up.

@conartist6 - Completely agree. Moving to JSTs would provide a perfect opportunity to work this stuff out. The existing proposal is really just a more sophisticated approach to minification. We need something more like WebAssembly (plausibly just a major extension to Wasm) that only includes things that transpilers need to communicate to engines.

It's imperative that any code compiled to a JST behave identically to the same code stored in plain JavaScript. It's good to have different languages that compile to JS, but the compiled JavaScript should seamlessly interoperate with anything written in JavaScript (or anything else that compiles to JavaScript). Having a lingua franca is what keeps the community coherent.

For example, we'd be able to do stuff like declare a function's locals using a LEB128 encoded list of type-indices, then reference locals by index, but anyone importing the JST would still get a regular JS function.
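Roughly what I have in mind, as a sketch only: the type table, the byte layout (a count followed by one type index per local) and the helper names are all hypothetical, loosely modelled on how Wasm declares locals.

```javascript
// Hypothetical type table shared by the whole file.
const TYPE_TABLE = ['any', 'number', 'string', 'boolean'];

// Read one unsigned LEB128 integer starting at `position`.
// Assumes small values that fit in 31 bits.
function readULEB128(bytes, position) {
  let value = 0;
  let shift = 0;
  let byte;
  do {
    byte = bytes[position++];
    value |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return { value, position };
}

// Decode a locals section: a count, then one type index per local.
// The result maps each local's index to its declared type.
function decodeLocals(bytes) {
  let { value: count, position } = readULEB128(bytes, 0);
  const locals = [];
  for (let i = 0; i < count; i++) {
    const entry = readULEB128(bytes, position);
    locals.push(TYPE_TABLE[entry.value]);
    position = entry.position;
  }
  return locals;
}

// Three locals: two numbers and a string.
const localTypes = decodeLocals([3, 1, 1, 2]);
// localTypes[0] is 'number', localTypes[2] is 'string'
```

The body of the function would then refer to `local 0`, `local 2` and so on, with no names stored at all, while callers still see an ordinary JS function.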