Extending the parser?

dmchurch · March 26, 2024, 5:05am

I'm curious if the idea of providing programmatic access to and extension of the ECMAScript parser has been discussed in the past. I've read some of the discussions of how to incorporate TypeScript-style type declarations into the standard and the difficulties therein, but it seems like the obvious answer is "just let the host use the TypeScript parser", so... I assume this must have been discussed at some point?

To be clearer, I'm talking about providing a standard API for defining a filter between lexer and parser, so that the token stream could be modified and the result parsed according to standard ECMAScript rules; I imagine this would be in the vein of Proxy objects, in the sense of "user code being allowed to customize standard behavior without breaking language invariants". So, no changing what can and can't be a token, that sort of thing.
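To make the shape of the idea concrete, here is a rough sketch of what such a filter could look like. Everything in it is hypothetical: no engine exposes a token-stream hook today, and the names `Token`, `TokenFilter`, `lex`, `applyFilter`, and `stripAnnotations` are invented for illustration. The toy lexer only stands in for the engine's real one, and the filter naively treats every `: identifier` pair as a type annotation, which real code (object literals, ternaries, labels) would break.

```typescript
// Hypothetical token shape; a stand-in for whatever the engine's lexer would emit.
type Token = { kind: "ident" | "number" | "punct"; text: string };

// A filter sits between lexer and parser: tokens in, tokens out.
type TokenFilter = (tokens: Token[]) => Token[];

// Toy lexer (assumption): identifiers, integers, and single-character punctuators only.
function lex(source: string): Token[] {
  const pattern = /([A-Za-z_$][\w$]*)|(\d+)|(\S)/g;
  const tokens: Token[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(source)) !== null) {
    const kind = m[1] !== undefined ? "ident" : m[2] !== undefined ? "number" : "punct";
    tokens.push({ kind, text: m[0] });
  }
  return tokens;
}

// Invariant check: a filter may add, drop, or reorder tokens, but every token
// it emits must be one the lexer itself could have produced.
function applyFilter(source: string, filter: TokenFilter): Token[] {
  const filtered = filter(lex(source));
  for (const tok of filtered) {
    const relexed = lex(tok.text);
    if (relexed.length !== 1 || relexed[0].kind !== tok.kind) {
      throw new TypeError(`filter produced an invalid token: ${JSON.stringify(tok.text)}`);
    }
  }
  return filtered; // the engine would hand this to the standard parser
}

// Example filter: drop a ":" punctuator together with the identifier that follows it.
const stripAnnotations: TokenFilter = (tokens) => {
  const out: Token[] = [];
  for (let i = 0; i < tokens.length; i++) {
    const tok = tokens[i];
    const next = tokens[i + 1];
    if (tok.kind === "punct" && tok.text === ":" && next !== undefined && next.kind === "ident") {
      i++; // skip the type name as well
      continue;
    }
    out.push(tok);
  }
  return out;
};

const result = applyFilter("let x: number = 1;", stripAnnotations);
console.log(result.map((t) => t.text).join(" "));
```

The re-lexing check in `applyFilter` is one possible way to express the "no changing what can and can't be a token" constraint: the filter controls which tokens reach the parser, but not what counts as a token.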

I wasn't able to find any proposals or anything along these lines, but I'm more than willing to believe that (a) I don't know the right terminology to search by, and/or (b) there's a fundamental flaw in the concept and it's been discussed and discarded.

aclaymore · March 27, 2024, 8:52am

If anything, there has been a proposal for the opposite: tc39/proposal-binary-ast, the Binary AST proposal for ECMAScript.

Parsing is an important aspect of startup time, and the desire tends towards making it more efficient.

Being able to intercept the token stream from JS at runtime would be difficult to achieve in a performant way.

dmchurch · March 27, 2024, 1:24pm

I certainly don't disagree, but it's also a bit orthogonal to the problem, imo. Large and performance-critical sites are always going to use bundlers, transpilers, etc, so there's no need for the engine to be able to do that at runtime. The use case for the JS engine being able to understand/ignore typing hints - or at least, the one that occurs to me - is in being able to natively support low-criticality or in-development code.

As an example, I just participated in the 7DRL game jam, and I chose to write my team's entry in JSDoc-annotated JS that could be directly interpreted by the browser, so as to avoid wasting any time on making sure the whole team understood the build process while still giving me the benefits of type-safety. It would have been much nicer if I could have written the code in TS from the get-go, allowing the browser to interpret it natively during development, and only added a transpilation/build step once we were done.

I'd actually consider that a variation on the theme, not an opposite. Both an AST syntax and an ignorable-typing syntax fall under the category of "providing the parser with an alternate syntax stream that is not generated by lexing/parsing standard JS source code". There's no need for a binary AST syntax to get baked into the spec if there's already a standard way to provide an alternate parser stream. Obviously a JS-coded "binary AST parser" won't be performant enough, but that doesn't matter: one, anything that can be written in JS can be written in WASM, and two, anything with a standard JS interface can be special-cased into the JS engine itself as long as there's a graceful degradation available. The beauty of the ECMAScript spec is that engines don't have to follow the algorithms to the letter, so long as all the observable effects are correctly produced.
