Feature Request: Quoting Expressions

Hello. I would like to propose a new keyword, quote(), for quoting language expressions into abstract syntax trees or expression trees. This language feature appears to require a standard abstract syntax tree or expression tree model.

The feature resembles:

// returns an AST of the expression
quote(1+1);
// returns an AST which captures utilized variables
quote(data[x] + data[y]);
// works with lambda expressions, captures utilized variables
quote((x, y) => x + y + z);

I hope to find supporters and champions for the proposed language feature. I am looking forward to discussing these ideas with you. Thank you.

2 Likes

Interesting idea. What are some use cases for this feature?

A first use case, for discussion, involves enhancing the inspectability and explainability of computer programs.

The following examples explore the scenario where a developer implements a guard section for a function, to check parameters before proceeding to other computation. Below, the variables r1, r2, and r3 are not simply Booleans, but are objects with two properties: value, which is a simple Boolean, and explanation, which could be or utilize an abstract syntax tree or expression tree.

function foo(x, y)
{
  var r1 = reasoning.assert(1+1, 2);
  var r2 = reasoning.assert(x.length, 3);
  var r3 = reasoning.assert(y, 1);

  if(r1.value && r2.value && r3.value)
  {
    // ...
  }

  return reasoning.and(r1, r2, r3);
}

In the above example, the result of the expression x.length is provided to the assert function, and the expression itself is unavailable.

With a quote() keyword, one could enhance the inspectability of the results of these assertions, as the following sketch intends to show:

function foo(x, y)
{
  var r1 = reasoning.assert(1+1, 2, quote(1+1), quote(2));
  var r2 = reasoning.assert(x.length, 3, quote(x.length), quote(3));
  var r3 = reasoning.assert(y, 1, quote(y), quote(1));

  if(r1.value && r2.value && r3.value)
  {
    // ...
  }

  return reasoning.and(r1, r2, r3);
}

If one desired to utilize the runtime evaluation of abstract syntax trees and expression trees, one could design the above scenario to, instead, more succinctly resemble:

function foo(x, y)
{
  var r1 = reasoning.assert(quote(1+1), quote(2));
  var r2 = reasoning.assert(quote(x.length), quote(3));
  var r3 = reasoning.assert(quote(y), quote(1));

  if(r1.value && r2.value && r3.value)
  {
    // ...
  }

  return reasoning.and(r1, r2, r3);
}

So, if one or more assertions should fail in the called function, foo, a calling function could inspect its result's explanation property, utilizing abstract syntax trees or expression trees, to process why. A developer could also print and view a string from such a tree to examine a situation, e.g., while debugging.

This first use case is a general-purpose one which sketches, and hopefully illustrates, how a quote() keyword could be used to enhance the inspectability and explainability of computer programs.

There are also other use cases to consider, some interesting ones involving lambda expressions.

What do you think of this first use case?

2 Likes

I see,, a feature from Julia. But that needs a lot of new syntax, new language semantics and the most tricky thing of all working with the JS AST. Without Julia style macros, I guess this feature won't going to be very useful. I don't know maybe we can bridge that with decorators.

If you haven’t already seen it, you might be interested in https://www.sweetjs.org/

2 Likes

@jithujoshyjy, thank you. I will take a closer look at the Julia programming language, its metaprogramming, and its macros. I will also take another look at the decorators proposal.

@aclaymore, thank you. I hadn’t seen Sweet.js before.

Yes, the JS AST would be an important component and would, in theory, include an API for manually creating expression trees.

That is, with an Expression global object with static methods,

var r1 = reasoning.assert(quote(1+1), quote(2));
var r2 = reasoning.assert(quote(x.length), quote(3));
var r3 = reasoning.assert(quote(y), quote(1));

could be described as being semantically equivalent to something resembling:

var r1 = reasoning.assert(Expression.add(Expression.constant(1), Expression.constant(1)), Expression.constant(2));
var r2 = reasoning.assert(Expression.getMember(Expression.constant(x), 'length'), Expression.constant(3));
var r3 = reasoning.assert(Expression.constant(y), Expression.constant(1));

The latter syntactic form is more cumbersome, more error-prone, and less readable.

Also, as envisioned, expected usage would include building and compiling lambda expressions, and invoking compiled functions. For example:

var two = quote( (x, y) => x + y ).compile()(1, 1);

I like this idea, though there's a lot of edge cases and nuances you need to be careful of - JS isn't exactly a Lisp where you can just blindly represent code as S-expressions, and even Clojure's syntax isn't exactly simple (and it's among the simplest).

I started developing a TypeScript/JavaScript expression tree project, including with an evaluate(...) method on the Expressions. I made some solid progress on it including implementing invokable LambdaExpressions.

I am presently working on a toString() method for the various types of Expressions (I might use the astring project by transforming the Expressions to estree). Generating JavaScript for Expressions could be useful in one approach to compiling some lambda expressions. Another approach for compiling some lambda expressions might involve emitting and dynamically loading WebAssembly.

I found one of the edge cases and nuances of JavaScript that @claudiameadows might have been referring to: there is no goto statement in JavaScript but there are labels for nestable loops which work with break and continue statements.

Update: That scenario is implemented. The expression tree project is coming along nicely.

I am finding that an evaluate(context?: EvaluationContext): any method on an Expression base class adds some complexity as the project sketch moves to include all of JavaScript's expressiveness.

With such an evaluate method on Expression, end-users could simply:

var x_plus_1 = quote(x + 1).evaluate();

or

var x_plus_1 = quote(x + 1).evaluate(new EvaluationContext());

I wonder whether that would be a desired feature.

Exploring the topic, I have implemented evaluation for simple expressions, conditional logic expressions, expression blocks, various types of loops (including labeled loops and related break and continue scenarios), and lambda expressions.

Challenges for implementing expression trees, in particular when individual Expressions can be evaluated, include transforming expression trees for: (1) closures, (2) generator functions, and (3) async/await functionality.

It appears to be the case that implementations which support evaluating arbitrary Expressions would tend to want to copy and transform copies of expression trees for runtime evaluation when the Expressions involve generators or async/await functionality.

For those interested in these topics, there are projects, e.g., traceur and regenerator, for transforming source code and abstract syntax trees.

Thinking more about a quote keyword, abstract syntax trees, and the visitor pattern, a default Expression tree model could support the visitor pattern so that that one could call:

var myNode = quote(x + 1).accept(myVisitor);

to obtain an abstract syntax tree from their preferred model.

Also possible is that, using something like:

Expression.setDefaultVisitor(myVisitor);

syntax like:

quote(x + 1)

could automatically route its output through a specified default expression tree visitor without developers having to call accept on each use of the quote syntax.

A quote language feature could, then, as indicated above, route its content through a specified default expression tree visitor to provide output from arbitrary models, e.g., estree.

In summary, it is possible that developers could use a language feature like quote to get content in the abstract syntax tree model of their choice.

My worry is that any code depending on syntax tree logic could break whenever the syntax of Javascript gets updated. It just seems like a fragile thing to depend on, but maybe I'm wrong here.

There is binary AST proposal (further reading about its design).
It would codify a sufficiently future-proof AST structure.

@lightmare, thank you for sharing the binary AST proposal.

@theScottyJam, you raise a good point, in particular as JavaScript is improved each year.

In these regards, an analogue is C# and .NET . When LINQ and the System.Linq.Expressions.* namespace debuted, they were widely celebrated and remain useful to developers. As the C# language has evolved, the System.Linq.Expressions.* expression tree model has remained relatively still (see, for example: this project).

I think that developers would tend to want all of the language features of each edition of JavaScript to be quoteable. Some consumers of the accompanying abstract syntax tree / expression tree model, however, advanced users, would tend to need to update their software as the JavaScript language and model evolved. Perhaps there is a solution, e.g., with settings and configuration.

Something like quote() and an accompanying abstract syntax tree / expression tree model would convenience many programming, metaprogramming, and optimization scenarios, e.g., facilitating runtime code generation with reactive observables and queryables.

Also, interestingly, on the topic of TypeScript, one can envision typed quoting. That is, developers might desire for something like the following to work:

function foo(lambda: LambdaExpression<(x:number) => number>)
{
   // ...
   let compiled: (x:number) => number = lambda.compile();
   // ...
}

foo(quote((x:number) => x + 1));
foo(quote(function (x: number) { return x + 1; }));

though providing typed quoting seems to preclude providing developers with the capability to set a default expression tree visitor.

As we've been discussing these interesting topics, I've been developing some rapid prototypes.

Here is a sketch (from a working prototype) of a LambdaExpression<T> type which can represent arrow functions, anonymous functions, functions, and generator and async flavors thereof.

import * as astring from 'astring';

//...

export class LambdaExpression<T extends Function> extends Expression
{
    constructor(
        parameters: readonly ParameterExpression[],
        body: BlockExpression,
        id?: string,
        options?: {
            arrow?: boolean,
            expression?: boolean,
            generator?: boolean,
            async?: boolean
        }
    )
    {
        super('lambda');
        this.#parameters = parameters;
        this.#body = body;
        this.#id = id;
        this.#arrow = getOption(options, 'arrow', false);
        this.#expression = getOption(options, 'expression', false);
        this.#generator = getOption(options, 'generator', false);
        this.#async = getOption(options, 'async', false);
        this.#compiled = null;
    }

    #parameters: readonly ParameterExpression[];
    get parameters(): readonly ParameterExpression[]
    {
        return this.#parameters;
    }

    #body: BlockExpression;
    get body(): BlockExpression
    {
        return this.#body;
    }

    #id: string;
    get id(): string
    {
        return this.#id;
    }

    #arrow: boolean;
    get arrow(): boolean
    {
        return this.#arrow;
    }

    #expression: boolean;
    get expression(): boolean
    {
        return this.#expression;
    }

    #generator: boolean;
    get generator(): boolean
    {
        return this.#generator;
    }

    #async: boolean;
    get async(): boolean
    {
        return this.#async;
    }

    override accept<R, C>(visitor: Visitor<R, C>, context?: C): R
    {
        if (visitor == null) throw new Error("Visitor is null.");
        return visitor.visitLambda<T>(this, context);
    }

    #compiled: T;

    compile(): T
    {
        if (this.#compiled == null)
        {
            var __value__ = null;
            var visitor = new ToESTreeVisitor();
            var context = new ToESTreeVisitorContext('__captures__');

            var wrap = Expression.lambda([context.capturesParameter], Expression.block([Expression.return(this)]), null);
            var assign = Expression.assign(Expression.variable('__value__'), wrap);

            var node = assign.accept(visitor, context);
            var code = astring.generate(node);

            eval(code);

            this.#compiled = __value__(context.captures);
        }

        return this.#compiled;
    }
}

To compile these lambda expressions, the prototype evaluates source code generated by the astring library from estree abstract syntax trees produced by a special-purpose tree visitor.

The prototype supports captures, e.g., the variable obj_y in:

let obj_y = { value: 123 };
foo(quote(function (x: number) { return x + obj_y.value; }));

via a type of Expression, CaptureExpression, which is processed by the aforementioned special-purpose tree visitor.

Clarifying, for:

let obj_y = { value: 123 };

let x = Expression.parameter('x');
let y = Expression.capture(obj_y);
let y_value = Expression.member(y, 'value');
let add = Expression.addition(x, y_value);
let ret = Expression.return(add);
let lambda = Expression.lambda<(x: number) => number>([x], Expression.block([ret]), 'foo');

let compiled = lambda.compile();

the compile() method on LambdaExpression<T> generates the following source code for the eval() function:

__value__ = function (__captures__) {
  return function foo(x) {
    return x + __captures__.variable_0.value;
  };
}

while simultaneously assigning captured runtime objects, e.g., obj_y, to the properties, e.g., variable_0, of the captures-related data structure.

Hopefully, the compiling of lambda expressions would, eventually, be further optimized, e.g., with built-in or native implementations.

@theScottyJam, brainstorming with respect to ideas for syntax tree logic API given that JavaScript syntax is updateable, some ideas from the document object model come to mind.

In these regards, there are DOMImplementation::hasFeature() and Node::isSupported(). These methods were designed to return a Boolean value indicating whether an indicated feature, optionally at an indicated version level, were provided, available, or supported.

Based on that reasoning, we could provide something like:

Expression.hasFeature(feature: string, version?: string): boolean;

and, with something like that, developers could query whether a feature, e.g., expression type, existed in the implementation.

For example, imagining that for-of-loop's were a new language feature:

Expression.hasFeature('forOfLoop');

However, a quote from the DOM specification: "hasFeature() originally would report whether the user agent claimed to support a given DOM feature, but experience proved it was not nearly as reliable or granular as simply checking whether the desired objects, attributes, or methods existed."

That is, developers could, instead, do something like:

if(Expression.forOfLoop)
{
   // ...
}
if(typeof Expression.forOfLoop === 'function')
{
   //...
}

So, brainstorming, we could provide API for granular feature detection and it is also possible for developers to otherwise detect whether types of expressions exist in an implementation's abstract syntax tree model or expression tree model.

1 Like

I like this idea; it might help bridge the gap between decorators and macro; i.e if it could produce the AST in some form like JSON.
I mean consider this:

const greet = word => console.log("hello ", word)

const quotted = quote(greet("world"))
/* returns
// just an over simplification I assume
{
    type: "FunctionCall",
    head: {
        type: "Identifier",
        value: "greet"
    },
    tail: [
        {
            type: "StringLiteral",
            value: "world"
        }
    ]
}
*/

const greetings = @sayGoodbyeInstead quotted
// returns "goodbye world"
/* this does something like
    if(quotted.type == "FunctionCall")
        console.log("goodbye ", quotted.tail[0].eval())
*/

@jithujoshyjy, thank you. I think that you are onto something with these ideas about transforming abstract syntax trees / expression trees. In addition to potential uses of decorators, the visitor pattern could be useful for processing and transforming these trees.

let output = quote(x + 1).accept(visitor);

Whether utilizing decorators or the visitor pattern, processing and transforming quote-obtained trees would be very useful. It could be useful, for instance, for some aspect-oriented programming scenarios.

let obj_y = { value: 123 };
let output = quote(function (x: number) {return x + obj_y.value;}).accept(visitor);
let compiled = output.compile();
1 Like

A first use case, for discussion, involves enhancing the inspectability and explainability of computer programs.

Is that really a use case? If yes, why don't engines provide such functionality already through their development tools? ASTs tend to be quite large compared to source code, so I doubt that anyone would prefer to read ASTs instead of source code. And even if someone would need this, why would this be needed at runtime? Why can't a snippet by copied into e.g. AST Explorer (that's my current workflow, in case im unsure about operator precedence) ?

Concerning the .accept and .compile thoughts: do you really need to rewrite ASTs at runtime?
Isn't that something which existing tools like Babel / ESLint already cover?

@Jonas_Wilms, thank you. That first use case isn't the best use case for quote(). I had been looking at code contracts and thinking about static checking and runtime checking scenarios:

contract.requires(x != null);
contract.requires(quote(x != null));

as well as thinking about Error objects which could, in theory, utilize abstract syntax trees / expression trees to explain to developers the nature of runtime errors (beyond or in addition to developer-specified string-based error messages).

Other, perhaps more interesting, use cases include uses of lambda expressions with libraries for iterables and observerables. One could utilize quote() (or whatever it might eventually syntactically resemble) to facilitate the various optimizations possible with IQueryable in .NET (see also: LINQ). One could, similarly, utilize it to optimize scenarios utilizing reactive libraries, e.g., RxJS and https://github.com/dotnet/reactive .

For instance, if we look at the RxJS filter operator

filter<T>(predicate: (value: T, index: number) => boolean, thisArg?: any): MonoTypeOperatorFunction<T>

we could create an operator which, instead of receiving a function, receives a lambda expression tree as an input argument:

filter<T>(predicate: LambdaExpression<(value: T, index: number) => boolean>, thisArg?: any): MonoTypeOperatorFunction<T>

Then, as operators are combined upon iterables or observables, runtime optimizers, transpilers, and other tools could inspect the abstract syntax trees or expression trees.

Developers could utilize quote() to conveniently process inline functions, anonymous functions, arrow functions, and so forth, into these LambdaExpressions.

Concerning the .accept and .compile thoughts: do you really need to rewrite ASTs at runtime?
Isn't that something which existing tools like Babel / ESLint already cover?

I think that these capabilities would be useful. Above are but a few concrete examples of what is possible with metaprogramming and runtime code generation.

as well as thinking about Error objects which could, in theory, utilize abstract syntax trees / expression trees to explain to developers the nature of runtime errors

I doubt that (having worked with a lot of ASTs in the last few months). Again: Programmers are used to read / write source code, ASTs tend to be quite large and, well, "abstract". Could you share a concrete example were an AST would be useful for debugging?

Other, perhaps more interesting, use cases include uses of lambda expressions with libraries for iterables and observerables. One could utilize quote() (or whatever it might eventually syntactically resemble) to facilitate the various optimizations possible with

One can already use Function.prototype.toString to get the sourcecode of a function object and then parse that with a parser of your choice. I doubt though that this kind of reflection in JS is necessary. I've seen multiple C# to JS rewrites already were fn(obj => obj.property) was trivially replaced with fn(obj, "property") (using dynamic property access instead of reflection). Here again I wonder what concrete usecases you'd have in mind, which can't already be solved through other means.

Also for what it's worth, "runtime code generation" and "runtime optimizers" work on a level way below ASTs (and that's definetly out of scope of JS).