I did some benchmarking work two years ago while evaluating JavaScript patterns and libraries for Structured Concurrency (SC). The specifics of SC are not relevant to this thread; the important part is that writing ergonomic async code with SC properties really calls for language- or runtime-level support for task cancellation, something TC39 has left unaddressed for almost a decade. It is impossible to achieve SC with implicit cancellation and cleanup using the async/await and Promise primitives alone, which is why we are now seeing more libraries building their own task-scheduling runtimes on top of Generators. A few examples: Effect (formerly effect-ts), Effection, ember-concurrency, and co.js.
There are also projects that rely heavily on Generators to achieve complex execution control, such as implementing Algebraic Effects; engine262 is a notable example.
To be honest, the ergonomics of writing complex concurrent code with Generator-based SC patterns felt so good that they are now my default choice for prototyping. However, one decisive factor made me conclude that Generators are not ready for production use, where our code runs on millions of devices across tens of platforms, including embedded devices with tight CPU and memory budgets.
That factor is, of course, performance. Specifically: Generator delegation via the yield* syntax. As the delegation stack grows deeper, the time to call next() on the top of the chain grows at an O(N) rate. Under the current ECMA-262 spec, every generator on the chain must be sequentially resumed only to call next() on the generator it delegates to, and the value yielded by the innermost generator must then travel all the way back up the chain, resuming every generator again just to hand the value to the outer iterator. As a result, a seemingly O(N) iteration over a generator may actually incur O(N^2) cost.
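To make the shape of the problem concrete, here is a minimal sketch of such a delegation chain (the names are illustrative, not from any particular library):

```javascript
// Builds a `yield*` delegation chain of a given depth: each relay
// generator does nothing except delegate to the next one, so every
// next() call on the outermost generator must wake the entire chain
// before the producer runs, and again to pass the value back out.
function* producer(n) {
  for (let i = 0; i < n; i++) yield i;
}

function* relay(inner) {
  yield* inner;
}

function makeChain(depth, n) {
  let gen = producer(n);
  for (let i = 0; i < depth; i++) gen = relay(gen);
  return gen;
}

// Draining n items through a chain of depth d costs O(n * d) generator
// resumptions rather than O(n) -- the quadratic blow-up described above.
function drain(depth, n) {
  let sum = 0;
  for (const v of makeChain(depth, n)) sum += v;
  return sum;
}

const shallow = drain(0, 1000);
const deep = drain(100, 1000);
console.log(shallow === deep); // same result, very different cost
```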
This observation is not merely theoretical. I actually went to the length of rewriting our code using Generators and yield*, and measured that it performed much worse than before.
To demonstrate this, I also wrote a simple benchmark of the primitives. The graph below shows how the time grows with the depth of the delegation stack.
As we can see, async/await as a baseline has the best performance of them all, because the JS engine can directly resume the top stack frame. The yield* delegation chain performs far worse. Interestingly, if I transpile all the ES6 function* and yield* syntax with Regenerator and run it on the Regenerator runtime, it actually performs better!
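For reference, a stripped-down version of the yield* side of such a microbenchmark might look like this (sizes and names are illustrative, not my exact harness):

```javascript
// Drain a fixed number of items through `yield*` chains of growing
// depth and time each run; per-item cost should grow with depth.
function* source(n) {
  for (let i = 0; i < n; i++) yield i;
}

function* relay(inner) {
  yield* inner;
}

function buildChain(depth, n) {
  let g = source(n);
  for (let i = 0; i < depth; i++) g = relay(g);
  return g;
}

function measure(depth, n = 10_000) {
  const start = performance.now();
  let sum = 0;
  for (const v of buildChain(depth, n)) sum += v;
  const elapsed = performance.now() - start;
  return { sum, elapsed };
}

// Elapsed time should grow roughly linearly with depth, even though
// the number of items drained stays constant.
const results = [0, 16, 64, 256].map((depth) => ({
  depth,
  ms: measure(depth).elapsed,
}));
console.table(results);
```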
Python's language designers anticipated this problem when they introduced Generator delegation, and described an optimization in the proposal itself: PEP 380 – Syntax for Delegating to a Subgenerator | peps.python.org
> A possible strategy is to add a slot to generator objects to hold a generator being delegated to. When a `__next__()` or `send()` call is made on the generator, this slot is checked first, and if it is nonempty, the generator that it references is resumed instead. If it raises StopIteration, the slot is cleared and the main generator is resumed. This would reduce the delegation overhead to a chain of C function calls involving no Python code execution.
If JavaScript defined a similar optimization in the spec for Generators (both Generator Functions and user-defined generators), this pattern would become more production-ready for performance-demanding use cases. One possibility is to add a [Symbol.delegateIterator]: Iterator property to the Iterator class, which native Generator Functions would set on themselves automatically when delegating to another iterator/generator via yield*. User-defined generators could opt into the same optimization by adopting the property.
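In userland the idea could be sketched roughly like this. Symbol.delegateIterator does not exist today, so DELEGATE below is a stand-in, and a real spec-level version would do the short-circuiting inside the engine rather than via property reads:

```javascript
// Stand-in for the proposed Symbol.delegateIterator.
const DELEGATE = Symbol("delegateIterator");

class FastDelegator {
  constructor(inner) {
    this[DELEGATE] = inner;
  }
  next(v) {
    // Chase the delegation slots to the leaf iterator and resume it
    // directly. This still walks the slots, but with cheap property
    // reads instead of resuming every generator frame on the chain;
    // a spec-level version would reduce this to internal calls, as
    // PEP 380 does for Python.
    let target = this;
    while (target[DELEGATE] !== undefined) target = target[DELEGATE];
    return target.next(v);
  }
  [Symbol.iterator]() {
    return this;
  }
}

function* producer() {
  yield 1;
  yield 2;
}

// Wrapping the producer 1000 deep no longer means 1000 generator
// resumptions per next() call.
let it = producer();
for (let i = 0; i < 1000; i++) it = new FastDelegator(it);

const result = [...it];
console.log(result); // [ 1, 2 ]
```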
