Proposal - generator literals

Add a new literal type that:

  • Has an iterable value;
  • Has syntax similar to the existing array literal, including spread notation;
  • Evaluates its element sub-expressions lazily, as its value is iterated over.
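As a sketch of the lazy evaluation in the last bullet: today one can emulate the proposed `*[ f(), g() ]` with an iterable whose generator body contains the element expressions (this follows the general desugaring approach of the reference implementation; `f`, `g` and the `calls` log are invented here purely for illustration):

```javascript
// Hand-desugared sketch of the proposed `const seq = *[ f(), g() ];`
// Element sub-expressions run only when the iterator is advanced.
const calls = [];
const f = () => { calls.push('f'); return 1; };
const g = () => { calls.push('g'); return 2; };

const seq = {
  *[Symbol.iterator]() {
    yield f();   // evaluated on the first next()
    yield g();   // evaluated on the second next()
  }
};

console.log(calls.length);      // 0 -- nothing evaluated yet
const it = seq[Symbol.iterator]();
console.log(it.next().value);   // 1
console.log(calls.join(','));   // "f" -- g() still not evaluated
```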

The full text and a reference implementation (a Babel fork) are available here: babel-tc39-generator-literals/packages/babel-plugin-proposal-generator-literals at generator-literals · MaxMotovilov/babel-tc39-generator-literals · GitHub

This proposal stemmed from a recent discussion on this forum, and I would like to credit bergus with the original idea that evolved into it.

5 Likes
  1. I love the proposal, and it'd complement the iterator helpers proposal greatly.
  2. I'd recommend enabling issues in proposal repos before posting them here - it'd be very helpful.
  3. Your syntax potentially conflicts with multiplication. (Consider: a <newline> * [ 1 ]) This could be worked around via precedence hacking, but is worth taking into consideration.
  4. Have you considered async iterables, too? Support for async iterables would trivialize things like stream concatenation and header/footer appending for one.

Done!

Your example is well formed syntactically, but not semantically, right? From a practical standpoint, the implementation should not get confused, as *[ is recognized by the tokenizer (similar to e.g. a- -(b) vs a-- (b), as the latter is also syntactically but not semantically well formed). However, if this is indeed considered a problem, an alternative syntax could be [* ... ].

Not seriously yet, as I'd rather be taking baby steps; besides I'd need quite a bit more time to implement this than I really could afford just now -- and here's a plug for my current employer, System 1, for accepting this implementation as a project for the internal hackathon :stuck_out_tongue: thus giving me a sorely needed timeslice.

With + and - it's obvious: lhs ++ rhs is invalid, while both lhs + +rhs and lhs+ + rhs are valid. But with * it's a slightly different story, because both lhs ** rhs and lhs * rhs are valid.

So probably the same tokenization approach could be applied to differentiate a ** [...] from a * *[...].

I take it you're referring to ** exponentiation? The same argument applies to it as to multiplication. BTW, a-- (b) appears to be syntactically valid -- applying a function call to the result of a postfix decrement -- although of course senseless semantically, same as multiplying by (or exponentiating to) an array.

It's technically possible to have breaking syntactic changes (see: let destructuring), and it's worth exploring. But even if it's decided that a breaking syntax change is okay, it's still worth bringing up.

That's fair, and I wasn't saying you had to implement it now. It just plays into my previous point because async isn't a reserved keyword (though if we can break code with iterables, we can also almost certainly break code with async * [ <expr> ] as well).

There seems to be no immediate need for async generator literals: if the enumerated values are effectively async expressions yielding promises, those may be awaited by the caller (the original Promise.all() scenario falls loosely into this category). I can see two scenarios in which the generator has to be async:

  1. A generated value creates, by side effect, a promise that has to be awaited to produce a subsequent generated value.
  2. A spread operator applied to an async iterable.

Both cases do exhibit a need for a syntactically distinct async variety of the generator literal (as you say, async *[ or possibly async [*). I can't, however, claim enough insight at this point to propose a complete facility.
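For what it's worth, the stream-concatenation and header/footer case can already be sketched with today's async generators; a hypothetical `async *[ header, ...body, footer ]` might desugar to something along these lines (the helper asyncConcat and its shape are my own invention, not part of the proposal):

```javascript
// Sketch only: one possible desugaring of a hypothetical async generator
// literal `async *[ header, ...body, footer ]`. Spread elements delegate to
// an async (or sync) iterable; plain elements are yielded as-is.
// Strings are deliberately treated as plain values, not spread char-by-char.
const isSpreadable = p => p != null && typeof p !== 'string' &&
  !!(p[Symbol.asyncIterator] || p[Symbol.iterator]);

function asyncConcat(...parts) {
  return {
    async *[Symbol.asyncIterator]() {
      for (const part of parts) {
        if (isSpreadable(part)) {
          yield* part;   // spread: delegate to the iterable
        } else {
          yield part;    // plain element
        }
      }
    }
  };
}

// Header/footer appending around a stream of chunks:
async function* chunks() { yield 'body-1'; yield 'body-2'; }
(async () => {
  for await (const piece of asyncConcat('header', chunks(), 'footer')) {
    console.log(piece);  // header, body-1, body-2, footer
  }
})();
```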

1 Like

There's an issue with the implementation of "infinite" sequences. The Fibonacci sequence example aborts after 11418 elements with "RangeError: Maximum call stack size exceeded". There'd need to be some tail-yield-optimization so that it doesn't create and nest into additional generators.


function* fib() {
  yield 0;
  yield 1;
  let i = 0, j = 1;
  while (true) {
    const sum = i + j;
    yield sum;
    i = j;
    j = sum;
  }
}

for (let x of fib()) { console.log(x); }

seems to work fine - it very quickly gets beyond Number.MAX_SAFE_INTEGER, of course, so I tried this bigint version:


function* fib() {
  yield 0n;
  yield 1n;
  let i = 0n, j = 1n;
  while (true) {
    const sum = i + j;
    yield sum;
    i = j;
    j = sum;
  }
}

for (let x of fib()) { console.log(x); }

This one I stopped at 9927656873698174607556888523981642512252930977437873270634981411079868531944752515532220823353495967007285729221116269946574585537464317318982153043674862538776892758736710122074447183711005233377374342684102415105934251326129230051254186111433087584345235996229320673176641026425891054757407810085371682136637473114644963876312118994261707370459849848255199473619419690734464760374550013721621993186045777079298248268891562037827002781176267780531626508021960426581905411118142674368985475889418590819596662249089241420976215318547891428386789287535157448706271267869307184401557964978657298413214801956916733663005970857213278474722133230079715044355376088228507186873584216752459343252599337441976381160162948187634291342169974945175829584440560850328056092301279998613737827517877999690459377308959880653166451681349574847379910657949086318064353369074212016617970272137554839065772415772229941878508786851950035480958218546123324223311258771859985185817261564965476041129632853502234096291725785867831041409997405076657545801978041437311505114265161108814189590529027445492791667617693073124592744759126234176163106080489906532130288749216686239413968136341904311882609743701511661289747676299820844966264464050941752650154375278025615568007344862786760829944086150578625770303707062223113437454884005888515183562631704590594728715088386128356036678343178926996198532722583330428040034714986337225291037440206100454714332497158790762273509526551046271008884457733687626769949882152785007875349032407039045899448312892272209671302338649991275692160795147823561104119734708277665789299272813857324916435710465930014255491721508364294420828358061871407222255867329771329300482764226054739685514157473352120612616183692402293025825377984681340327425558335102016990807470742341157300336467731924126904325947579067558491454636079023798919276731899041564261870680228828102719083641615351523452311078686097958922439736686687048319635921160779618237877475165814181526096256110983375
70560343649459585532333454133201371066302331206565571399742517n.

When I ran let i = 0; for (let x of fib()) { console.log(i++); }, I stopped it at 1083387 items, which is much larger than 11418.

Certainly there are different ways to implement any algorithm - and some of them won't work in a given language or under given resource constraints - but since there are some ways that do work, I'm not clear on the problem.

(If I missed anything or got anything wrong here, please feel free to correct me, but I suspect the point stands anyway.)

I'm talking about this babel plugin, which transforms:

let a = 0, b = 1;
const fib = *[a+=b, b+=a, ...fib];

into a recursive generator:

let a = 0, b = 1;
const fib = {[Symbol.iterator]: function*() {
  yield a += b;
  yield b += a;
  yield * fib;
}}

Of course, if you make that non-recursive, then it can actually be infinite.
The Fibonacci sequence is just a placeholder here; the recursive implementation cannot generate an endless sequence even of constants or random numbers:

let ones = *[1, ...ones];
let rand = *[Math.random(), ...rand];
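To make the failure mode visible, here is the desugared `ones` from above with a depth counter added (the counter is my instrumentation, not plugin output): each pass through the `...ones` tail opens one more live generator, so the nesting grows linearly with the number of elements consumed, until the stack overflows.

```javascript
// Desugared `let ones = *[1, ...ones];`, instrumented to show that each
// element consumed nests one more live generator (hence the RangeError
// on long iteration). The `depth` counter is added for illustration.
let depth = 0;
let ones = {
  [Symbol.iterator]: function* () {
    depth++;
    yield 1;
    yield* ones;   // recurses into a brand-new iterator of the same literal
  }
};

let taken = 0;
for (const x of ones) {
  if (++taken >= 5) break;
}
console.log(depth);  // 5 -- one nested generator per element consumed
```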
1 Like

Sounds like it’d be a good issue to file on the Babel plugin?

The plugin was intended to be a sample implementation of generator literals, mostly for demonstration purposes. The issue in question is best solved by a generic TCO mechanism; with tc39/proposal-ptc-syntax (an explicit syntactic opt-in for tail calls) in place, *[a+=b, b+=a, ...continue fib] should work.

Thank you, I wasn't aware of that proposal. However, TCO alone is not enough for infinite generators: TCO squashes the next() call stack, but doesn't eliminate the intermediary iterator objects whose next method is called. As I wrote earlier, you'd need a tail-yield optimization that would allow reusing not just the stack frame, but also the iterator object.
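For comparison, here is a hand-rolled sketch of what such a tail-yield optimization would produce (my illustration, not the plugin's actual output): turning the self-referential tail of `*[1, ...ones]` into a loop keeps both the stack depth and the number of live iterators constant, so iteration is bounded only by time.

```javascript
// Tail-yield flattening by hand: the recursive tail `...ones` becomes a
// loop instead of `yield* ones`, so stack depth stays constant.
const ones = {
  *[Symbol.iterator]() {
    while (true) {   // the self-referential tail, turned into iteration
      yield 1;
    }
  }
};

let count = 0;
for (const x of ones) {
  if (++count >= 1000000) break;   // far past where the recursive form dies
}
console.log(count);  // 1000000, no RangeError
```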

That particular proposal is currently inactive - some browsers refuse to implement it. However, I think TCO is an important part of this proposal. A generator literal could potentially be made to be a special case, where browsers are required to do TCO (and whatever extra generator-related optimizations are needed to make TCO work properly for generators) - hopefully, there wouldn't be too much friction in that.

Historically - here I mean the brief and inglorious history of my proposal :wink: - the immediate problem generator literals were intended to solve didn't involve recursion. The Fibonacci example was just something that came to mind as an interesting possibility. I do agree that "tail yield optimization" is separate from TCO and is worth addressing; probably in the wider context of all generator functions. However even the plain TCO -- well understood and widely used in other languages -- appears to make very slow inroads in Javascript :unamused:

Although, upon further reflection, it appears possible to optimize the very specific case of

const lit = *[ ... , ...lit]

in Babel, because there's an unambiguous guarantee that both lits refer to the same entity. My temptation to also optimize var lit = and let lit = is tempered by:

let fib = *[a+=b, b+=a, ...fib];
fib = // something entirely different, like []

in which case the unoptimized (i.e. present-day) code will not behave identically to the optimized version, due to the difference in binding times.
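Concretely (renaming the literal to `seq` to avoid clashing with the Fibonacci state, and simplifying the elements; the values are invented): in the unoptimized desugaring, the tail spread reads the binding at the moment it executes, so a reassignment between next() calls is observable -- behavior a loop-optimized version would not reproduce.

```javascript
// Unoptimized desugaring of `let seq = *[1, ...seq];` (constant element
// for clarity). The tail reads the *current* binding of `seq`.
let seq = {
  *[Symbol.iterator]() {
    yield 1;
    yield* seq;   // `seq` is looked up when this line runs, not at creation
  }
};

const it = seq[Symbol.iterator]();
console.log(it.next().value);  // 1
seq = [7, 8];                  // rebind to something entirely different
console.log(it.next().value);  // 7 -- the tail now spreads the new array
```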

I don't have a whole lot of time on my hands for this exercise, having recently changed employers, but perhaps it'll prove easy.

@lightmare - thanks for bringing the matter up!

That's how I got here, from the thread concerning Promise.all :)
I just thought the recursive generator expression was pretty cool, and got curious whether it could be freed from the limits of the finite call stack and memory.

Here's my attempt at flattening the generator's recursive tail: https://github.com/lightmare/babel/blob/865e6838e5f8afffa124d0c836c3e62799c4c3a5/packages/babel-plugin-proposal-generator-literals/src/index.js (it's based off your branch, only rebased on current main). I split the elements into head and tail (contrary to the conventional meaning of these terms, the head is a list and the tail is a single element). Elements in the head still recurse into new iterators, but the tail is simply assigned to the current iterator.

With that, examples like this don't exceed the call stack:

On another note: in case you're going to proceed polishing this proposal, may I suggest rebranding it "generator expressions"? I suppose you named them literals because the spec calls [a] an ArrayLiteral, but I think *[a] is semantically closer to e.g. FunctionExpression, because it creates a closure without evaluating a.

First - wow! I have just updated main and generator-literals branches in my repo to the latest trunk, could you submit a PR from your fork? Will try and review/merge before Sunday ends :smiley:

On another note: in case you're going to proceed polishing this proposal, may I suggest rebranding it "generator expressions"?

It's a great question and likely a good suggestion; my problem is I don't really know how it will work out. My understanding of the TC39 process is that a proposal requires a champion to get past stage 0, and I'm not really in a position to be one. Aside from not being a member of the standards org in any capacity, I also expect to have very little to do with Javascript for the foreseeable future due to my recent career move away from FE and towards infra, with its C++, Python and what have you. I'd frankly be more than happy to pass this particular torch...

Reading through your code -- it appears you have implemented a dynamic TYO (tail-yield optimization) as opposed to a static one: checking whether the tail is a generator literal/expression at runtime rather than at compile time. At first sight it may prove advantageous in being able to handle more cases -- one gen-lit referring to another, for one -- but the potential unintended consequences are a little harder to analyze. Once I have your PR, I will try to contrive some tests to break it :laughing:

1 Like

Absolutely, go ahead! :D