ASI: Why is the empty statement exception needed?

ASI is not performed if the inserted semicolon would be parsed as:

  1. one of the two semicolons in the header of a ForStatement
  2. an empty statement

I can sort of see why we need (1) - in a ForStatement header, the semicolon is more of a separator than a statement terminator, so we shouldn't interpret any intervening line breaks between the header expressions as statement-terminating line breaks.
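
A quick way to see rule (1) in action is to ask the engine whether a snippet parses at all. This is only a sketch: `parsesAsBody` is a hypothetical helper name I made up, built on the `Function` constructor, which compiles the source text without running it.

```javascript
// Hypothetical helper (not from the spec or any library):
// returns true if `source` parses as a function body.
function parsesAsBody(source) {
  try {
    new Function(source); // compiles only; the body is never executed
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err; // anything else is unexpected, so rethrow it
  }
}

// Line breaks inside the header are not turned into semicolons.
const headerWithNewlines = parsesAsBody("for (let i = 0\n i < 3\n i++) {}");
// Explicit semicolons between the header expressions are required.
const headerWithSemicolons = parsesAsBody("for (let i = 0; i < 3; i++) {}");

console.log(headerWithNewlines);   // false
console.log(headerWithSemicolons); // true
```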

But I'm not sure why (2) is needed. Say we had a for-loop without a body:

for(let i = 0; i < 9; i++) // SyntaxError - expected expression, got end of script

I was expecting ASI to kick in here but it doesn't because of exception (2).

Why can't the above be interpreted as a ForStatement whose Statement is an unterminated EmptyStatement? Under this interpretation, ASI should kick in and terminate it:

for(let i = 0; i < 9; i++);
//                        ^ ASI terminating the empty statement

It could, in principle, but it would mask bugs. For example, if ASI could insert an empty statement, a bare if (condition) would be a complete program, when you almost certainly meant to put something in the body there.
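
To make that concrete, here is a minimal sketch of what engines accept today. `canParse` is a hypothetical helper for illustration (it just wraps the `Function` constructor, which compiles the text without executing it), and `condition` is a placeholder name that is never evaluated.

```javascript
// Hypothetical helper: does `source` parse as a function body?
const canParse = (source) => {
  try {
    new Function(source); // compile only, never run
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
};

const results = {
  bareIf: canParse("if (condition)"),               // no body, no ASI
  bareFor: canParse("for (let i = 0; i < 9; i++)"), // no body, no ASI
  explicitEmpty: canParse("if (condition);"),       // empty statement written out
  explicitBlock: canParse("if (condition) {}"),     // empty block
};

console.log(results);
// { bareIf: false, bareFor: false, explicitEmpty: true, explicitBlock: true }
```

So the empty body is still available, you just have to write it yourself, which makes it visible as a deliberate choice.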


Thank you @bakkot. Also, if you happen to know of a thread that talks about this, please reference it if possible. I couldn't find anything on GitHub or esdiscuss.

This behavior predates the use of GitHub or esdiscuss by quite some time - it's present in the very first published edition of the spec.

If you really want to dig into the history, Allen Wirfs-Brock has a paper documenting the history of the language, which talks some about ASI and which has extensive references. You can also poke around Ecma's archives - the relevant year would be 1997. At a quick glance, I see that the revision history in e.g. Version 0.9 of the spec mentions (in D.4.3) that on February 21, 1997, the draft specification was revised "to incorporate the rule that a semicolon is not inserted if it would be treated as an empty statement", but no mention of where that rule comes from.

I suspect there simply isn't documentation of the historical reason for this rule. You could always @ Brendan Eich on twitter, on the off chance he was present for and remembers the details of this discussion from twenty years ago.


Oh, I didn't realise it went that far back. All the polarisation around ASI made it seem like it was a newer addition. Thanks for the links.

Lol that cracked me up. Your word is as good as his for me, I was just curious because I couldn't find anything.

There are nonobvious cases where ASI on empty statement would accidentally apply (or not apply). For example, the following code:

if (false)
    const bar = 1;

which is a syntax error (because the if-branch accepts statements, not declarations), would be silently “corrected” by ASI as:

if (false);
const bar = 1;
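
A small sketch of both halves of that claim, assuming a standard engine such as Node. The first snippet is compiled with the `Function` constructor to confirm it is rejected today; the second is the hypothetical ASI result written out by hand with an explicit semicolon, plus a `return bar;` that I added only so the outcome can be observed.

```javascript
// Today: a lexical declaration cannot be the body of an `if`.
let originalIsSyntaxError = false;
try {
  new Function("if (false)\n    const bar = 1;");
} catch (err) {
  originalIsSyntaxError = err instanceof SyntaxError;
}

// What the hypothetical ASI would produce, written out explicitly.
// The `return bar;` line is an addition for observation only.
const runRewritten = new Function("if (false);\nconst bar = 1;\nreturn bar;");

console.log(originalIsSyntaxError); // true
console.log(runRewritten());        // 1
```

The rewritten version runs the declaration unconditionally, even though the condition is false.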

Conversely, if ASI could insert empty statements, consider the subtle difference between:

while (do_it_until_false()) // <--- implicit semicolon inserted here
const foo = 1;

and:

while (do_it_until_false())  // <--- no semicolon inserted here
bar();
const foo = 1;
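
The same difference can be observed in code that runs today by writing the semicolon explicitly. In this sketch, `makeCountdown` is a hypothetical stand-in for do_it_until_false(), and a counter takes the place of bar(); both are assumptions made for illustration.

```javascript
// Hypothetical stand-in for do_it_until_false(): true `n` times, then false.
function makeCountdown(n) {
  let left = n;
  return () => left-- > 0;
}

function withEmptyBody() {
  let calls = 0;
  const doIt = makeCountdown(3);
  while (doIt());   // the explicit empty statement is the whole loop body
  calls += 1;       // runs once, after the loop has finished
  return calls;
}

function withStatementBody() {
  let calls = 0;
  const doIt = makeCountdown(3);
  while (doIt())    // no semicolon: the next statement is the loop body
  calls += 1;       // runs on every iteration
  return calls;
}

console.log(withEmptyBody());     // 1
console.log(withStatementBody()); // 3
```

One character decides whether the following line belongs to the loop, which is the kind of distinction that would become invisible if the semicolon were inserted silently.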

I can’t speak to the original motivation, but my educated guess is that the risk/benefit ratio of such a “feature” was deemed too high.

(There are other ways to tweak the grammar in order to avoid such confusions (example), but disabling ASI on empty statements is certainly the simplest and most robust one.)


Thank you for the examples @claudepache.

I understand we should refrain from ASI in the first example since it's most likely a bug, as @bakkot pointed out, so silently "fixing" it would go against the programmer's intent.

But it wasn't as clear to me in the case of loops, because they can be self-contained, e.g.

for (var i = 1; i < 11; i++) // SyntaxError today; with ASI this loop would get an empty body
const foo = 1;
console.log(i); // would log 11 if ASI applied

The loop above doesn't need a body, so I didn't see anything wrong with ASI kicking in here. But again, this likely isn't what the programmer intended. And, in your second example, I can see how inserting a semicolon in while(...) \n const foo = 1; but not in while(...) \n bar(); could cause confusion and potentially even end up creating a "bug" in the former case (if the programmer was expecting const foo = 1 to be parsed as the loop body). Exception (2) clears all this up.

I was confused by exception (2) because it reduces the predictability of ASI. But ASI is all about inferring the programmer's intent, which isn't the most predictable thing to begin with. Even so, it's amazing to see how carefully the authors have designed it to account for all these different possibilities.

Just FYI: I mostly code in ASI style, and I always use explicit {} for empty loop bodies when I use them.
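
For instance, something along these lines (a minimal sketch; the function name `countToEleven` is just for illustration):

```javascript
function countToEleven() {
  // `{}` makes the empty body explicit, so nothing depends on ASI.
  for (var i = 1; i < 11; i++) {}
  return i; // `var` is function-scoped, so `i` is still visible here
}

console.log(countToEleven()); // 11
```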