provide expectations with matchAll and replaceAll

The TL;DR of this story is that matchAll and replaceAll = shenanigansAll because that All suffix suggests to the developer, in the first place, that All is what she/he meant.

The current standard throws an error meaning:

  • there is room for improvements, as throwing means no backward compatibility issues (AFAIK)
  • there is a check on the RegExp flag, meaning no performance would be impacted

On the latter point: since flags are checked already, my proposal is that in these All-suffixed methods the /g flag is added out of the box, as throwing makes no sense:

  • the user asked for All occurrences of a specific RegExp pattern
  • the user needs to wrap even these simple expectations behind a try/catch that helps nobody understand, at runtime, what caused the issue
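For reference, a minimal sketch of the current behavior described above. The `tryReplaceAll` helper is hypothetical and only exists to capture the error for illustration:

```javascript
// Hypothetical helper: wraps replaceAll in the try/catch the post complains about.
function tryReplaceAll(str, pattern, replacement) {
  try {
    return { ok: true, value: str.replaceAll(pattern, replacement) };
  } catch (error) {
    return { ok: false, errorName: error.name };
  }
}

// A RegExp without the g flag makes replaceAll throw a TypeError.
const withoutG = tryReplaceAll('aa', /a/, 'b');

// The same pattern with the g flag works as expected.
const withG = tryReplaceAll('aa', /a/g, 'b');

// matchAll behaves the same way: it throws as soon as it is called.
let matchAllThrew = false;
try {
  'aa'.matchAll(/a/);
} catch (error) {
  matchAllThrew = error instanceof TypeError;
}
```

In both cases the error is raised at call time, before any matching happens.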

I come from this tweet/x-post that validates the current status-quo: the /g flag on these methods is a requirement mistake: x.com

I hope there will be reasonable feedback or an outcome around this topic, as I find it hard to reason about myself ... I would expect that any All in the JS platform would serve the developer's intent instead of backfiring, and I don't see why a RegExp without a /g flag should throw, as opposed to being re-purposed/cloned with the right source and flags that include g, whenever the /g flag is not there already: win-win for past and future JS users :wave:

That was the original designs of both proposals, and the committee very strongly insisted on the current behavior instead.

Any insight into what the reason for that was? To me it feels counter-intuitive for any developer, junior to senior.

I believe a lot of it was desire to avoid the complexity of needing to create a spec-fictional regex for the non-g cases.

When digging, I realized that actually matchAll landed with this behavior - https://github.com/tc39/ecma262/pull/1480 - but discussion with replaceAll led to https://github.com/tc39/ecma262/pull/1716 for consistency. https://github.com/tc39/notes/blob/1f537ba9b96598b2152530cfca648530cea1c23a/meetings/2019-10/october-2.md#stringprototypereplaceall-for-stage-3 are the notes where that was decided.

So it’s a case of putting ease of spec editors and implementors over user needs? Is there no priority of constituencies at TC39?

As I recall, the thinking was more like, if you're passing a non-global regex into matchAll or replaceAll, you're making a mistake, and you intended it to be a global regex. This API choice forces you to fix your mistake, which might be as simple as RegExp(re, re.flags + 'g'), or, removing "All" from the call.
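A minimal sketch of the first fix mentioned above, assuming the caller is free to build a new RegExp. The `withGlobalFlag` name is hypothetical:

```javascript
// Hypothetical helper: returns the same RegExp if it is already global,
// otherwise a copy with the g flag appended to the existing flags.
function withGlobalFlag(re) {
  return re.global ? re : new RegExp(re.source, re.flags + 'g');
}

// Other flags (here: i) are preserved on the copy.
const replaced = 'aAa'.replaceAll(withGlobalFlag(/a/i), 'b');

// The same helper makes matchAll accept a non-global pattern.
const matches = [...'a1b22'.matchAll(withGlobalFlag(/\d+/))].map((m) => m[0]);
```

The original RegExp object is left untouched; only the copy carries the g flag.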

I don't agree with that - I wanted matchAll and replaceAll to unconditionally be replacements for match and replace - but there was enough support on the committee for that viewpoint that I consented for the sake of progress.

This assumes the entity writing the regex and the entity calling replaceAll() on it are the same, which is not true for regexes passed as parameters to e.g. library code, often without knowing how the library will use them and whether global matching is required (which is often an implementation detail, e.g. see the grammar definitions in PrismJS).

so ... 'aa'.replaceAll('a', 'b') produces bb without a sweat, meaning:

  • find all occurrences matching the String pattern a and replace them with b
  • no g flag needed, the /g is implicit due to the API name and expectations
  • meaning ... 'a' gets converted into /a/g out of the box ... another inconsistency of the current API

Now, 'aa'.replaceAll(/a/, 'b') throws an error:

  • the intent was still explicit, it's a replaceAll task to fulfill
  • the error makes no sense, literally and grammatically speaking ... it complains that replaceAll can't replace All without a flag

Now ... the whole thing seems extremely trivial to solve on user-land side:

function replaceAll(str, re, place) {
    let prev = str;
    while ((str = str.replace(re, place)) !== prev)
        prev = str;
    return str;
}

replaceAll('aa', /a/, 'b'); // bb
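One caveat worth sketching: the loop above re-scans its own output, so it can differ from a single-pass global replace when a replacement creates a new match, and it would never terminate if the replacement itself matches the pattern (for example replacing /a/ with 'aa'). The names `replaceAllLoop` and `replaceAllCloned` below are hypothetical, and the cloned variant is only one possible alternative, not what the standard does:

```javascript
// The loop-based shim from the post, renamed but otherwise unchanged.
function replaceAllLoop(str, re, place) {
  let prev = str;
  while ((str = str.replace(re, place)) !== prev)
    prev = str;
  return str;
}

// Hypothetical alternative: one pass with a copy of the RegExp that has g set.
function replaceAllCloned(str, re, place) {
  const all = re.global ? re : new RegExp(re.source, re.flags + 'g');
  return str.replace(all, place);
}

// Both agree on the simple case from the post.
const simpleLoop = replaceAllLoop('aa', /a/, 'b');
const simpleCloned = replaceAllCloned('aa', /a/, 'b');

// They diverge when a replacement produces a new match:
// the loop keeps going on its own output, the single pass does not.
const looped = replaceAllLoop('aab', /ab/, 'b');
const cloned = replaceAllCloned('aab', /ab/, 'b');
```

Here `looped` ends up as 'b' while `cloned` is 'ab', which is also what 'aab'.replaceAll(/ab/g, 'b') returns.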

hence I wonder what the real reason is for these APIs not being used to satisfy requirements/users' expectations ... I couldn't find anything relevant in those links to justify the current state of affairs.

edit: an easy way to go for the standard would be to perform my shim when the /g flag is not present, and do whatever it's doing already when it is ... that doesn't change the reference or behavior of the RegExp, it simply respects the user's intent and expectation when calling an All-suffixed method. If perf is to blame, let's tell users the g flag is preferred but not mandatory, imho.

indeed - and this API choice does indeed force such an entity to do something like .matchAll(re.global ? re : RegExp(re, re.flags + 'g')).

this is non-sensical and hostile for no good reason to me ... if a new RegExp is created, let it be a g one or any argument around preserving state makes no sense anymore :disappointed_relieved:

…which is not practical for use cases that do this kind of matching very frequently and/or on large amounts of text, so they now need to preserve references to the new RegExp objects so they don't recreate them each time. Lots of added complexity for no apparent reason. Error conditions are for when user intent is ambiguous. Here it is not, so creating an error condition is a UX antipattern.
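To illustrate the extra bookkeeping being described, here is a rough sketch of what library code might end up writing to avoid recreating RegExp objects on every call. The `toGlobal` and `highlight` names and the WeakMap cache are assumptions for illustration, not taken from any particular library:

```javascript
// Cache of global copies, keyed by the caller's original RegExp object.
const globalClones = new WeakMap();

// Hypothetical helper: returns a cached g-flagged copy of a non-global RegExp.
function toGlobal(re) {
  if (re.global) return re;
  let clone = globalClones.get(re);
  if (!clone) {
    clone = new RegExp(re.source, re.flags + 'g');
    globalClones.set(re, clone);
  }
  return clone;
}

// Hypothetical library function that receives a pattern it did not write.
function highlight(text, tokenPattern) {
  return text.replaceAll(toGlobal(tokenPattern), (match) => `[${match}]`);
}

const keyword = /\bif\b/;
const highlighted = highlight('if x if y', keyword);
```

A WeakMap is used so cached copies do not keep the caller's patterns alive longer than needed.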

replaceAll(regex, "b") is useless because the same can be done with replace(regex, "b").
It would only make sense if it could be used without "g".
The committee should have figured this out.

It’s not useless because it handles strings more reasonably, whereas previously a task as simple as replacing all instances of a string with another string required constructing regexes and escaping special characters.
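A small sketch of the difference for string patterns. The `escapeRegExp` helper is a commonly used userland pattern, written out here as an assumption rather than a built-in:

```javascript
// Userland helper: escapes characters that are special inside a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const input = 'a.b axb';

// Without escaping, the dot is a wildcard and 'axb' is replaced too.
const naive = input.replace(new RegExp('a.b', 'g'), 'X');

// The older workaround: escape the string, then build a global RegExp.
const viaRegex = input.replace(new RegExp(escapeRegExp('a.b'), 'g'), 'X');

// With replaceAll, the string is matched literally, no escaping required.
const viaReplaceAll = input.replaceAll('a.b', 'X');
```

Here `naive` is 'X X', while both `viaRegex` and `viaReplaceAll` are 'X axb'.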

I wrote "regex", not "string". For a string, it is useful. But for a regex, there is nothing new.
It should either have worked only with strings, or provided something new for regexes, such as the possibility of omitting "g".
It is an omission by the committee.