"o" flag for regex to allow overlapped results

"caratraany".match(/\wa\w/og) // "car", "rat", "raa", "aan"

What's the use case? What would be the benefit of the "o" flag over doing this with existing facilities?

A) shifting match start:

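const pattern = /\wa\w/g
const input = "caratraany"
let m
while ((m = pattern.exec(input)) !== null) {
  console.log("match:", m)
  // resume one position after the match start, so overlapping matches are found
  pattern.lastIndex = m.index + 1
}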
match: [ 'car', index: 0, input: 'caratraany', groups: undefined ]
match: [ 'rat', index: 2, input: 'caratraany', groups: undefined ]
match: [ 'raa', index: 5, input: 'caratraany', groups: undefined ]
match: [ 'aan', index: 6, input: 'caratraany', groups: undefined ]

B) capturing look-ahead assertion:

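// the look-ahead captures the full text in group 1; the trailing "." consumes one character
const pattern = /(?=(\wa\w))./g
const input = "caratraany"
for (const m of input.matchAll(pattern)) {
  console.log("match:", m)
}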
match: [ 'c', 'car', index: 0, input: 'caratraany', groups: undefined ]
match: [ 'r', 'rat', index: 2, input: 'caratraany', groups: undefined ]
match: [ 'r', 'raa', index: 5, input: 'caratraany', groups: undefined ]
match: [ 'a', 'aan', index: 6, input: 'caratraany', groups: undefined ]
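
If only the matched strings are wanted, variant B can be reduced to a flat array. The following is a rough sketch, not part of the proposal: it drops the trailing "." so the regex is a pure zero-width look-ahead, and relies on matchAll stepping forward after each empty match. The variable names are only for illustration.

```javascript
const input = "caratraany";

// The look-ahead consumes nothing, so every match is empty and the
// text of interest sits in capture group 1.
const overlappingPattern = /(?=(\wa\w))/g;

// Array.from's mapping callback picks group 1 out of each match object.
const results = Array.from(input.matchAll(overlappingPattern), (m) => m[1]);

console.log(results); // [ 'car', 'rat', 'raa', 'aan' ]
```

This still changes the regex itself, so it stays a workaround rather than a replacement for a dedicated flag.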

That is not it. The result should be just ["car", "rat", "raa", "aan"], and you have to add a lot of code to get that. With "o", the result could come in two formats, depending on whether "g" is also set.
Your B) is a workaround trick, which is not very good.
The benefit is a very short and clear syntax. Many new features can be polyfilled. So what, should nothing that can be polyfilled ever be added?
Regex engines should progress. For some reason, only JavaScript's regex stands still.
For example, startsWith() was added despite there being many easy ways to do the same thing.

New features are being added to regex in JavaScript. The ES2024 spec added support for the v flag.

OK, that is good. I googled for new features and didn't find anything. But it is only about Unicode, not a new search ability for regular text.

The first one is not a good solution because it is not aware of surrogate pairs. Real Unicode regexes may need to advance lastIndex by 2 if the character at the match start is a surrogate pair. (It's fine here because the pattern is not Unicode, but it doesn't generalize.)
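
To make that concrete, here is a minimal sketch of approach A with a code-point-aware step. The helper name overlappingMatches is made up for this example and is not an existing API; it assumes that stepping over the whole code point at the match start is the desired behavior for regexes with the u or v flag.

```javascript
// Hypothetical helper: yields every match, including overlapping ones.
function* overlappingMatches(pattern, input) {
  // Work on a copy so the caller's lastIndex is untouched, and force "g"
  // so that exec honors lastIndex.
  const flags = pattern.flags.includes("g") ? pattern.flags : pattern.flags + "g";
  const re = new RegExp(pattern.source, flags);
  // unicodeSets (the v flag) is undefined on older engines, which is falsy.
  const codePointMode = re.unicode || Boolean(re.unicodeSets);

  let m;
  while ((m = re.exec(input)) !== null) {
    yield m;
    // Step by 2 code units when the match starts on an astral character
    // in Unicode mode, otherwise by 1.
    const cp = input.codePointAt(m.index);
    const step = codePointMode && cp !== undefined && cp > 0xffff ? 2 : 1;
    re.lastIndex = m.index + step;
  }
}

const texts = [...overlappingMatches(/\wa\w/, "caratraany")].map((m) => m[0]);
console.log(texts); // [ 'car', 'rat', 'raa', 'aan' ]
```

The match objects keep their usual shape (index, groups), so only the stepping logic differs from the original loop.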

The second one is fine, but it changes both the regex and the shape of the match, so it can only be called a workaround, not a proper solution.