proposing RegExp.prototype.count(text, start = 0, end = undefined)

reviving an old es-discuss topic.

the problems RegExp.p.count() solves deal primarily with converting absolute-offsets to line-numbers:

  1. Performantly get line-numbers directly from raw v8-coverage-files:
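// number of newlines before the start of the coverage range
let startLine = /\n/.count(
    source,
    0,
    v8CoverageObj.startOffset
);
// add the newlines inside the range to get the line of its end
let endLine = startLine + /\n/.count(
    source,
    v8CoverageObj.startOffset,
    v8CoverageObj.endOffset
);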
  2. preserve line-numbering when linting embedded css/js in .html files:
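// matchAll() throws a TypeError unless the regexp has the g flag
Array.from(htmlSource.matchAll(/<script>([\S\s]*?)<\/script>/g)).forEach(function (matchObj) {

    // number of newlines preceding the <script> tag
    let lineOffset = /\n/.count(htmlSource, 0, matchObj.index);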

    let warnings = lint(matchObj[1]);
    warnings.forEach(function ({
        line,
        ...
    }) {
        console.error(`warning at line ${line + lineOffset} ...`);
    });
});
  3. counting number of subdirectories in pathname
let count = /\//.count(urlParsed.pathname);        // "/aa/bb/cc.html" gives 3
html += `href="${"../".repeat(count - 1)}index.html"`; // "../../index.html"
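
for reference, the semantics the three use-cases above rely on can be approximated in userland today. this is only a rough sketch, not the proposal's definition: the helper name regexpCount is made up for illustration, and it assumes count() means "number of non-overlapping matches lying entirely inside text.slice(start, end)".

```javascript
// Hypothetical stand-in for the proposed RegExp.prototype.count.
// Assumption: counts non-overlapping matches inside text.slice(start, end).
// Caveat: slicing changes what anchors (^, $) and lookbehinds can see.
function regexpCount(rgx, text, start = 0, end = undefined) {
    // Clone with the g flag so matchAll() is allowed and the
    // caller's regexp (and its lastIndex) is left untouched.
    let flags = rgx.flags.includes("g") ? rgx.flags : rgx.flags + "g";
    let rgxGlobal = new RegExp(rgx.source, flags);
    let count = 0;
    for (let _ of text.slice(start, end).matchAll(rgxGlobal)) {
        count += 1;
    }
    return count;
}

// usage, mirroring the examples above
let sourceText = "aa\nbb\ncc\ndd\n";
console.log(regexpCount(/\n/, sourceText, 0, 4));    // 1
console.log(regexpCount(/\//, "/aa/bb/cc.html"));    // 3
```

the slice() allocates a substring and matchAll() allocates a match object per hit, which is the overhead a native count() could presumably avoid.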

Perhaps https://github.com/tc39/proposal-regexp-match-indices addresses some of this?

not sure ... how would that proposal help convert offsets/indices to line-numbers?

let sourceCode = "aa\nbb\ncc\ndd\n"
let startOffset = 4;
let rgx = /\n/d;
...
let startLine = ??? // expecting 2

You could just re-iterate the string to count those if you need them - it's relatively straightforward, even when accounting for Windows newlines.
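
A minimal sketch of that manual iteration, assuming "\r\n" should count as a single line break; the function name countLineBreaks and its (text, start, end) signature are made up here for illustration:

```javascript
// Hypothetical helper: count line breaks in text between start (inclusive)
// and end (exclusive), treating "\r\n" as one break.
function countLineBreaks(text, start = 0, end = text.length) {
    end = Math.min(end, text.length);
    let count = 0;
    for (let ii = start; ii < end; ii += 1) {
        let code = text.charCodeAt(ii);
        if (code === 10) {
            // "\n", either on its own or as the tail of "\r\n"
            count += 1;
        } else if (code === 13 && text.charCodeAt(ii + 1) !== 10) {
            // lone "\r"; a "\r\n" pair is counted once, at its "\n"
            count += 1;
        }
    }
    return count;
}

console.log(countLineBreaks("aa\r\nbb\ncc\rdd"));         // 3
console.log(countLineBreaks("aa\nbb\ncc\ndd\n", 0, 4));   // 1
```

One consequence of counting a "\r\n" pair at its "\n": a range that ends between the two characters does not include that break.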

In practice, I only really ever want those in throwaway Node scripts and in DSL compilers, and in most of those I'm just iterating lines, code points, or raw UTF-8 bytes instead.

another real-world use-case i came across:

  • quickly summarize number of rows in csv-file going through a pass-thru system
    • intended to be stored as sqlite-blob with no parsing
let csv = 'aa,bb\ncc,dd\n'
let rowCount = (/\r\n|\r|\n/).count(csv) // RegExp.p.count proposal
...
sqlite_connection.execute(`
INSERT INTO table_raw_data
    SELECT
        $csv AS csv_blob,
        $rowCount AS csv_rows,
        LENGTH(CAST($csv AS BLOB)) AS bytes;
`, {
    csv,
    rowCount
})