proposing RegExp.prototype.count(text, start = 0, end = undefined)

kaizhu256 · October 27, 2021, 5:48pm

the problems RegExp.p.count() solves deal primarily with converting absolute-offsets to line-numbers:

Performantly get line-numbers directly from raw v8-coverage-files:

startLine = /\n/.count(
    source,
    0,
    v8CoverageObj.startOffset
)
endLine = startLine + /\n/.count(
    source,
    v8CoverageObj.startOffset,
    v8CoverageObj.endOffset
)

preserve line-numbering when linting embedded css/js in .html files:

Array.from(htmlSource.matchAll(/<script>([\S\s]*?)<\/script>/g)).forEach(function (matchObj) {

    let lineOffset = /\n/.count(htmlSource, 0, matchObj.index);

    let warnings = lint(matchObj[1]);
    warnings.forEach(function ({
        line,
        ...
    }) {
        console.error(`warning at line ${line + lineOffset} ...`);
    });
});

counting number of subdirectories in pathname

count = /\//.count(urlParsed.pathname);            // "/aa/bb/cc.html"
html += `href="${"../".repeat(count - 1)}index.html"`; // "../../index.html"
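For comparison, the behavior described above can be approximated in userland today. This is a minimal sketch, not the proposal's specification: `regexpCount` is a hypothetical helper name, and it assumes a match is counted only when it lies entirely within `[start, end)`. Because it slices the string first, anchors and lookbehinds see the slice rather than the full text, which a native implementation might handle differently.

```javascript
// Hypothetical userland stand-in for the proposed RegExp.prototype.count.
// Assumption: only matches lying entirely within [start, end) are counted.
function regexpCount(rgx, text, start = 0, end = undefined) {
    // Copy the range of interest; a native version could avoid this copy.
    const slice = text.slice(start, end);
    // matchAll throws without the global flag, so add it if missing.
    const flags = rgx.flags.includes("g") ? rgx.flags : rgx.flags + "g";
    const globalRgx = new RegExp(rgx.source, flags);
    let count = 0;
    for (const _match of slice.matchAll(globalRgx)) {
        count += 1;
    }
    return count;
}

// Usage mirroring the pathname example above.
const slashCount = regexpCount(/\//, "/aa/bb/cc.html");
const href = `${"../".repeat(slashCount - 1)}index.html`;
```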

ljharb · October 27, 2021, 6:28pm

Perhaps https://github.com/tc39/proposal-regexp-match-indices addresses some of this?

kaizhu256 · October 27, 2021, 8:24pm

not sure ... how would that proposal help convert offsets/indices to line-numbers?

let sourceCode = "aa\nbb\ncc\ndd\n"
let startOffset = 4;
let rgx = /\n/d;
...
let startLine = ??? // expecting 2
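One possible way to fill in the `...` with match indices, as a sketch only: walk the matches and stop at the first one that ends past `startOffset`. The variable `newlinesBefore` is introduced here for illustration. Note that this still iterates matches one at a time, and for a single-character pattern the `d` flag gives nothing beyond what `match.index` already provides.

```javascript
let sourceCode = "aa\nbb\ncc\ndd\n";
let startOffset = 4;
let rgx = /\n/dg; // "g" is required by matchAll; "d" exposes match.indices
let newlinesBefore = 0; // illustrative name, not part of any proposal
for (const match of sourceCode.matchAll(rgx)) {
    // match.indices[0] is the [start, end] pair of the whole match.
    if (match.indices[0][1] > startOffset) {
        break;
    }
    newlinesBefore += 1;
}
let startLine = newlinesBefore + 1; // 1-based line number
```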

claudiameadows · November 8, 2021, 5:31am

You could just re-iterate the string to count those if you need them - it's relatively straightforward, even when accounting for Windows newlines.
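A minimal sketch of that re-iteration, assuming `"\r\n"`, a lone `"\r"`, and a lone `"\n"` should each count as one line break (`countLineBreaks` is a hypothetical helper name):

```javascript
// Hypothetical helper: count line breaks in text[start, end) by scanning
// UTF-16 code units. "\r\n" counts once; lone "\r" and lone "\n" count once each.
function countLineBreaks(text, start = 0, end = text.length) {
    let count = 0;
    for (let ii = start; ii < end; ii += 1) {
        const code = text.charCodeAt(ii);
        if (code === 10) { // "\n"
            count += 1;
        } else if (code === 13) { // "\r"
            count += 1;
            // Skip the "\n" of a "\r\n" pair so it is not counted twice.
            if (ii + 1 < end && text.charCodeAt(ii + 1) === 10) {
                ii += 1;
            }
        }
    }
    return count;
}
```

With this helper, a 1-based line number for an offset would be `countLineBreaks(source, 0, offset) + 1`.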

In practice, I only really ever want those in throwaway Node scripts and in DSL compilers, and in most of those I'm just iterating lines, code points, or raw UTF-8 bytes instead.

kaizhu256 · April 5, 2023, 11:14pm

another real-world use-case i came across:

quickly summarize number of rows in csv-file going through a pass-thru system
- intended to be stored as sqlite-blob with no parsing

let csv = 'aa,bb\ncc,dd\n'
let rowCount = (/\r\n|\r|\n/).count(csv) // RegExp.p.count proposal
...
sqlite_connection.execute(`
INSERT INTO table_raw_data
    SELECT
        $csv AS csv_blob,
        $rowCount AS csv_rows,
        LENGTH(CAST($csv AS BLOB)) AS bytes;
`, {
    csv,
    rowCount
})
