Сancelable/async regexp

Regular expressions have exponential time worst case complexity and its usage are blocking operations. This provides the ability of ReDoS attack.

time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCX")'
// real    0.529s

time node -e '/A(B|C+)+D/.test("ACCCCCCCCCCCCCCCCCCCCCCCCCCCCCX")'
// real    3.809s

My suggestion is to make regexp asynchronous and cancellable. This will allow us to reliably protect ourselves from such attacks in production, because we always (almost) know how complex the string is expected to be for a given matcher, and the other cases are most likely an attack and do not need to check them.

1 Like

Possible usage like this:

...
/regex/.test("string", 100) // Operations more than 100ms returns false
...

or:

/regex/.test("string", 100) // Operations more than 100ms throws error
        .catch(() => false)

maybe even:

function check(str) {
    const re = new RegExp('regex');

    return new Promise(function (resolve, reject) {
        re.onDone(resolve);
        re.onCancel(reject);

        setTimeout(function () {
            re.cancel();
        }, 100);

        re.test(str)
    });
}

I agree that with thier current semantics, RegExps are not safe to use in production. I proposed an alternative recently that got support from @jridgewell, but didn't get any traction with implementers (the Chrome and Safari team).

They are instead working on preventing backtracking when possible.

I don't think it is the safest way forward though. Either what you propose or possessive RegExps give the ultimate power to users and both options can be tested exhaustively.

OTOH, engine optimizations can fail or regress in corner cases that escape testing.

I just want to have possibility to avoid event loop blocking, besides it doesn't contradicts to yours proposal :)

Indeed, I wanted to give you what seems like relevant information on the topic

@nskulikov, this is actually already possible in javascript in both browsers and node.js.

In the browser, we have web workers. And in node.js, we have the Worker class from the node:worker_threads package. I had to implement a worker thread for running a regex test server-side for a project where I knew the regex was susceptibly sourced from user input. I just used a worker, and passed the regex and test string. I then even terminated it after 5 seconds if it didn’t complete by then (probably longer than needed, but I can always tweak it).

1 Like