Idea: compare a string against a set of regular expressions

Currently, there are a number of things that require looking up values based on their input:

  • Glob matching: Entire packages exist specifically to accelerate this matching, and so it's pretty important. (.gitignore lists would find this especially handy.)
  • Server-side routing: Express is used extremely broadly, and one of its two main features is routing. Of course, I could also go into other frameworks like Koa, but the concern's pretty universal across all of them.
  • Client-side routing: basically every client-side router. Dynamic routers like React Router in particular would see considerable simplification to their internals.

This is functionality that, if designed correctly, an engine could optimize far beyond what userland code can. Regular expressions can be represented as finite state machines (and V8 uses that as a fallback), so merging them is, in theory, as simple as merging DFAs. This offers a tremendous runtime performance opportunity that is just not possible when each regular expression is matched separately. But even without this hypothetical performance boost, it'd still be valuable simply as a shared common abstraction.
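To make the "merging DFAs" idea a bit more concrete, here's a rough TypeScript sketch. The `Dfa` shape and the `makeDfa` and `matchingIndices` helpers are made up for illustration; they aren't part of any engine or proposal. Instead of precomputing a merged automaton, it walks all the DFAs in lockstep over a single pass of the input. The tuple of current states is exactly one state of the product automaton, which is what an engine could build ahead of time.

```typescript
// Hypothetical DFA shape for illustration; a missing transition means "reject".
interface Dfa {
  start: number;
  accepting: Set<number>;
  transitions: Map<number, Map<string, number>>;
}

// Build a Dfa from a list of [from, character, to] edges.
function makeDfa(
  start: number,
  accepting: number[],
  edges: [number, string, number][]
): Dfa {
  const transitions = new Map<number, Map<string, number>>();
  for (const [from, ch, to] of edges) {
    let row = transitions.get(from);
    if (row === undefined) {
      row = new Map<string, number>();
      transitions.set(from, row);
    }
    row.set(ch, to);
  }
  return { start, accepting: new Set(accepting), transitions };
}

// Advance every DFA together, reading the input once (per UTF-16 code unit).
// Returns the indices of the DFAs that accept the whole input.
function matchingIndices(dfas: Dfa[], input: string): number[] {
  // null marks a DFA that has already fallen into its dead state.
  let states: (number | null)[] = dfas.map((d) => d.start);
  for (let k = 0; k < input.length; k++) {
    const ch = input.charAt(k);
    const previous = states;
    states = dfas.map((d, i) => {
      const s = previous[i];
      if (s === null || s === undefined) return null;
      return d.transitions.get(s)?.get(ch) ?? null;
    });
  }
  const matched: number[] = [];
  dfas.forEach((d, i) => {
    const s = states[i];
    if (s !== null && s !== undefined && d.accepting.has(s)) matched.push(i);
  });
  return matched;
}

// Hand-built DFAs equivalent to /^ab$/ and /^ab*$/.
const literalAb = makeDfa(0, [2], [[0, "a", 1], [1, "b", 2]]);
const aThenBs = makeDfa(0, [1], [[0, "a", 1], [1, "b", 1]]);

console.log(matchingIndices([literalAb, aThenBs], "ab"));  // [0, 1]
console.log(matchingIndices([literalAb, aThenBs], "abb")); // [1]
```

The per-character work here still grows with the number of patterns. A precomputed product automaton would replace the inner `map` with a single table lookup, at the cost of a potentially much larger state table.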

Rust's regex crate offers a RegexSet that can act as a potential primitive for this. What it does is essentially this: it checks several regular expressions against the input in a single pass and then reports which of them matched. It does not report match offsets or capture group information, and the latter would be needed for use in routing.
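For reference, here's roughly what a userland version looks like today, sketched in TypeScript. `NaiveRegExpSet`, `matches`, and `firstMatch` are hypothetical names of my own, not an existing or proposed API. `matches` mirrors the RegexSet behavior of reporting the indices of matching patterns, and `firstMatch` adds the capture groups that routing would need. It just tries each pattern in turn, which is precisely the linear scan an engine-level primitive could avoid.

```typescript
// Hypothetical userland stand-in for a regex set; not a real or proposed API.
class NaiveRegExpSet {
  private readonly patterns: RegExp[];

  constructor(patterns: RegExp[]) {
    // Copy each pattern without the g/y flags so test()/exec() stay stateless.
    this.patterns = patterns.map(
      (p) => new RegExp(p.source, p.flags.replace(/[gy]/g, ""))
    );
  }

  // Indices of every pattern that matches the input.
  matches(input: string): number[] {
    const hits: number[] = [];
    this.patterns.forEach((p, i) => {
      if (p.test(input)) hits.push(i);
    });
    return hits;
  }

  // First matching pattern, in insertion order, plus its capture groups.
  firstMatch(input: string): { index: number; groups: string[] } | null {
    let index = 0;
    for (const p of this.patterns) {
      const m = p.exec(input);
      if (m) return { index, groups: m.slice(1) };
      index++;
    }
    return null;
  }
}

const routes = new NaiveRegExpSet([
  /^\/users\/(\d+)$/,
  /^\/users\/.*$/,
  /^\/posts\/(\w+)$/,
]);

console.log(routes.matches("/users/42"));       // [0, 1]
console.log(routes.firstMatch("/posts/hello")); // { index: 2, groups: ["hello"] }
```

Every lookup costs one regex execution per pattern in the worst case, so the cost grows with the size of the route table.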

I'm stopping short of proposing a solution here, but wanted to get some discussion rolling around the concern.
