Faster String Comparisons

Hello, I noticed that str1.toLowerCase() === str2.toLowerCase() is the common practice for comparing strings case-insensitively. I wanted to know whether it would be possible to add a function like

String.compare(str1, str2, IgnoreLowerCaseOption);

In theory, this should run faster than toLowerCase() === toLowerCase().

I already tried String.prototype.localeCompare, and it is slower than toLowerCase() === toLowerCase().

I asked for this to be implemented in V8, but they told me it first needs to be approved as part of the ECMAScript specification, which is why I ended up here.

Thanks

I like the idea, but would prefer the case-ignoring comparison to use a different function.

  1. It’s easier to use with .sort().
  2. It’s not immediately clear what a boolean even means in that context when you’re reading it.

Also, the existing precedent with locale comparisons prefers different functions (and also puts them on the prototype).
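
To make the .sort() point concrete, here is a small sketch. Both compareIgnoreCase and compare are hypothetical names made up for this illustration, not existing or proposed APIs, and the lowercasing inside is only a stand-in for whatever an engine would actually do.

```javascript
// Hypothetical two-argument comparator; not an existing or proposed API.
// Returns a negative number, zero, or a positive number, as .sort() expects.
function compareIgnoreCase(a, b) {
  const x = a.toLowerCase();
  const y = b.toLowerCase();
  return x < y ? -1 : x > y ? 1 : 0;
}

// Hypothetical flag-based variant, shown only for contrast.
function compare(a, b, ignoreCase) {
  return ignoreCase ? compareIgnoreCase(a, b) : (a < b ? -1 : a > b ? 1 : 0);
}

const names = ['bob', 'Alice', 'alice', 'Bob'];

// The dedicated function can be passed to .sort() directly.
const sortedDirect = [...names].sort(compareIgnoreCase);

// The flag-based form needs a wrapper, and `true` says little at the call site.
const sortedWrapped = [...names].sort((a, b) => compare(a, b, true));

console.log(sortedDirect, sortedWrapped);
```

Passing compare itself to .sort() would silently leave the flag undefined, because .sort() only ever supplies two arguments.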

Thanks for the feedback. I am not too attached to the interface of the function. What I would like is to avoid creating new lowercase strings just to compare two strings.

Also, I was thinking IgnoreLowerCaseOption could be a more general options object with a field for ignoring case sensitivity, since other options might be added in the future.

So perhaps something like
String.prototype.compare(str1, str2, { caseSensitivity: false });
or
str1.compare(str2, { caseSensitivity: false });

Could you give some possible examples of this that wouldn't be better suited for localeCompare and localeCompareIgnoreCase (which already accept options objects, by the way)?

I was suggesting an object to future-proof it, but as I mentioned, what I care about is just optimizing the string comparison. So something like String.prototype.compareIgnoreCase(str1, str2) is fine by me.
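
For illustration only, this is roughly what a comparison without intermediate lowercase strings could look like in userland, assuming only ASCII letters need case folding. The name equalsIgnoreCaseAscii is hypothetical, and the sketch deliberately ignores all non-ASCII case mappings, so it is not equivalent to toLowerCase() in general.

```javascript
// Hypothetical helper: treats ASCII A-Z and a-z as equal.
// All other code units must match exactly. No new strings are created.
function equalsIgnoreCaseAscii(a, b) {
  if (a.length !== b.length) return false;
  for (let i = 0; i < a.length; i++) {
    let x = a.charCodeAt(i);
    let y = b.charCodeAt(i);
    if (x === y) continue;
    // Fold ASCII uppercase (0x41-0x5A) to lowercase by setting bit 0x20.
    if (x >= 0x41 && x <= 0x5a) x |= 0x20;
    if (y >= 0x41 && y <= 0x5a) y |= 0x20;
    if (x !== y) return false;
  }
  return true;
}

console.log(equalsIgnoreCaseAscii('Hello', 'hELLO'));
```

Whether a loop like this actually beats toLowerCase() === toLowerCase() in a given engine is something that would have to be measured; a native implementation could presumably do the same thing without the per-character call overhead.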

As mentioned, this is not really the same thing as localeCompare, because localeCompare goes through the Intl.Collator API, and that makes it a lot slower, even if you do something like this:

const collator = new Intl.Collator(undefined, { sensitivity: 'accent' });
collator.compare(str1, str2) === 0;

Even without counting the creation of the collator (the first statement), collator.compare(str1, str2) is still slower than toLowerCase() === toLowerCase(), which itself is not optimal because it creates two new lowercase strings before doing the comparison.
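
A rough way to check this claim on your own engine is a micro-benchmark along these lines. The helper names (viaLowerCase, viaCollator, timeIt), the sample strings, and the iteration count are all arbitrary choices for this sketch, and results will vary a lot by engine, version, and input.

```javascript
// Rough micro-benchmark sketch; treat the numbers as indicative only.
const collator = new Intl.Collator(undefined, { sensitivity: 'accent' });

function viaLowerCase(a, b) {
  return a.toLowerCase() === b.toLowerCase();
}

function viaCollator(a, b) {
  return collator.compare(a, b) === 0;
}

// Runs fn(a, b) repeatedly; returns elapsed milliseconds and the match count.
function timeIt(fn, a, b, iterations) {
  let matches = 0;
  const start = Date.now();
  for (let i = 0; i < iterations; i++) {
    if (fn(a, b)) matches++;
  }
  return { ms: Date.now() - start, matches };
}

const str1 = 'Hello World';
const str2 = 'hello world';
const lower = timeIt(viaLowerCase, str1, str2, 200000);
const coll = timeIt(viaCollator, str1, str2, 200000);

console.log('toLowerCase:', lower.ms, 'ms');
console.log('Intl.Collator:', coll.ms, 'ms');
```

The collator is created once outside the timed loop, matching the "not including the creation" condition above.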

I would recommend looking at how other languages do this as a reference. For example, C# has
string.Equals(str1, str2, StringComparison.OrdinalIgnoreCase)
which is faster than lowercasing both strings and comparing them.

We already have this:

"hello".localeCompare("HELLO", undefined, { sensitivity: 'accent' }) // 0

The fact that it’s slower than using toLowerCase just means engines haven’t put in the work to make it fast. Adding a different way of doing the same thing wouldn’t suddenly cause them to put in that work.

From my understanding, toLowerCase() === toLowerCase() is fundamentally different from "hello".localeCompare("HELLO", undefined, { sensitivity: 'accent' }).

From someone smarter than me:
String.prototype.localeCompare computes equivalence, which is much
more complex and complicated than the case conversion that
String.prototype.toLowerCase performs. For example, localeCompare
considers "s\u0307\u0323" and "\u1E69" equivalent, whereas toLowerCase
does not.
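
The difference described in that quote can be checked directly with the two strings it mentions. This is a minimal sketch; the variable names are made up here, and the localeCompare result assumes an engine with Intl (ICU) support.

```javascript
const decomposed = 's\u0307\u0323'; // s + combining dot above + combining dot below
const precomposed = '\u1E69';       // the same letter as a single code point

// Collation treats canonically equivalent strings as equal.
const equalByLocaleCompare = decomposed.localeCompare(precomposed) === 0;

// Case conversion does not normalize, so the code units still differ.
const equalByLowerCase = decomposed.toLowerCase() === precomposed.toLowerCase();

console.log(equalByLocaleCompare, equalByLowerCase);
```

The two strings have different lengths (3 code units versus 1), so no amount of case conversion alone can make them strictly equal.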

Right, which is why you’d want to use localeCompare and not toLowerCase. In any context where you have inputs for which this matters, localeCompare is going to be more correct than toLowerCase + ===. In a context where all inputs are ASCII, engines could make localeCompare just as fast or faster if they wanted to put in the effort.

It will never be quite as fast, because there is a cost to checking whether the input is ASCII or not. But I get what you mean. If you do go with that, though, it creates an inconsistency with toLowerCase(). Should toLowerCase() also handle cases like this?

Not automatically. That’s what .normalize() is for.
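
As a sketch of what that looks like in practice, normalization can be applied explicitly before the case conversion. The helper name equalsIgnoreCaseNormalized is hypothetical, and NFC is just one reasonable choice of normalization form here.

```javascript
// Hypothetical helper: normalize to NFC first, then compare lowercased forms.
function equalsIgnoreCaseNormalized(a, b) {
  return a.normalize('NFC').toLowerCase() === b.normalize('NFC').toLowerCase();
}

const upperDecomposed = 'S\u0307\u0323'; // S + combining dot above + combining dot below
const lowerPrecomposed = '\u1E69';       // lowercase precomposed form

console.log(upperDecomposed.toLowerCase() === lowerPrecomposed.toLowerCase());
console.log(equalsIgnoreCaseNormalized(upperDecomposed, lowerPrecomposed));
```

This allocates even more intermediate strings than plain toLowerCase(), so it trades speed for correctness on non-ASCII input.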