Should String.prototype.trim remove zero-width space? (\u200b)

REF: Trim not removing zero-width spaces · Issue #182 · jprichardson/string.js · GitHub

The issue explains why the zero-width space isn't among the whitespace characters that trim removes.

So is there any discussion around the issue?

/\s/.test('\u200b'); // => false
'\u200b'.trim().length; // => 1

As mentioned in Whitespace character - Wikipedia, I tested the characters listed there under "Related Unicode characters (property White_Space=no)" with the following snippet:

[
    '\u180e', '\u200b', '\u200c', '\u200d', '\u2060', '\ufeff'
].filter(ch => ch.trim().length);
// => ['\u180e', '\u200b', '\u200c', '\u200d', '\u2060']

I found that '\ufeff' is a special case: trim removes it even though it is on the White_Space=no list.
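To make the difference concrete, here is a small side-by-side check. It is only a sketch, and it assumes an engine that follows the ECMAScript WhiteSpace definition, which lists U+FEFF explicitly; the variable names are just for illustration.

```javascript
// U+FEFF appears in ECMAScript's WhiteSpace production, so \s and trim() both
// treat it as whitespace. U+200B does not appear there, so neither does.
const bom = '\ufeff';
const zwsp = '\u200b';

const results = {
  bomMatchesWhitespace: /\s/.test(bom),   // => true
  bomTrimmedLength: bom.trim().length,    // => 0
  zwspMatchesWhitespace: /\s/.test(zwsp), // => false
  zwspTrimmedLength: zwsp.trim().length,  // => 1
};

console.log(results);
```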

I'm not sure what discussion there would be? trim removes whitespace on the ends. /\s/.test() returns true for things that are considered whitespace.

You're welcome to try to convince Unicode to add the zero-width space to the whitespace category, but it wouldn't be appropriate for JS to violate this preexisting axiom.
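If the goal is only to strip these characters in application code, that can be done in userland without changing trim. A minimal sketch follows. The helper name trimInvisible is hypothetical, and the extra characters are simply the ones from the list tested earlier in this thread that trim left alone, not any official set.

```javascript
// Hypothetical helper: trims ordinary whitespace plus a few zero-width
// characters from both ends of a string. The extra code points are an
// assumption taken from the list tested earlier in this thread.
const EDGE_INVISIBLES =
  /^[\s\u180e\u200b\u200c\u200d\u2060]+|[\s\u180e\u200b\u200c\u200d\u2060]+$/g;

function trimInvisible(str) {
  // Only leading and trailing runs are removed; interior characters are kept.
  return str.replace(EDGE_INVISIBLES, '');
}

console.log(trimInvisible('\u200b hi \u200b').length);
console.log(trimInvisible('a\u200bb').length);
```

Interior zero-width characters are left alone on purpose, since U+200C and U+200D affect how some scripts and emoji sequences are rendered.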


So which Unicode definition of whitespace does JavaScript follow? It seems the rule has already been broken once by the special case \uFEFF, according to Unicode Utilities: UnicodeSet.

Oh, I see. \uFEFF is treated as the Byte Order Mark (a.k.a. BOM).