REF: Trim not removing zero-width spaces · Issue #182 · jprichardson/string.js · GitHub
The issue explains why a zero-width space isn't included in the whitespace characters that trim removes.
So is there any discussion around the issue?
/\s/.test('\u200b'); // => false
'\u200b'.trim().length; // => 1
As mentioned in Whitespace character - Wikipedia, I tested the characters listed under "Related Unicode characters (property White_Space=no)" with the following snippet:
[
'\u180e', '\u200b', '\u200c', '\u200d', '\u2060', '\ufeff'
].filter(ch => ch.trim().length);
// => ['\u180e', '\u200b', '\u200c', '\u200d', '\u2060']
I found that '\ufeff' is a special case.
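To make the special case easier to see, here is a small sketch that reports, for each candidate, whether \s matches it and whether trim removes it. The names candidates and report are illustrative only, and the result for '\u180e' may depend on which Unicode version the engine ships with.

```javascript
// Characters from the Wikipedia list that have White_Space=no in Unicode.
const candidates = ['\u180e', '\u200b', '\u200c', '\u200d', '\u2060', '\ufeff'];

// For each character, record how the engine classifies it.
const report = candidates.map(ch => ({
  codePoint: 'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'),
  matchesWhitespaceClass: /\s/.test(ch),  // does \s match it?
  removedByTrim: ch.trim().length === 0,  // does trim() strip it?
}));

console.table(report);
```

On a recent engine, the '\ufeff' row should be the one where both columns are true.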
I'm not sure what discussion there would be. trim removes whitespace on the ends. /\s/.test() returns true for things that are considered whitespace.
You're welcome to try to convince Unicode to add the zero-width space to the whitespace category, but it wouldn't be appropriate for JS to violate this preexisting axiom.
So I want to know which Unicode specification JavaScript follows here. It seems the rule has already been broken once for the special case \uFEFF, according to Unicode Utilities: UnicodeSet.
Oh, I see. \uFEFF is treated as the Byte Order Mark (a.k.a. BOM).
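For anyone who still needs the zero-width characters stripped, one workaround is to trim them explicitly instead of relying on trim. The sketch below assumes the six characters tested earlier in this thread are the ones to remove. trimInvisible is a hypothetical helper, not part of string.js or the language.

```javascript
// Hypothetical helper: strips ordinary whitespace plus the zero-width
// characters discussed in this thread from both ends of a string.
// The character list is an assumption; adjust it to your own needs.
const INVISIBLE_EDGES =
  /^[\s\u180e\u200b\u200c\u200d\u2060\ufeff]+|[\s\u180e\u200b\u200c\u200d\u2060\ufeff]+$/g;

function trimInvisible(str) {
  // Only leading and trailing runs are removed; interior characters stay.
  return str.replace(INVISIBLE_EDGES, '');
}

console.log(trimInvisible('\u200b hello\u200b ').length);
console.log(trimInvisible('a\u200bb').length);
```

Interior zero-width characters are left alone on purpose, since U+200C and U+200D affect how some scripts and emoji sequences render.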