Data manipulation augmentation: string suffix/prefix trimming

JS, one of the most dominant programming languages on the planet, has no built-in functionality for trimming a prefix or suffix from a string. People have to write their own implementations, which are often complicated, slow, and bug-prone, ranging from dynamic regexes to messy string manipulation, just to do a basic trim operation. It is surprising that no one has tried to add this functionality to the language before; maybe someone did, but no one bothered to take it through the four-stage proposal process to request this basic feature.

Unless there is already a highly optimized, clean, and widely accepted approach with minimal complexity, JavaScript is crucially lacking when it comes to string manipulation (trimming, to be specific). That is surprising rather than disappointing, considering that it dominates a web that is full of textual data.

Please consider this situation:
let Data = "000Data000"

The first thing that comes to mind, after finding that the language provides a trim() function, is to use it like this:

let Data = "000Data000"

let trimmedData = Data.trim("0");
// expected result: "Data000"
// actual result: "000Data000" (trim() ignores its argument)

Only to find out that trim() removes whitespace, and only whitespace. The same is true for trimStart() and trimEnd().

This is a commonly encountered scenario: we need to remove a specific sequence of symbols from the beginning (prefix) or end (suffix) of a string.

Of course, we can use the capabilities of the language itself to resolve this and get our much-desired "Data000".


let Data = "000Data000";

let regex = new RegExp(`^0+`);
let trimmedData = Data.replace(regex, '');

console.log(trimmedData);
// "Data000"

and a more dynamic version:

let Data = "000ddd000";

let prefixToTrim = "0";

// Dynamically create the regular expression

let regex = new RegExp(`^${prefixToTrim}+`);

let trimmedData = Data.replace(regex, '');

console.log(trimmedData);

// "ddd000"
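One caveat with the dynamic version: it breaks as soon as the prefix contains a regex metacharacter (with "." the pattern becomes `^.+` and swallows the whole string) or is longer than one character (the `+` then applies only to the last character). A minimal sketch of a safer variant follows; `escapeRegExp` and `trimLeadingRepeats` are hypothetical helper names, not built-ins.

```javascript
// Hypothetical helper: escape characters that have special meaning in a regex.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: remove every leading repetition of `prefix` from `input`.
function trimLeadingRepeats(input, prefix) {
  if (prefix === "") return input; // nothing to trim
  // The non-capturing group makes `+` apply to the whole prefix.
  const regex = new RegExp(`^(?:${escapeRegExp(prefix)})+`);
  return input.replace(regex, "");
}

console.log(trimLeadingRepeats("000ddd000", "0")); // "ddd000"
console.log(trimLeadingRepeats("..ddd..", "."));   // "ddd.."
console.log(trimLeadingRepeats("ababddd", "ab"));  // "ddd"
```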

Still, we can't ignore the lack of a proper and elegant way of doing this.

There are the trimStart and trimEnd methods.

What do you mean by "suffix/prefix" in particular? An arbitrary string? A pattern, such as a repeating character?

Thanks for pointing out the ambiguity in my post. I've added more details and examples.
Regarding your comment: yes, I searched all over the internet for a standard way of trimming, and the trimming functions were the first results I checked. But as the JS documentation describes, trim(), trimStart(), and trimEnd() only remove whitespace: they return a new string without the leading and/or trailing whitespace.

Really? Data.replace(/^0*/, "") is not complicated, slow or messy.

For me regex replace is clearer than your Data.trim("0"). I don't understand why that would only remove zeros from the start.

What would Data.trim("01") do? Remove zeros and ones, or sequences of "01" ? With a regex it's obvious: Data.replace(/^[01]*/, "") or Data.replace(/^(01)*/, "").
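To make the difference between those two readings concrete, here is a small sketch; the sample string is made up for illustration.

```javascript
const sample = "0110Data";

// Character class: strips any leading run of "0" and "1" characters.
const byCharacterClass = sample.replace(/^[01]*/, "");

// Group: strips only whole leading repetitions of the sequence "01".
const bySequence = sample.replace(/^(01)*/, "");

console.log(byCharacterClass); // "Data"
console.log(bySequence);       // "10Data"
```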


I understand that the language provides the means to do a trim and that there are various ways to do it. One is regex, which is effective but harms code readability to some extent.
However, I doubt you believe there is no need for a better, standard way. Should everyone really use their own creative implementation of the same thing (in our case, trimming)?

If I were to accept your point of view, I would wonder why we needed the trim function in the first place. It literally only removes whitespace from a string; wouldn't it be better to use your approach to remove whitespace too?
Either way, it really eats me alive: why did the developers who implemented the trimming functions not think of extending them to remove other characters too?

--EDIT--
To answer your question: yes, a trim function by definition should remove all leading and/or trailing Unicode code points that are contained in a given cutset (a string).

Many programming languages already implement this trimming functionality; the ones I have worked with myself are Golang and Python.

--EDIT--

I agree that using Data.replace(/^0*/, "") isn't much different from using Data.trim("0"), but you're ignoring the fact that this whole scenario is a very simplified version of the problems out there in the real world. Ask yourself: which syntax would I prefer if I faced a complicated version of this problem?

I would post my specific problem on Stack Overflow, but this post has another purpose: discussing possible improvements to JS.

I do. I don't remember ever using Python's str.strip for anything but whitespace, even though it accepts an optional set of characters to remove.

Maybe because removing whitespace is way more common than, say, removing leading zeros. Or maybe because removing from both the start and end with a regex is not trivial. I don't know, it's been there since ES5.
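For reference, one common way to strip a character from both ends with a single regex needs an alternation plus the global flag, which is easy to get wrong. A minimal sketch, with `trimZeros` as a hypothetical name:

```javascript
// Hypothetical helper: strip "0" characters from both ends of `input`.
// Without the `g` flag only the first alternative that matches is replaced.
function trimZeros(input) {
  return input.replace(/^0+|0+$/g, "");
}

console.log(trimZeros("000Data000")); // "Data"
console.log(trimZeros("Da0ta"));      // "Da0ta"
```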


Indeed, removing whitespace from the beginning and end of a string is far more common than removing specific characters.
I hope we can at least agree on this: JS needs this functionality to be either fully supported or fully removed (and removal is not an option for JS, in my opinion).

One must either implement something or not implement it; an application should either support some functionality or not support it. There is no middle ground.
There is no "this feature only works to this extent, and beyond it you have to implement everything yourself", especially when a clear definition of the functionality already exists and many peer languages support it.

Implementing the trimming functions without fully supporting the functionality expected from them is wrong, considering that JS has been around for so long and has seen many refactors and improvements since ES6, and still no one has bothered to fix this issue in all this time...

I agree it's pretty weird that .trim() (and .trimStart()/etc variants) solely remove whitespace in JS, without the option to remove other characters. The Python equivalents default to removing whitespace, if called with no arguments, but can take an (unordered) sequence of characters to remove instead. (aka, the default is basically equivalent to s.strip(" \t\n")). I think it would be useful to pursue adding an optional argument to the functions that did this, matching Python behavior.
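As a rough illustration of that Python-like behavior, here is a minimal userland sketch. `trimChars` is a hypothetical name, not an existing or proposed API, and treating the argument as a set of code points is an assumption modeled on Python's str.strip.

```javascript
// Hypothetical helper, not a built-in: strip any characters found in `chars`
// from both ends of `input`. With no `chars`, fall back to String.prototype.trim.
function trimChars(input, chars) {
  if (chars === undefined) return input.trim();
  const set = new Set(chars);       // a string iterates by code point
  const points = Array.from(input); // split by code point, not UTF-16 unit
  let start = 0;
  let end = points.length;
  while (start < end && set.has(points[start])) start++;
  while (end > start && set.has(points[end - 1])) end--;
  return points.slice(start, end).join("");
}

console.log(trimChars("000Data000", "0")); // "Data"
console.log(trimChars("foobar", "fo"));    // "bar"
console.log(trimChars("  Data  "));        // "Data"
```

One-sided variants would follow the same pattern by running only one of the two loops.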

Note that this is different from removing a prefix/suffix from a string. For example, in Python, "foobar".lstrip("fo") yields "bar", not "obar", because it's removing the characters "f" and "o", not the prefix "fo". Doing that is easy, tho, with .slice() - "foobar".slice("fo".length) yields "obar" as expected, and you can do the same thing with a negative end index to remove a suffix.
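A minimal sketch of the .slice() approach, wrapped so that the string is only cut when it really starts or ends with the given text; `removePrefix` and `removeSuffix` are hypothetical names.

```javascript
// Hypothetical helper: drop `prefix` once if `input` starts with it.
function removePrefix(input, prefix) {
  return input.startsWith(prefix) ? input.slice(prefix.length) : input;
}

// Hypothetical helper: drop `suffix` once if `input` ends with it.
function removeSuffix(input, suffix) {
  // Guard the empty suffix: slice(0, -0) is slice(0, 0) and would return "".
  return suffix !== "" && input.endsWith(suffix)
    ? input.slice(0, -suffix.length)
    : input;
}

console.log(removePrefix("foobar", "fo")); // "obar"
console.log(removeSuffix("foobar", "ar")); // "foob"
console.log(removePrefix("foobar", "xy")); // "foobar"
```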

(One bit of confusion you keep running into is your insistence that your modified .trim() would only remove the characters from the start of the string, for some reason. .trim() removes whitespaces from both ends; if we added an argument, it should obviously remove the characters from both ends. .trimStart()/etc exist if you want to remove them from only one side.)


That's where I'm getting lost. I have no idea what a complicated version of this problem looks like -- part of it may be that I'm a regex-aficionado; but really, what's a complicated problem solvable with trim/strip?

That's putting the cart before the horse. The problem has to be identified and described first. And I'm pretty sure that in this particular case, which is fully implementable as a library, such a library will have to exist and prove itself worthy before anything goes into the standard. To be perfectly clear, I have no say in that, just guessing ;).