`Set.prototype.tryAdd`

I find myself doing a lot of code like this:

if (!foo.has(value)) {
  foo.add(value)
  // do interesting thing
}

An internal userscript at work I've been writing has several of these for deduplicating actions, for one example.

It's a common enough idiom that Webkit even added optimizations for patterns like it (search for map.has). You could also see it as not unlike the Set analogue to GitHub - tc39/proposal-upsert: ECMAScript Proposal, specs, and reference implementation for Map.prototype.upsert.

My concrete suggestion here is that this method be added:

Set.prototype.tryAdd = function (item) {
  if (this.has(item)) return false
  this.add(item)
  return true
}

WeakSet.prototype.tryAdd = function (item) {
  if (this.has(item)) return false
  this.add(item)
  return true
}
1 Like

I agree Set.prototype is an exquisitely bad API. I always shed a tear when I read

Returns true if value was already in mySet; otherwise false.

and then realize I'm looking at delete, not add. Who thought it'd be a great idea to make add ignore all arguments but the first, and return the set to allow chaining. It's ridiculous.

However I'm not a fan of adding another function with such a narrow use case.

This is how add should've been:

Set.prototype.addMany = function(...args) {
  return this.addFrom(args);
}
Set.prototype.addFrom = function(arrayLike) {
  const oldSize = this.size;
  Array.from(arrayLike).forEach(x => this.add(x));
  return this.size - oldSize;
}

addMany with a single argument would be substitute for tryAdd.

The one advantage "tryAdd" has over "addMany", is "tryAdd" reads much better when put in an if().

if (mySet.tryAdd(element)) {
  // When do you enter this if? If tryAdd succeeds.
  // It's common for try* functions to return a success boolean
  // ...
}

if (mySet.addMany(element)) {
  // When do we enter the if? dunno - check addMany()'s documentation to see what it returns.
  // Maybe it returns the number of successful elements that were added.
  // or maybe it returns the new length of mySet
  // or something else.
  // ...
}

On a related note.

It's generally considered good practice to name arguments when their intention isn't clear (i.e. instead of designing a function that's called like this: createUser('sally', true), have it be called like this: createUser('sally', { withTemporaryPassword: true }). I wish API authors and language designers would extend this principle to return values. i.e. instead of designing a function like this if (mySet.delete('x')), we instead designed it like this: if (mySet.delete(x).suceeded). No shame in returning objects with single properties, if it makes code more readable. This fictional addMany() could return an object with a "successCount" property, and that would put this fictional API on par with tryAdd() in my eyes.

1 Like

Let's say you come across this piece of code:

if (mySet.add(element)) { ... }

Do you go check the documentation for add, or do you know whoever wrote that didn't read the doc?

I guess it depends on why I'm reading that piece of code. If I'm just skimming through trying to understand what the code does, I would probably just assume that mySet.add(element) would return true if successful, based on knowledge I have on how similar APIs get designed. I didn't realized it returned this until this conversation started.

If I'm porting that Javascript code to another language, or if there's a bug around that area, then maybe I'll be a little more careful and double check the docs.

Updated snippet because the intent was to do it for both standard and weak sets.

Yeah, that's what I was trying to get at. Unless the code is suspected broken, you go with an interpretation that makes sense, even if you're not sure about the details. You would assume it returned true/false based on prior knowledge and gut feeling — because it wouldn't make much sense if it returned the new size, or the object itself, both of which you can access easily, right? ;) And if you're writing that code, you'll learn it with use.

Of course, addMany would cause friction with the existing add. But same goes for tryAdd. When there's add and tryAdd in an API, one might expect add to throw, instead of silently ignoring duplicates.

Yeah, I can understand that base point. However...

I only understood that based on context from previous APIs that have been implemented in similar ways. If I created a new language, and showed you a code snippet from it that looked like this "for (i = 0; i < users.getLength(); i += 1)" you would understand what that means in a heartbeet, because we've all had to learn how these C-style for loops work. But, that doesn't mean it's good design. If you stop to think about it, the syntax for that for loop is actually quite odd - thank goodness many newer languages are simply not including a for-loop that looks like that.

Similarly, many previous APIs like to simply return booleans or numbers without context, but that doesn't mean it was a great design decision. I could only understand what if (mySet.add(element)) would do based on context I've had with past experience with other array/set/map API's I've worked with. Newer programmers don't have this kind of context, so knowing what that "if" will do is tricker for them.

What's more, something like if(set.multiAdd(x, y)) would be a lot more ambiguous to me. I would expect multiAdd() to return a boolean, based on the context I see it in, and based on other APIs I've worked with, I would expect it to be a boolean indicating success status. So, what if "x" was successful and "y" was not? Would it be true or false? I wouldn't know, until I look at the docs and saw that it in fact returns a number. Something like if(set.multiAdd(x, y).successCount) would have been much clearer to me.

Edit: With all of this said, I was mostly talking about a general principle of self-documenting what return values are. I think this is more important for some APIs than others - The general idea of if(set.multiAdd(x, y)) is still somewhat understandable, even if the specifics are not. Something like return createUser('sally') would be much worse if that return value was a temporary password that was assigned to the new user. No one would guess that in a million years - It would be much better if a little more context were provided, like this return createUser('sally').newTemporaryPassword.

This is a very good point.

tryAdd is just a name - I'm more focused on functionality, and would like to focus on gaining interest for that first before bikeshedding the name.

Seconded! Moreover, it seems to me that this proposal GitHub - tc39/proposal-upsert: ECMAScript Proposal, specs, and reference implementation for Map.pr a little overwhelming. It would be enough to have a symmetrically simple emplace in Set and Map:

if (map.emplace(key, value)) {
  ...
}

instead:

if (!map.has(key, value)) {
  map.set(key, value)
  ...
}

and

if (set.emplace(value)) {
  ...
}

instead:

if (!set.has(value)) {
  set.add(value)
  ...
}

The map emplace bit is necessarily more complicated as it's inserting not just a key, but also potentially updating a value. Sets only have keys, so they don't need any of the complexity related to values.

Arguably, Sets only have values :-p (see Set.prototype.keys.name) but both mental models can hold.

1 Like

I'm aware. :stuck_out_tongue_winking_eye:

Well, at least it could have been easier. For example:
inserting:

if (map.emplace(key, value)) {
  ...
}

updating:

map.emplace(key, undefined, oldValue => {
  ...
});

map.emplace(key, undefined); // without 3rd param equivalent to map.set(key, undefined)

inserting + updating:

if (map.emplace(key, value, oldValue => {
  ...
})) {
  ...
}

In this case we have a more simpler upsert proposal and at the same time a fully synchronised "emplace" method with Set.

That isn't useful when the initial value is, say, pulled from a different cache somewhere, though at that point I'm not sure how much perf is to be gained by lazily computing it.

Also, most uses of map.emplace I would strongly prefer it to return the value itself, not just existence - its utility is greatest when used as a cache primitive. I'll go ahead and file an issue about that against that proposal as I have my own opinions in that area.

1 Like

Okay, was about to file an issue citing simplicity and some mild performance advantage, but instead I have an update: one of the point of emplace is to also support the equivalent of Java's map.computeIfPresent, which is to do nothing if the entry doesn't exist, and to update the value if it does. And as undefined is a valid value (not sure why Java failed to account for this), I don't see your suggestion getting far.

Edit: Also this FAQ question also explains why it's not split like you would otherwise think would be ideal based on intuition.

On reflection, I agree that using this option may not be the best idea. On the other hand we need to think about how often do we need insertion and updating at the same time? This seems to me to be a very rare case. In that case it might make sense to separate these two operations into "Map.p.emplace" and "Map.p.update":
insert:

if (map.emplace(key, value)) { // doesn't support 3rd arg anymore
  ...
}

update with optionally check for existing:

map.update(key, oldValue => {
  if (typeof oldValue == "undefined") { // or !map.has(key)
     return "insert new value";
  }
  return "update value";
})

wdyt?

That's covered indirectly with examples with these two FAQ questions in the proposal:

Still, those examples describe slightly different alternatives. For example this:

let updated = counts.emplace(key, (existing) => existing + 1);
if (!updated) {
  counts.set(key, 0);
}
let n = counts.get(key);
if (n > RETRIES) {
  // ...
}

one could imagine what it would be like:

const n = counts.update(key, existing => {
  if (typeof existing == 'undefined') return 0;
  return existing + 1;
});
if (n > RETRIES) {
  // ...
}

I suggested similar, but ultimately closed it as it doesn't fit with either the standard library idioms (why I avoided undefined) or the general Map implementation model.