Strong brand checking in JavaScript

lightmare · June 30, 2021, 9:43am

But the way you suggested above with [[Brand]] and [[Kind]] is a one-way street; if the brand doesn't match, then you have no idea, it could be a false negative. I'd say that's different from how the term "brand check" is used in e.g. @bakkot's example above with Map.prototype.get.call, or private fields proposals, or the spec for that matter (IsPromise); which all allow for subclass instances to pass the check.
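For readers following along, here is a minimal sketch of the kind of brand check referred to above, built on Map.prototype.has.call. The helper name isMap and the class MyMap are made up for illustration; only the builtin Map methods are real. It assumes Function.prototype.call itself has not been tampered with.

```javascript
// Cache the builtin method up front so later changes to Map.prototype
// cannot alter what this check does.
const mapHas = Map.prototype.has;

// Hypothetical helper: true only if `value` carries Map's internal slot.
function isMap(value) {
  try {
    // Map.prototype.has throws a TypeError unless its receiver is a real Map.
    mapHas.call(value, undefined);
    return true;
  } catch {
    return false;
  }
}

class MyMap extends Map {}

const mapResults = {
  plainMap: isMap(new Map()),                  // real Map
  subclass: isMap(new MyMap()),                // subclass instances keep the slot
  forged: isMap(Object.create(Map.prototype)), // right prototype, no slot
  otherBuiltin: isMap(new Set()),              // different internal slot
};
```

The subclass instance passes because super() ran the real Map constructor, while the object that merely inherits from Map.prototype does not.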

rdking · June 30, 2021, 1:16pm

Yeah, my suggestion didn't account for inheritance. But that's easy to fix as well. Instead of stamping [[Kind]] with just [[Brand]] from new.target, they can make [[Kind]] an array and push in the brand of every constructor that it passes through. I think this would be closer to what they want. Object.kind would then return the list of [[Brand]]s.

Using new would create a brand list, initially populated with the [[Brand]] of new.target. Each subsequent call (implicit or explicit) to super() or Reflect.construct(...) would add the [[Brand]] of the target constructor to that list. When the instance object is actually created at the bottom of the super() call chain, the instance object's [[Kind]] will be assigned to that list. Unless a class extends null, the [[Brand]] for Object will be the last entry added to the brand list at the time the instance is created.
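As a rough userland approximation of that idea (not the proposed semantics), the brand list can be simulated with a WeakMap. Everything here is hypothetical: Root, kindOf and hasKind are illustrative names, the WeakMap stands in for [[Kind]], and walking the constructor chain of new.target only approximates "every constructor the super() chain passes through". Object is not recorded because Root does not explicitly extend it.

```javascript
// Userland stand-in for the hypothetical [[Kind]] internal slot.
const kinds = new WeakMap();

// Hypothetical root class that records the brand list at creation time.
class Root {
  constructor() {
    const brands = [];
    // Walk from new.target up through each superclass constructor.
    for (
      let ctor = new.target;
      typeof ctor === 'function' && ctor !== Function.prototype;
      ctor = Object.getPrototypeOf(ctor)
    ) {
      brands.push(ctor);
    }
    kinds.set(this, Object.freeze(brands));
  }
}

// Rough analogue of the suggested Object.kind: returns the list of brands.
function kindOf(value) {
  return kinds.get(value) ?? [];
}

// Subclass-aware brand check against that list.
function hasKind(value, ctor) {
  return kindOf(value).includes(ctor);
}

class Animal extends Root {}
class Dog extends Animal {}

const dog = new Dog();
const forgedDog = Object.create(Dog.prototype); // never went through a constructor
```

Here kindOf(dog) is [Dog, Animal, Root], so hasKind(dog, Animal) holds, while forgedDog satisfies instanceof Dog but has an empty brand list.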

theScottyJam · June 30, 2021, 2:54pm

I think I'm still missing something here. So, ok, brand checking will let you know with certainty if an object was created from a particular constructor, and if it has the required private and internal fields. Something that instanceof does not do.

So, what's the use case? When would you actually call a function like Map.isMap or MyType.isMyType()? What scenarios do you actually need something more robust than instanceof? Would you call it within every method on your instance so you can throw a nice error if the this value isn't what you expected it to be? That sounds tedious, potentially overly defensive, already possible, and it only gives a use case for checking userland brands. When else?

And with that, why are we requiring end-users to do brand-checking to throw pretty errors if they receive objects that are constructed incorrectly, instead of trying to add roadblocks to prevent the object from being constructed incorrectly in the first place?

ljharb · June 30, 2021, 3:21pm

instanceof is fundamentally unreliable. It’s easily forged or broken (via Symbol.hasInstance) and for builtins, it doesn’t work cross-realm.
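A short sketch of the forging and breaking described here, using only standard builtins (the class name AlwaysYes is made up for the example). The cross-realm case is not shown, since it needs a second realm such as an iframe.

```javascript
// 1. An object can "pass" instanceof just by having the right prototype.
const fakeMap = Object.create(Map.prototype);
const fakePassesInstanceof = fakeMap instanceof Map;

let fakeUsableAsMap = true;
try {
  Map.prototype.get.call(fakeMap, 'key');
} catch {
  fakeUsableAsMap = false; // TypeError: the receiver is not a real Map
}

// 2. A constructor can decide the answer itself through Symbol.hasInstance.
class AlwaysYes {
  static [Symbol.hasInstance]() {
    return true;
  }
}
const numberIsAlwaysYes = 42 instanceof AlwaysYes;

// 3. A genuine instance stops matching once its prototype is swapped out.
const realMap = new Map([['key', 1]]);
Object.setPrototypeOf(realMap, null);
const realStillMatches = realMap instanceof Map;
const realStillWorks = Map.prototype.get.call(realMap, 'key') === 1;
```

In the first case instanceof says yes for something that is not a Map; in the third it says no for something that still is one.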

lightmare · June 30, 2021, 3:41pm

I was under the impression this was mainly concerning builtins. Hasn't this already been solved for userland?

Classes with private fields can brand check internally, and if it's needed outside, expose a static method. Perhaps for convenience it could be added to Object, so that instead of writing that static check in every class, you would do MyType.matches(obj) (i.e. Object.matches.call(MyType, obj)).

For classes without private fields, you can do a brand check with the help of WeakSet where you put each instance you construct.
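Both userland techniques can be sketched briefly. The class names Temperature, Reading and Queue and their static methods are hypothetical; the `#field in obj` syntax assumes an engine with ES2022 support.

```javascript
// Brand check backed by a private field.
class Temperature {
  #celsius;

  constructor(celsius) {
    this.#celsius = celsius;
  }

  static isTemperature(value) {
    // `#celsius in value` throws for non-objects, so guard first.
    return Object(value) === value && #celsius in value;
  }
}

class Reading extends Temperature {}

// Brand check backed by a WeakSet, for classes without private fields.
const queueInstances = new WeakSet();

class Queue {
  constructor() {
    // Record every object that actually went through the constructor.
    queueInstances.add(this);
  }

  static isQueue(value) {
    // WeakSet.prototype.has returns false for non-objects.
    return queueInstances.has(value);
  }
}
```

In both variants a subclass instance passes, and an object created with Object.create(Temperature.prototype) or Object.create(Queue.prototype) does not.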

What am I missing?

theScottyJam · June 30, 2021, 3:46pm

If cross-realm was the primary concern, then I can certainly understand that.

As for forging, I still don't see a strong argument there. I can, for example, replace Array.isArray() with any function I want, and still forge things. So, if "forging" is a primary concern, then Array.isArray() fails to be helpful when it comes to brand checking, as would any Something.isSomething() (unless the Something was frozen).

> Array.isArray = () => true
> Array.isArray(2)
true

ljharb · June 30, 2021, 3:55pm

Right - in other words, what's needed is a generic brand checking mechanism that works for builtins (cross-realm, in an unforgeable way) and that userland objects can participate in (so they don't need to make a custom private field or do weak collection shenanigans).

In the meantime, builtins require a bunch of code to brand check properly (see all the "is-" packages in https://npmjs.com/~inspect-js) and userland objects are forced to do exactly what you describe.

ljharb · June 30, 2021, 4:01pm

Like all JavaScript, first-run code that wants to be robust can, and must, cache (or lock down, a la SES) all the functions it needs - including Array.isArray.
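A minimal sketch of what caching buys you, assuming this code runs before the tampering does. The variable names are illustrative, and the original function is restored at the end so the example leaves the environment untouched.

```javascript
// First-run code: keep a private reference to the builtin.
const { isArray } = Array;
const originalIsArray = Array.isArray;

// Later-running code replaces the public entry point.
Array.isArray = () => true;

const viaGlobal = Array.isArray(2); // answered by the replacement
const viaCache = isArray(2);        // answered by the original builtin

// Undo the tampering.
Array.isArray = originalIsArray;
```

The cached reference keeps giving the correct answer for as long as it was correct when it was captured.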

theScottyJam · June 30, 2021, 4:15pm

So, the "no forging" side of brand-checking is only useful if:

1. You have control of the start of execution (i.e. you're not creating a library).
2. You're using libraries that you don't trust to be well-written (that's fair).
3. You're willing to pre-cache all the functions you need from JavaScript's built-in libraries (including individual instance methods) before loading third-party libraries. If you only cache some, then you're leaving big holes in the defense you're trying to build. This sounds like a lot of work; do people actually do this?

If any one of these doesn't apply to your project, then there's really no difference between using instanceof and brand checking, except for cross-realm concerns?

(I'm also not sure what the SES acronym is)

Edit:

wait ... if the non-tamperable feature is mainly intended for those who are willing to go through the verbose work of extracting all the built-in library functions they want, to prevent tampering, then can't they also just extract the Symbol.hasInstance property with everything else? And use that instead of instanceof? These people are obviously not scared of things being a little extra dirty. They would have to do the same thing anyways for other symbols, e.g. if they want to ensure an array's Symbol.iterator hasn't been tampered with, they would need to extract it, and use .call() every time they want to iterate over an array.

ljharb · June 30, 2021, 5:40pm

Re point 1: It's useful with libraries as well - you can't defend against code that ran before you, but you can defend against code that runs after you. All of my npm packages are robust in this way.

Re point 2: almost every single application uses third-party code, and anything that's a website has to deal with the possibility of browser extensions, even if you have no ad code or remotely-loaded code.

Re point 3: yes, this is a lot of work; it is not ergonomic; most people do not do this, but I certainly do (and thus, a significant percentage of npm downloads do) - and the capability is critically important for code that wants to be robust against code that runs after it.

SES is https://www.npmjs.com/package/ses.

If you set aside cross-realm concerns, and concerns about later-running code mutating builtins or objects you care about, then sure, you don't need to worry about brand checks - but that's setting aside a very large chunk of the language, as well as reality, to get to that point.

theScottyJam · June 30, 2021, 8:44pm

I'm not necessarily trying to say that we should use instanceof instead of brand checking. I'm just trying to get a good definition of the problems we're solving here. Cross-realm concerns are one problem we're trying to solve - check - so we can set that one aside as we focus on the other problems we claim brand-checking should solve.

This is the one we're trying to nail down now. Does, for example, Array.isArray() provide any additional benefits tamper-proofing-wise over instanceof?

Say we're you, building a library that's supposed to be tamper-proofed against any code that runs afterward. And, to provide as much tamper-proofing as possible, we're extracting everything we need from the built-in library up-front, that we're going to use in our program.

Here's our "instanceof" version (we're just using the underlying symbol that powers instanceof though, so that we can provide tamper-proofing)

const { Array, Error } = globalThis
const isArray = Array[Symbol.hasInstance]
const iterArray = Array.prototype[Symbol.iterator]

function sum(array) {
  if (!isArray.call(Array, array)) throw new Error('Param must be an array')
  let result = 0
  for (const number of iterArray.call(array)) {
    result += number
  }
  return result
}

Now, we're going to update this code to use Array.isArray() instead, for the purpose of providing better tamper-proofing.

const { Array, Error } = globalThis
const isArray = Array.isArray
const iterArray = Array.prototype[Symbol.iterator]

function sum(array) {
  if (!isArray(array)) throw new Error('Param must be an array')
  let result = 0
  for (const number of iterArray.call(array)) {
    result += number
  }
  return result
}

As far as I can tell, this didn't actually add any additional safety over the Symbol.hasInstance solution, and it didn't really improve the readability of the code very much. This must mean one of three things:

1. Array.isArray() does not provide additional tamper-proofing over existing solutions, and is thus not a valid solution for this branding proposal. We'll need to come up with something new to replace Array.isArray().
2. Tamper-proofing is not actually a required objective for this proposal.
3. There's something I'm not understanding here, about the definition of "tamper-proof" or something.

ljharb · June 30, 2021, 8:57pm

Yes. const { isArray } = Array means that as long as Array.isArray was correct when that line of code ran, isArray() can never ever lie, no matter what anyone does.

x instanceof y calls into a Symbol.hasInstance method on y, which means that unless y has been frozen in advance, it is impossible to robustly evaluate that expression.

In other words, I think the issue is the third bullet point, since the first two are incorrect.
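To make the instanceof half of that concrete, here is a small sketch with two hypothetical userland classes, Account and FrozenAccount. Object.defineProperty is used because plain assignment would not work: the inherited Function.prototype[Symbol.hasInstance] is non-writable.

```javascript
class Account {}
const notAnAccount = {};

const answerBefore = notAnAccount instanceof Account;

// Later-running code installs its own Symbol.hasInstance on the constructor.
Object.defineProperty(Account, Symbol.hasInstance, { value: () => true });

const answerAfter = notAnAccount instanceof Account;

// Freezing the constructor in advance blocks that particular attack.
class FrozenAccount {}
Object.freeze(FrozenAccount);

let tamperingBlocked = false;
try {
  Object.defineProperty(FrozenAccount, Symbol.hasInstance, { value: () => true });
} catch {
  tamperingBlocked = true; // TypeError: the object is not extensible
}

const frozenAnswer = notAnAccount instanceof FrozenAccount;
```

Freezing only protects the constructor's own properties; an object whose prototype chain is pointed at FrozenAccount.prototype still passes instanceof.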

theScottyJam · June 30, 2021, 9:22pm

Though, in my first code example, I didn't actually use "instanceof", I extracted the Symbol.hasInstance function and used that instead, to prevent the type-checking function from being dirtied later on. Similar to how I was also extracting the Symbol.iterator function to prevent the iterator function from being tampered with. I agree that there's no way to protect instanceof from tampering, just like there's no way to protect against tampering when using a straight for-of loop on an array, unless, in either scenario, you've extracted out the underlying protocol function in advance and used those instead. If we're protecting as much as we can from tampering anyways, then it should be common practice to be extracting functions using well-known symbols, and I see no reason why that can't include Symbol.hasInstance.

Sorry, shouldn't have called that code snippet the "instanceof" version if I wasn't actually using instanceof.

ljharb · June 30, 2021, 10:03pm

You're correct that that would work fine! However, String.prototype[Symbol.hasInstance] for example does not exist - if the builtin constructors had that defined, and if it did a brand check, it'd work great! But since instanceof has not historically worked cross-realm, a Symbol.hasInstance method could not either, and thus, no such solution exists.
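What a cached Symbol.hasInstance function does for a builtin such as String can be checked directly. This sketch uses only standard builtins; the variable names are illustrative.

```javascript
const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// String defines no Symbol.hasInstance of its own, and neither does String.prototype.
const stringHasOwnMethod = hasOwn(String, String.prototype && Symbol.hasInstance);
const presentOnPrototype = Symbol.hasInstance in String.prototype;

// What String[Symbol.hasInstance] resolves to is the generic function
// inherited from Function.prototype.
const inheritedFromFunction =
  String[Symbol.hasInstance] === Function.prototype[Symbol.hasInstance];

// That generic function only walks the prototype chain; it never looks at
// the internal slot that makes an object a real String object.
const stringHasInstance = String[Symbol.hasInstance];
const forgedString = Object.create(String.prototype);

const forgedPasses = stringHasInstance.call(String, forgedString);
const boxedPasses = stringHasInstance.call(String, new String('text'));
const primitivePasses = stringHasInstance.call(String, 'text');
```

So extracting the function protects against later tampering, but the check it performs is still a prototype-chain walk rather than a brand check.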

theScottyJam · June 30, 2021, 10:24pm

Alright, glad we're getting to the bottom of this :) - I feel like I always learn so much in these kinds of deep discussions.

So, in the string case, wouldn't a simple typeof myStr === 'string' work just fine? Or are you talking about type-checking String objects (like new String())? Are we wanting this brand checking to also supersede typeof, so people wouldn't need to use that anymore (due to its myriad issues)? That would be great if so.

theScottyJam · July 1, 2021, 7:31pm

Anyways, you can correct me if I'm wrong @ljharb, but I believe something like this sums up the idea?

Problem we're solving: We need a brand-checking mechanism that, unlike instanceof, is capable of working across realms.

Requirements:

It must be possible to prevent tampering, to the same extent that it is currently possible to do with Symbol.hasInstance or Array.isArray.

Other potential features:
(These are features that are nice to have if possible, but don't necessarily solve the main problem)

It would be nice if it only recognized objects as instances of a particular class if they were created through that class's constructor (as opposed to simply having the class's prototype).

ok - with a framework like that in mind, it's easier to make judgments about things like where userland brand-checking comes into play. The answer seems to be that brand-checking would largely be irrelevant to userland code (unless you're making a polyfill) because of the following:

1. As previously discussed, it doesn't make sense for an instance in one realm to be counted as part of a particular userland brand from another realm, because the two realms may have different versions of the userland library loaded. Browser APIs, by contrast, always provide the same API to all realms. This means userland brand checking falls outside the problem we're trying to solve.
2. Someone worrying about a userland class being tampered with can use a check like Object.getPrototypeOf(obj) === MyClass.prototype in place of instanceof. The majority of people, who aren't going to the verbose lengths of writing tamper-proof code, can just use instanceof. This means userland brand checking doesn't need the tamper-proofing requirement either, as they already have it.

On another note, it seems that writing a JavaScript library that pre-caches everything in advance, to ensure that other badly behaved libraries loaded alongside it don't cause it to break, is a lot of work to go through without benefiting the end user very much. The end user will still be using many other libraries that don't have such tamper-proofing in place, their own code won't either, and the tamper-proofing mechanism is only able to fix problems that happen after the particular library has loaded, not before. Plus, it really shouldn't be the concern of one library if other libraries are misbehaving.

I'm wondering if there's a better solution to all of this, without trying to add individual protection to each proposal like we are currently doing with this one. For example, that SES thing looked promising - if library authors got into the habit of declaring that their library was SES compatible (i.e. it wouldn't break if run in this secure JavaScript environment), then a corporation could use SES in their project to put each library into a secure compartment, and only let developers install libraries that claimed to be SES compatible. I guess this isn't anything TC39 can do; it sounds more like something npm would have to get involved with, to make this "SES" advertising easier.

theScottyJam · July 5, 2021, 1:12am

A recent conversation on this (mostly unrelated) thread has left me wondering how brand checking should work with boxed primitives. Should String.isString(new String('myString')) return true? I would argue no - that would go against a lot of what people are wanting with brand checking. It would be inconsistent to care so much about whether or not a particular object was constructed correctly, but also let misbehaving boxed string instances through the brand check.

But, I'm realizing that this isn't something that has been brought up yet, so I thought it could be good to open up this conversation here.

ljharb · July 5, 2021, 1:56am

No, it wouldn’t - brand checking in the case of strings is checking the internal slot, which is on all boxed string objects. You have typeof otherwise.
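A sketch of what checking that internal slot can look like today, assuming the usual try/catch approach around String.prototype.valueOf. The helper name isStringObject and the class MyString are made up for illustration.

```javascript
// Cached up front so later changes to String.prototype cannot affect the check.
const stringValueOf = String.prototype.valueOf;

// Hypothetical helper: true only for boxed String objects.
function isStringObject(value) {
  // Primitives are covered by typeof, so only objects are considered here.
  if (typeof value !== 'object' || value === null) return false;
  try {
    // Throws a TypeError unless the receiver has the string internal slot.
    stringValueOf.call(value);
    return true;
  } catch {
    return false;
  }
}

class MyString extends String {}

const stringChecks = {
  boxed: isStringObject(new String('myString')),
  subclass: isStringObject(new MyString('myString')),
  primitive: isStringObject('myString'),
  forged: isStringObject(Object.create(String.prototype)),
};
```

The boxed string and the subclass instance pass because both carry the slot; the primitive is left to typeof, and the forged object fails despite inheriting from String.prototype.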

aclaymore · July 5, 2021, 8:53pm

Are we imagining the userland aspect would also work cross realm? I can see a universe where it doesn’t due to userland being limited to private fields instead of hidden slots anyway.

ljharb · July 5, 2021, 10:40pm

No, only builtins can robustly work cross-realm.
