Error "detail"

Totally fair. Probably the most important highlights are:

In particular, is it meant to help catching code make decisions about the error, or is it for diagnostic purposes when the error is reported?

Not really either. According to both of their problem statements, it's more about being able to easily provide dynamic values within errors without having to do the busy work of formatting them (plus, as was previously mentioned, the end-user can't always format dynamic values as well as the engine can, e.g. the engine can tell you if a value is a proxy when it displays an error - the end-user can't do that).

So, as a more concrete example (and I'll use the Error.new API that's described in the potential solution mentioned earlier) - if I have an API that takes a string as a parameter, and you supply an object, I can throw an error telling you that you did it wrong like this:

throw Error.new`mySpecialFn() expected a string, but got ${badValue}`

If the error goes uncaught, the engine might render the error as "mySpecialFn() expected a string but instead got: { x: 2, y: 3 }".

If this was implemented in userland, and userland just used String() to convert these values, the error would instead be rendered as "mySpecialFn() expected a string but instead got: [object Object]".

There are variants on this, e.g. I know @mmkal prefers the style from the original post where any dynamic values are supplied after the error message instead of inline. But this should be the gist of it.


And, thinking about it, I probably ought to duck out of the conversation now. I originally disagreed with the idea, but with the refined problem statement, I'm now good with it and would enjoy using the feature if it existed. But I also don't feel like I have a strong want for it either, so I'll just watch the rest of this conversation from the sidelines.

My question is about the intent of how errors are consumed, not about the ergonomics of how the errors are produced.

It can be argued that structured details publicly attached to an error would facilitate logic processing of errors within the program, which IMO should be discouraged.

I'm all for adding better details to errors, but I believe these details should only be used for enhancing reporting.

Edit: providing details of template values in the error message is related to this. These are often values that enhance diagnostics, but don't need to be revealed in the public message.

My own intent (not speaking for the OP, of course) is specifically that I want to see smarter formatting in exactly the places where console.log shows up. I keep writing that two-line error-throw so that I can get useful information out of an error message either (a) when I add the check during an active search for a bug, or (b) because the check represents a programming error, and the information will help me (or someone else) fix it.
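For reference, the two-line pattern I mean is roughly this (a sketch only; JSON.stringify is just one possible hand-rolled formatter, and it still can't tell you things like whether the value is a proxy):

function mySpecialFn(value) {
  if (typeof value !== "string") {
    // hand-rolled formatting, then throw - the "two-line error-throw"
    const rendered = JSON.stringify(value, null, 2);
    throw new Error(`mySpecialFn() expected a string, but got: ${rendered}`);
  }
  // ...
}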

100% agree. Exception handling in languages that aren't built to use exception handling as control flow (i.e. in languages that aren't Python) tends to be an expensive operation, so we wouldn't want to encourage an exceptions-as-control-flow pattern. Especially since it's already simple enough to write

throw new class extends Error {
  foo = 5;
  bar = "foo";
}("Error message")

The new class extends Error example above is also nicer to engines, which can parse it and then know to make an Error metaclass representation with two extra slots, while

throw new Error("Error message", {detail: {foo: 5, bar: "foo"}})

would likely always be implemented as a runtime check.

That's a big part of why I suggested the template tag static method - while the interpolations do go into the detail array, the syntax doesn't instantly suggest how to retrieve them at runtime, and the number of slots to create is guaranteed by syntax. Also, the retrieval ergonomics are purposefully non-optimal, which discourages use-in-code. If you want to use .prop syntax on the other side, be nice to the engine and use an anonymous class definition.
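If it helps to see the shape of it, here's a rough userland approximation of the tagged-template form (Error.new itself is hypothetical, this is only a sketch, and a real engine could format the interpolations far better than String() can):

// Sketch only - "Error.new" does not exist; this just shows the shape.
function errorNew(strings, ...values) {
  // join the literal parts with String()-converted interpolations
  const message = strings.reduce(
    (acc, part, i) => acc + (i > 0 ? String(values[i - 1]) : "") + part,
    ""
  );
  const error = new Error(message);
  error.detail = values;  // the interpolations end up in a plain array
  return error;
}

// usage:
// throw errorNew`mySpecialFn() expected a string, but got ${{ x: 2, y: 3 }}`;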

Minor correction, I think the throw new class expression is missing an extends and should read more like this:

throw new class extends Error {
  foo = 5;
  bar = "foo";
}("Error message")

To be honest I kind of like that trick! The main problem with it is that it's extremely weird looking, and ever using it in real life would probably be counterproductive because it seems to be so clearly fighting the language as it's designed.

I agree with everyone in not wanting to encourage code that inspects error.detail on a caught error. But there's a pretty clear equivalence with .cause which as far as I know hasn't caused trouble or controversy. It's typed as unknown in TypeScript. It doesn't need to be an Error. There has even been a suggestion on this thread to just use cause for shoving in some extra detail (I still disagree with that suggestion, if for no other reason than "cause" being a very misleading name for such usage).
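For comparison, this is the already-standard cause pattern (the options bag is real ES2022 behaviour):

try {
  JSON.parse("definitely not json");
} catch (cause) {
  // cause is typed as unknown; it's carried along for diagnostics,
  // and nothing forces catching code to branch on it
  throw new Error("Could not load configuration", { cause });
}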

@dmchurch I don't think I follow the "would likely always be implemented as a runtime check" part of your comment - what would engines be checking? When you say "nicer to engines", are you talking about performance?

Also it feels a bit strange to force the wording of the error message to accommodate the interpolated parameters. Sometimes the detail just doesn't fit into a sentence? I'm picturing lots of messages with a not-particularly-value-providing suffix like Error.new`Query failed. Detail: ${detail}`, which to me is not as nice as new Error('Query failed', {detail}).

What exactly do you mean by "logic processing"? Having conditional statements in a catch clause depend on those details? I'm not so sure what's wrong with that, would you rather recommend Error subclasses and instanceof conditions instead?

I doubt you can prevent "abuse". As soon as the details are available on the error object so that structured logging can report them, they're also available for anything else.
The only thing you could do is to discourage such usage by hiding the details behind a method like .getLoggingDetails() or .toJSON() instead of storing them as a public property of the error object.
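Something like this, as a rough sketch (ReportableError is just an illustrative name):

// Sketch: details are only reachable through a reporting-oriented method,
// not as an enumerable property that catch blocks are tempted to branch on.
class ReportableError extends Error {
  #details;
  constructor(message, details) {
    super(message);
    this.#details = details;
  }
  getLoggingDetails() {
    return this.#details;
  }
}

throw new ReportableError("Query failed", { table: "users", attempt: 3 });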

Not clear what you mean here. Also it's a pretty horrible pattern to create an entire new class constructor and prototype object every time you create an Error instance. This is definitely not "nicer to engines".

Keep it simple and write

throw Object.assign(new Error("Error message"), {
  foo: 5,
  bar: "foo",
})

Ah yup, you're absolutely correct, I did miss the extends keyword! I've edited the example so as not to confuse anyone who's looking at it.

As for "fighting the language", I'd actually disagree there - anonymous classes aren't as popular as anonymous functions are, but they're part of the language for a reason! I'll elaborate in a moment, but I'm gonna address this point first:

This is actually part of the reasoning behind my suggestion, not an unintended side-effect - having the "primary intended interface" to the detail mechanism be a tagged template literal forces (well, "strongly encourages", at least) developers to find a way to fit the information into the error message. This goes back to the point about the primary expected interface to Error consumption being "I record the .message attribute and the stack trace, and I've recorded basically everything needed to track down the failure". If whatever host is dealing with the Error doesn't understand .detail, then having the details be interpolated into .message stands the least chance of losing important information - while, if all the important information is encoded into .detail and the .message is generic, those Error consumers will suffer a usability regression.

Assuming (which I imagine would likely be the usual case) that developers aren't specifically creating a detail object to collect error details, I'd expect the comparison to be more like the following:

const api = SomeApiOrOther.get();
if (api.hasError()) {
  // options-bag form:
  throw new Error('Query failed', {detail: api.getErrorDetails()});
  // ...or the tagged-template form:
  throw Error.new`Query failed. Details: ${api.getErrorDetails()}`;
}

Roughly the same number of characters on a line, but the second form instructs a non-detail-aware implementation about how and where the details should get included in the error output.

I'll follow up in a moment with the explanation about the implementation details :smile:

This is an extremely common misconception, which I'll try to clear up! It factors into what @mmkal was asking about engine performance, and it's also at the heart of my other two recent suggestions, Proposal: Parser Augmentation Mechanism and JSON.equals(x, y). The core of it is this:

ECMAScript engines are not required to implement ECMA-262 to the letter in order to comply with the specification.

If you don't understand the meaning behind this, expand the following for a deep-dive into how engines can comply with the spec without following the spec:

So what does "ECMA-262 compliance" actually mean?

I've phrased this somewhat provocatively, but the reason for that is that it's really important to understand when trying to make changes to the spec. The ECMAScript specification describes a conceptual, theoretical ECMAScript implementation. It uses Abstract Operations (AOs) to describe what this conceptual engine does, and a naïve engine implementation might simply execute everything as described, in order. The engine might have a single JSObject class which performs all the requisite operations - searching its internal property descriptor map, delegating lookups to its prototype, etc. It would be perfectly ECMA-262 compliant on account of being a literal realization of the spec in code.

It would also be extremely slow and would not be able to run the majority of modern websites.

That's why the spec doesn't mandate exactly what an engine must do at all times. All it mandates is that the observable effects of the engine have to be the same as the observable effects of the conceptual engine. For example, let's say you've got a class definition like this:

class ValueHolder {
  #value;
  get value() {return this.#value;}
  set value(v) {this.#value = v;}
}
const holder = new ValueHolder();

The spec says that, whenever you access holder.value in an expression, you call the getter function, which returns the current value of the private variable. And when you set holder.value to a new value, the spec says that you have to call the setter function, which will set the private variable to the new value.

The engine, however, can look at this and see that both accessors are the simple "expose the private variable" type, so as long as it knows that the value accessors haven't been redefined, it doesn't have to perform an actual function call (which is expensive for CPUs) and it can instead just read or write the private field directly. Because there's no observable way to tell the difference, in this case, between actually calling the accessors or just directly accessing the private field, the engine is allowed to do whichever is faster.
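The "haven't been redefined" caveat matters because redefinition is observable. Continuing the ValueHolder example above (the defineProperty call here is purely for illustration):

// If anyone swaps the accessor out, the inlined shortcut would return the
// wrong value, so the engine has to guard its fast path against this:
Object.defineProperty(ValueHolder.prototype, "value", {
  get() { return "intercepted"; },
  configurable: true,
});

console.log(holder.value);  // "intercepted", not the private field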

That requires the engine to have both (a) a certain amount of freedom of implementation, and (b) enough contextual information to be certain about what the actual outcome will be. Take the following four functions as an example, expanding on the above:

class ValueHolder {
  #value;
  get value() {return this.#value;}
  set value(v) {this.#value = v;}
}
const holder = new ValueHolder();

function getValue1() {
  return holder.value;
}

function getValue2(anyHolder=holder) {
  return anyHolder.value;
}

function getValue3(anyHolder=holder) {
  const propName = "value";
  return anyHolder[propName];
}

function getValue4(anyHolder=holder, propName="value") {
  return anyHolder[propName];
}

Each of these four functions does the exact same thing when called with no arguments: it retrieves the value stored in the holder.#value private field. The conceptual ECMAScript virtual machine would have to take all the same steps: locate the value descriptor on the holder object, call the get accessor, return the value it returns. You might expect a tiny performance difference between the four functions because some of them have to check for missing arguments and populate them, but you'd expect them all to be in roughly the same ballpark. You wouldn't expect something like this:

Test case name                     Result
Static object, static property     getValue1 x 31,738 ops/sec ±1.00% (65 runs sampled)
Dynamic object, static property    getValue2 x 2,666 ops/sec ±1.14% (18 runs sampled)
Dynamic object, const property     getValue3 x 2,571 ops/sec ±0.95% (62 runs sampled)
Dynamic object, dynamic property   getValue4 x 2,276 ops/sec ±0.94% (16 runs sampled)

These are the results I get from running the test on Firefox on my laptop, but you can test your own browser to see what results you get. The benchmark has a little bit of extra code to work as a proper test fixture - in particular, each "op" of the "ops/sec" is actually 100,000 calls to the associated function, so getValue4 is actually called 227,600,000 times per second.
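Each "op" is shaped roughly like this (a sketch of the fixture, not the exact benchmark code):

// One benchmark "op": 100,000 calls, so per-call harness overhead is negligible.
function oneOp(fn) {
  let result;
  for (let i = 0; i < 100_000; i++) {
    result = fn();
  }
  return result;  // keep the value alive so the calls can't be optimized away
}

// the harness then times repeated calls to oneOp(getValue1), oneOp(getValue2), ...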

So what's going on here? Well, the engine can look at getValue1 and be absolutely certain exactly what piece of data is getting accessed. Since holder is a const, the engine knows that it's not ever going to reference a different instantiated object, so the only check it has to make is "has this particular object had its value definition changed?" And, assuming it hasn't (which it never does), it can just return the value at one unchanging position in memory.

On the other hand, the getValue2 and getValue3 functions don't know that they'll be called with an instance of the ValueHolder class; all they know is that whatever object they get called with, they'll be accessing the value property. (In getValue3, it's obvious for the engine to see that propName can only ever be that one string, so it doesn't have to change its behavior.) In all likelihood, the engine will have internalized the "value" string at parse time; in other words, it will calculate the hash of the string, compare it with all the other internalized strings to find if something already has that hash, and then just use the shared object it finds if so. Thus, when it comes time to look up the "value" property at function-call-time, it can use the hash it already calculated at parse-time to look it up in the internal property map.

The getValue4 function, on the other hand, has no guarantees. The defaults are only defaults, and that means that there's no point in internalizing the "value" string early; if no one ever calls the function with an absent propName argument, that will have been wasted effort. So, when the function gets called, it has to do all the work, at that point in time, every time.

The TL;DR of this is that you should think about ECMAScript programs the way you think about programs written in, say, C or C++, when compiled by an optimizing compiler. It doesn't have to execute that exact code in that exact order, so long as the outputs are correct.

And, in this case, the engine doesn't have to create a new class constructor and prototype every time it executes that code. Let's look at what information the engine has after it parses the new class extends Error throw from earlier.

The engine now knows the following:

  • The thrown value will be an anonymous, nameless subclass of Error
  • It will have two instance fields, foo and bar
  • Those fields will be initialized to 5 and "foo", respectively
  • The Error constructor for the thrown object will be called with the value "Error message"
  • No other instance of this anonymous subclass will ever be created

It doesn't need to create a new class every time the code runs, because the class that gets created has the same functionality every time. In all likelihood, the engine will create a "shared class" internal structure that encodes all the above information, allocating two "instance field" slots for the declared fields. When the code gets executed, it can create an instance object with those two slots, pointing to the "shared class" as its implementation.

Now, you may think "but Object.getPrototypeOf(error) and error.constructor need to return different values!" And you're right, they do... if the code ever calls or accesses them. This is why observability is important. The only way for code to observe the return value of Object.getPrototypeOf(error) is to actually call Object.getPrototypeOf(error) - and browsers know that, in modern JS code, almost nobody ever calls getPrototypeOf on an object or accesses its constructor field. Knowing that, they assume it won't happen, and they'll clean up afterwards and do it properly if that assumption turns out to be wrong. This is what's known as "fast path" vs "slow path".

So, the smart thing to do here is just leave the constructor and prototype slots blank when you create the instance (I say "blank" and not "undefined" because this isn't happening at the level of JS code; it's more likely to be a nullptr if anything) and then, if and when the code calls a method that needs to use the value stored there, then the engine can fill them in.
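In JS terms, nothing can tell the difference until something actually touches those slots, e.g.:

const err = new class extends Error { foo = 5; bar = "foo"; }("Error message");

// Until these lines run, no code could have observed whether the prototype
// and constructor objects were created eagerly or filled in lazily:
console.log(Object.getPrototypeOf(err) === err.constructor.prototype);  // true
console.log(err.constructor.name);  // "" - the class expression is anonymous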

In contrast, when the parser encounters the Object.assign version from earlier:

It can only be certain of the following:

  • A value will be thrown

That's because it doesn't know, at parse time, whether Object.assign will still have the same value as the Object.assign method we're familiar with. It might assume that it probably does, but it still has to emit the code that checks that, and the code that deals with the case where it doesn't - and, if the cost of generating that optimized fast path outweighs the expected benefit of using it, the engine probably just won't bother. On the other hand, when the engine has guarantees about how the code will be used, it doesn't have to bother writing the slow path at all.
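To make it concrete, this is perfectly legal and observable from the throw site, which is why the check can't be skipped (the wrapper here is purely for illustration):

// Nothing stops code from replacing Object.assign before the throw runs:
const originalAssign = Object.assign;
Object.assign = (target, ...sources) => {
  console.log("intercepted Object.assign");
  return originalAssign(target, ...sources);
};

throw Object.assign(new Error("Error message"), { foo: 5, bar: "foo" });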

Depends where the annotations live. In our API, annotations are private and you need a global power to read them, and our reporting layers (console etc.) have access to that power so they can get error details (including stack) that just catching and holding the error doesn't give you.

So it's very much possible to prevent leaking information through errors if your errors are meant to be used to report exceptional diagnostics.

Well, instanceof tests are easier for an engine to implement, but I don't think that's what @mhofman meant (though obviously, please feel free to correct me if I'm in error). Having a large body in a catch clause is generally an anti-pattern in non-Python languages, because it often implies that you're using exceptions as part of your standard control flow. For example, the following two functions look like they do the same thing:

function searchWithBreak(iterable, predicate) {
  let foundItem = null;
  let wasFound = false;
  for (const item of iterable) {
    if (predicate(item)) {
      wasFound = true;
      foundItem = item;
      break;
    }
  }
  if (!wasFound) return false;
  // do stuff with foundItem
  return foundItem;
}

function searchWithThrow(iterable, predicate) {
  const anonymousThrowable = class {
    value;
    constructor(value) {
      this.value = value;
    }
  }
  try {
    for (const item of iterable) {
      if (predicate(item)) {
        throw new anonymousThrowable(item);
      }
    }
  } catch (e) {
    if (!(e instanceof anonymousThrowable)) throw e;
    let foundItem = e.value;
    // do stuff with foundItem
    return foundItem;
  }
  return false;
}

And, in fact, the observable effects of these two functions are as identical as I could make them[1]:

  • The function iterates through iterable, calling predicate on each value
  • If predicate throws anything, it will be propagated, aborting the search
  • If predicate returns a truthy result, the function will use that value of the iterable as foundItem, and it will // do stuff with it and then return it
  • If predicate returns a falsy result, the search will continue with the next item in iterable
  • If the iterable runs out of items, the function will return false

However, searchWithThrow() is likely to execute considerably slower than searchWithBreak(), because of how long any throw takes (since it might have to pop one or more frames off the stack, which is not a quick operation).
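If you want to see the gap on your own machine, a quick-and-dirty comparison of the two functions above might look like this (usual microbenchmark caveats apply):

// Rough timing comparison of the two search implementations above.
const data = Array.from({ length: 1000 }, (_, i) => i);
const predicate = (x) => x === 999;  // worst case: match on the last item

console.time("searchWithBreak");
for (let i = 0; i < 10_000; i++) searchWithBreak(data, predicate);
console.timeEnd("searchWithBreak");

console.time("searchWithThrow");
for (let i = 0; i < 10_000; i++) searchWithThrow(data, predicate);
console.timeEnd("searchWithThrow");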

This is an example of "using exceptions as part of standard control flow". The engine is expected to perform a throw as part of the execution of this function, and JS just isn't designed for that. Making it easier for people to write a searchWithThrow-style function means that more people will start writing that kind of function, which will be bad for the performance of the ecosystem as a whole and is what I believe @mhofman is trying to avoid.

Okay, but how bad could it be, really?

Allow me to describe how (some) C++ implementations perform exception handling. Wherever an exception gets thrown, the compiler emits a function call to a standard error-throwing function, passing it details about the thrown exception.

The error-throwing function is then responsible for unwinding the stack:

  1. It stores the value of its return address (which, of course, is not accessible to standard C++ code, it's an implementation detail) into a temporary pointer.
  2. It matches that pointer - that code position - against a global list of all known code positions to see (a) what function it was part of, (b) if there are any try blocks that span that code position, and (c) what variables are in scope at that position.
  3. If there are any try blocks, it checks each of them, from inside out, to see if it has a catch clause that matches the thrown exception. It also needs to call the destructor for each variable that has gone out of scope between the site of the raised exception and the catch block.
  4. If it has found a matching catch clause, it sets the entrypoint of that clause as its own return address (also an action not accessible to standard C++ code), then performs a return.
  5. If it does not find a matching catch clause, it calls the destructor for all the rest of the local variables of the function.
  6. Then it locates the return address for this function in the stack, stores it in the temporary pointer, and repeats from step 2.

It's honestly a miracle that it works as well as it does. It'd be much easier and more reliable to just push variables and try blocks onto a stack when the processor enters the scope, and then pop them off the stack when it leaves the scope. That's probably what most Python implementations do, because the Python paradigm is to use exceptions as control flow. However, that means that every try block and every variable with a destructor now has an additional startup cost in the form of pushing entries onto this stack.

Since the C++ paradigm is to only use exceptions to indicate exceptional situations, having all that startup and teardown cost being executed, over and over again, for each and every try block and variable would be wasted effort. So instead, engines choose the route that requires zero cost from the runtime when no exceptions are thrown, despite the fact that it requires the throw handler to do much, much more work.

ECMAScript's exception-usage paradigm is much more like C++'s than Python's, and so engines make the same call - low-or-zero runtime cost when no exceptions are thrown, and high runtime cost of throwing exceptions. That's why it's important not to make it too easy or attractive for developers to start using exceptions as control flow, because they'll be unwittingly writing code that causes the engine to have to work much harder.


  1. Which does mean, of course, that engines could use the same code to implement both of them, but for the sake of this argument, we're assuming that searchWithThrow actually uses a throw-based implementation and searchWithBreak does not. ↩︎

Guess I lied when I said I was going to stay out of this now.

But, I believe one way to satisfy @mhofman's concerns is to make the details property an internal slot instead of a public property. This would allow users to build nicer errors, while ensuring the only way to view those dynamic values is by viewing the whole error as the engine would render it. (This applies either way, whether we go with a details object or the array + template-string route.)

If people have logging tools that need to be able to introspect and see those dynamic values, that can still be done, but you'd have to basically build a wrapper function around error throwing to have it capture the details and store them in an additional private place (like a weak map) for the logging tool to access later. Which sounds similar to whatever system @mhofman is already using if I understand it correctly.
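A rough sketch of what that wrapper could look like (throwWithDetails and the WeakMap registry are made-up names, not part of any proposal):

// Hypothetical sketch: details never live on the error object itself.
const errorDetails = new WeakMap();

function throwWithDetails(message, details) {
  const error = new Error(message);
  errorDetails.set(error, details);
  throw error;
}

// Only code that holds the WeakMap (e.g. the logging layer) can read them:
function logError(error) {
  console.error(error.message, errorDetails.get(error));
}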

Thanks for your elaboration @dmchurch!

Yes, I understand that very well.

Hm, technically that might be possible (after checking some preconditions, like the class having no static initialisers to run), but do you have any evidence that JS engines actually do this? It sounds like an awful lot of effort to optimise a pattern that almost nobody uses.

I've done a quick benchmark now (standard trust levels for microbenchmarks apply) and it seems that both a predeclared class and the Object.assign approach are about twice as fast as the inline class definition in V8.
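The three shapes being compared were roughly these (a sketch, not the exact benchmark code):

// Sketch of the three shapes compared in the benchmark:
class PredeclaredError extends Error {
  foo = 5;
  bar = "foo";
}

const makePredeclared = () => new PredeclaredError("Error message");
const makeInline = () =>
  new class extends Error { foo = 5; bar = "foo"; }("Error message");
const makeAssigned = () =>
  Object.assign(new Error("Error message"), { foo: 5, bar: "foo" });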

Huh, good question! I'd assumed as much because that's how V8 handles anonymous functions (parse/compilation generates a SharedFunctionInfo instance, which then gets instantiated as a much-lighter-weight JSFunction when the expression is evaluated) and classes are just a kind of function, but it might very well be different for class expressions. I'm busy digging into the V8 codebase right now anyway because of the Parser Aug implementation, so I'll take a look around and update this if I find anything interesting. :smile:

Update: I'm obviously not an expert in interpreting V8 bytecode (seeing as I only started looking at this codebase a few days ago) but from what I can tell, the parse/compile step generates the following:

  • Class structure (list of members, base class, etc)
  • Initializer bytecode (the code that assigns class field initializers)
  • Constructor bytecode (the code for the constructor)
  • Outer function bytecode (what actually runs during the throw statement)

Roughly speaking, the bytecode looks similar to the following, with anonymous constants and compiler intrinsics in SHOUTY_CASE:

const ANON_CLASS;

function FIELD_INIT() {
  this.foo = 5;
  this.bar = "foo";
  return undefined;
}

function CONSTRUCTOR() {
  const rval = HAS_SUPER_CONSTRUCTOR() ? super(...arguments) : this;
  const init = arguments.callee[Symbol.[INITIALIZER]];
  if (init !== undefined)
    init.call(this);
  return rval;
}

const scope = CREATE_SCOPE(CLASS_SCOPE);
const consClosure = CREATE_CLOSURE(CONSTRUCTOR, scope, 0);
DEFINE_CLASS(ANON_CLASS, consClosure, Error);
const initClosure = CREATE_CLOSURE(FIELD_INIT, ANON_CLASS, 0);
consClosure[Symbol.[INITIALIZER]] = initClosure;
throw new consClosure("Error message");

So, looks like it does generate actual JS objects for the constructor and prototype (presumably the prototype definition is part of the boilerplate constant pool entry I've called ANON_CLASS) but the actual amount of work it's doing isn't much more than it would do when instantiating an object literal.

I was surprised to see the test I've called HAS_SUPER_CONSTRUCTOR() in there, but then I realized that it has to be in there because someone might have redefined the Error global.