JSON.traverse()

This proposal suggests the addition of a JSON.traverse method to the ECMAScript specification. This method would provide a more efficient and direct way to traverse and potentially modify a JSON structure without the need for string conversions. It aims to improve performance and streamline operations that currently require both JSON.stringify and JSON.parse.

Motivation

Currently, to traverse and modify a JSON structure, developers often use a combination of JSON.stringify and JSON.parse with replacer and reviver functions. This approach, while functional, has significant drawbacks:

  1. Performance Overhead: Converting a JSON object to a string and then back to an object incurs a performance penalty. These operations are computationally expensive and can slow down applications, especially with large or deeply nested JSON structures.
  2. Inefficiency: The dual conversion process is inherently inefficient. The need to serialize and then immediately deserialize data is redundant and can be avoided with a more direct approach.
  3. Complexity: Using both JSON.stringify and JSON.parse together to achieve traversal and modification increases code complexity and potential for errors. A single method that can handle both operations would simplify the code and reduce the likelihood of bugs.

Proposal

The proposed JSON.traverse method would traverse a JSON structure, applying the provided replacer and reviver (optional) functions directly to the objects and arrays within the structure. This method would eliminate the need for intermediate string conversion, providing a more efficient and straightforward mechanism for processing JSON data.

Method signature:

JSON.traverse(value, replacer, reviver)
  • value: The JSON structure (object or array) to be traversed.
  • replacer: A function that can transform the values in the JSON structure during traversal. This function is called for each key-value pair in the object or each element in the array.
  • reviver: A function that can transform the values after they have been replaced. This function is called for each key-value pair in the object or each element in the array after the replacer function has been applied.

Example:

const data = {
  a: 1,
  b: 'hello',
  c: {
    d: 3,
    e: 'world'
  }
};

// Replacer function
function replacer(key, value) {
  const parentValue = this; // also possible to access to the parent

  if (typeof value === 'number') {
    return value * 2;
  }
  return value;
}

const result = JSON.traverse(data, replacer);
console.log(result);
// Output: { a: 2, b: 'hello', c: { d: 6, e: 'world' } }

Benefits

  • Improved Performance: By avoiding the need to serialize and deserialize JSON data, JSON.traverse would significantly reduce the computational overhead, leading to better performance.
  • Simplified Code: Developers would be able to achieve their desired transformations with a single method call, reducing the complexity of the code and minimizing the potential for errors.
  • Direct Manipulation: The method allows for direct manipulation of the JSON structure, making the process more intuitive and efficient.
1 Like

A clear benefit would be in many compiler plugins that make modifications to the AST, instead of using libraries to traverse the JSON they could use this native method.

I would find this very useful.

It would be very helpful to be able to define some state that gets passed along when making the traversal, and have clear, codified semantics along the lines of "you will enter an item then exit it before traversing the next item". Consider

[
  { kind: 'User', name: 'Joe', data: someUserAuxData },
  { kind: 'Email', email: 'joe@joe.com', data: someEmailAuxData },
]

Being able to know whether we are in an object of type User or Email is extremely helpful when processing data. You might, for example, want to emit an error that says "Error validating data on User Joe". (In this case, I am imagining that replacer gets called with { kind: 'User', ... } and at some point later gets called with someEmailAuxData.)

As an example of this kind of need, consider this required_directive transform in Relay, which keeps track of the current path, errors, etc as it traverses.

Alternatively, this traversal state can be held in a separate object. If that's sufficient (and maybe more performant), that's fine too, but the docs should show this pattern.

1 Like

Another useful need would be to short circuit and avoid traversing a given subtree, but I don't know see how one would do that. (Throw a specific error?) Might be worth thinking about that need.

You cannot, for example, use this with a circular data structure :/, but there should be no reason to disallow that.

1 Like

You need to fix your terminology first, it's very confusing when you say JSON structure. JSON is a string.

At first I thought you meant JSON.traverse() would parse JSON, somehow modify it on the fly without constructing the whole thing, and return JSON again. But what you describe has nothing to do with JSON at all. You want to traverse an object, and return an object. Why not Object.traverse(), then?

That's just bad, lazy code, with drawbacks well deserved. I would argue that traversing an object with (possibly recursive) functions to modify certain parts, is not that hard to write, and is easier to understand, than one big replacer function with a bunch of conditions guessing where in the hierarchy it's been called. You can carry state, know how deep you are, skip whole parts, handle cycles as you wish, and apply different transforms to different parts of the object.

Could you provide an example? What I have worked with (babel, recast) had AST traversal/modification functionality built on the visitor pattern, also with facilities like state or "don't recurse into this". That cannot be replicated with what you propose.

3 Likes

You are right. I put JSON more for its familiarity to "stringify" and "parse" as I explained, but here neither the input nor the output is JSON. I'm going to change it to Object.traverse, I also like it better and it makes more sense. Thanks for your feedback.

An example would be using meriyah to parse the code to AST, and then astring to generate the AST code.

To do the traverse, you can look for more libraries like astray, but then you realize that even though JSON.stringify traverses everything and does things like transformations, being part of the platform it is probably optimized with C++ and runs faster than astray.

Example traversing the AST with Bun:

  • JSON.stringify: 0.03ms
  • astray lib: 0.10ms

So this is what made me think that if instead of JSON.stringify there is something more native (like Object.traverse) just to do the traverse, many people could benefit from just having the engines optimize it by implementing it with C/C++.

Both points are great things to consider.

Initially, I focused on replicating a bit of what is already there for JSON.stringify and JSON.parse with the replacer and reviver, but taking out the transformations. However, thinking about more benefits to make a good traverse, it would be nice to take into account the short-circuit and maintain a state.

Now I am going to make a change to the proposal to change JSON.traverse to Object.traverse, since here there is neither input nor output that is JSON and it makes more sense inside Object. For these two things that you have commented it will be important to consider them, any suggestion will be welcome. I do not update it at the moment but add them as pending issues to think about how to solve them.

Thanks for your feedback

1 Like

It seems that I can no longer edit the proposal. So these changes to the proposal I am writing them below:

  • Replace JSON.traverse() -> Object.traverse(): Neither the input nor output is JSON, but Object, it makes more sense for it to be inside Object.

Things to think about how to define it (any suggestions are welcome):

  • State: To be able to save a state in the part of the tree you are in during the traverse.
  • Short-circuit: Allow it to not make the full traverse in some cases to be able to exit earlier.
1 Like

I agree this doesn't make sense on JSON, it’s orthogonal to the serialization method used. Ultimately the need is about traversing objects. Though, having written a lot of code doing this generically, the devil is in the details, especially when you take into account the entirety of JS types.

What values are considered leaves and what values are considered data structures to be traversed? You probably don't want to traverse every object, but consider instances of certain classes leaves (e.g. Date). OTOH you also can't say that only plain objects and arrays are the data structures. Iterables would be a good candidate, but plain object literals are not iterable.

3 Likes