not sure what you mean there, but the project is published on npm and there are easy examples to try ... my idea around branding is that if we could brand anything that results in any payload, not just primitives, my proposal would be almost accepted, except it would need a different context value on the reviver side, but because everything throws errors to date, maybe there is an opportunity there?
JSON.rawJSON() addresses one thing - serialization. If you have a live numeric value that, due to Number's serialization rules, loses some detail, you can use rawJSON() to ensure it serializes with its full value. Once it becomes text, though, rawJSON() no longer has any effect on things. If you JSON.parse() the result in JS, you'll get back a Number with the normal Number precision limitations, losing some of the specified value.
Your proposal covers both sides - on serializing, it captures the fact that an object was a particular class, then on parsing, it rehydrates the value into that class, rather than a generic Array/Object. To do so, it has to leave its fingerprints on the serialized result, producing a JSON string that requires special knowledge to parse correctly. If you do a generic parse, you'll get a fundamentally wrong JSON structure.
This is quite different from rawJSON(), where you can do a specialized parse that captures the full numeric precision into a special numeric class, but if you parse it with a generic parser, you'll still get the same shape for the result, just possibly with less precision than was written into the string.
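A minimal sketch of that asymmetry, assuming an engine that may or may not ship `JSON.rawJSON()` (the hand-built fallback string is only there so the example runs everywhere, and the variable names are illustrative):

```javascript
const exact = 12345678901234567890n; // too large for a Number to hold exactly

// Serializing: JSON.rawJSON() (where the engine ships it) writes the digits verbatim.
// The fallback builds the same text by hand for engines without it.
const text = typeof JSON.rawJSON === 'function'
  ? JSON.stringify({ n: JSON.rawJSON(exact.toString()) })
  : `{"n":${exact}}`;

// Parsing: a generic JSON.parse() returns the same shape, { n: <number> },
// but the value went through Number and no longer matches the digits in the text.
const parsed = JSON.parse(text);
const lostPrecision = BigInt(parsed.n) !== exact;

console.log(text, typeof parsed.n, lostPrecision);
```

The shape of the result is what any JSON consumer would expect; only the precision differs, which is the contrast with an encoding that needs special knowledge to parse.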
There is a lot to unpack in this thread. I have spent a decent amount of time on similar problems (including a very similar registry approach I prototyped for postMessage about 8 years ago).
First, let's all agree on what JSON is: a serialization of structured data, inspired from, but independent of JavaScript. The format itself will not change anymore, it is set in stone.
A very important distinction between the JSON spec and the built-in JSON support in JavaScript is actually the handling of number types. The JSON format allows for arbitrary number size and precision, whereas when parsing / stringifying in JavaScript, it natively handles IEEE 754 numbers. Emitting large numbers (or strings with different encodings) was the motivation for rawJSON(), and it doesn't currently support emitting text that would parse as array/object types.
Since the format of JSON itself will not change, any solution to support custom types roughly falls in 2 categories:
- A different format altogether that produces string/bytes (and there are many, like CBOR, syrup, etc.)
- An encoding that produces structured data that can be serialized to/from JSON (or other formats)
What you are suggesting here seems to fall in the latter category.
In either case, the main question is how much interop you want with other systems. If you want any, what is really needed is to specify the encoding so that other systems can implement it. That is somewhat independent of the API that would be used in JavaScript.
On the other hand, something like structured clone doesn't specify its encoding because it doesn't need to interop with other systems. As mentioned in other comments, it would be a natural target if you want to focus on an API for extending type support, but in user land it comes with the implementation difficulty to wrap all places where the structured clone algorithm is exposed to user code (again if you want to avoid specifying the encoding of your extended type support).
When it comes to specifying a new/extended encoding (and exposing an API for it in JavaScript), the problem is how many use cases it should solve. For example, we have our own encoding supporting more types than JSON, which is currently implemented as encoding to/from a JSON structure, but it needs to support some use cases that aren't simply mapped to your type registry approach (mainly referential identity inside the structure). There is also an effort to specify a format to support these use cases across various systems: OCapN.
What I'm trying to get at is that an API to support custom types either needs to specify and standardize some encoding (or use an already standardized encoding) for interop, or focus on a more self-contained system like structured clone. For the former, unless some encoding comes out as widely adopted, I doubt its place is in the language itself.
I think structured clone is a potential target for your effort, but my constraint there would be to make sure the registry doesn't become global state. I like that the approach of instantiating the registry avoids that, but that somewhat re-introduces the problem of how parties should synchronize. I honestly don't have a good answer on what standardized support for custom types should look like. I suspect some of it might involve module declarations to be able to easily share type implementations across workers.
Edit: I also forgot to mention that embedding bytes in a JSON based encoding is really inefficient. If keeping JSON as a main structure, one approach is to have "side data" that can be referenced from the main structure. That is effectively what we do for reference types in our marshaller, and what I've been wanting to do for bytes as well if we end up keeping a JSON structure as main encoding.
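A rough sketch of the "side data" idea, under stated assumptions: the `$side` marker key and the `encodeWithSideData` / `decodeWithSideData` helpers are hypothetical names invented for this example, not the marshaller mentioned above. Bytes stay out of the JSON text and are referenced by index:

```javascript
// Hypothetical marker key; any name that cannot clash with real data would do.
const SIDE = '$side';

// Replace every Uint8Array with { $side: index } and collect the bytes separately.
const encodeWithSideData = value => {
  const side = [];
  const json = JSON.stringify(value, (_, item) =>
    item instanceof Uint8Array ? { [SIDE]: side.push(item) - 1 } : item
  );
  return { json, side };
};

// Swap each { $side: index } placeholder back for the referenced bytes.
const decodeWithSideData = ({ json, side }) =>
  JSON.parse(json, (_, item) =>
    item !== null && typeof item === 'object' && SIDE in item ? side[item[SIDE]] : item
  );

const bytes = new Uint8Array([1, 2, 3]);
const message = encodeWithSideData({ name: 'blob', bytes });
const revived = decodeWithSideData(message);

console.log(message.json, revived.bytes === bytes);
```

In a real transport the `side` array would travel next to the JSON text (for instance as transferable buffers), so the main structure stays small and the bytes are never re-encoded as text.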
my current code/proposal around JSON produces JSON and works only with JSON. It's the equivalent of toJSON() except it produces a tree that can be deserialized, so that compatibility with any other PL is identical to JSON: they just need to eventually implement that to/from dance, but they can also ignore any of that as long as they know whoever produced the JSON is now producing JSONRegistry-capable JSON.
However, for performance and easier traveling I'd be more than OK with a similar structured clone approach but there's nothing special about it: the registry would return something compatible with structured clone, the only "flag" or byte different is that it's a custom type that can be represented easily as utf-8 encoded buffer (I am talking about the entity name, the map key) as long as the same custom type is present at the received level, otherwise it produces a non deserialized entry which is just fine: it can be ignored or parsed anyway.
The contract is: I register a name, I provide a way to filter references that will be handled as such, I convert that via `to` and I deserialize via `from` if that name is also registered at the receiver side. There is no travel data unless you decide to have travel data like postMessage allows already, recursion is up to structured clone (can't even remember if it supports it but that can also be solved in user-land like I've done with flatted and circular-json before, or flatted-view) so the mapping might be simpler than expected? I am struggling to see what is difficult around this API because it's a Map with unique keys and metadata to catch instances, "encode" these returning structured clone compatible data, "decode" this at the other end or pass just an Object or skip that revival entirely. I'd be OK with any decision in there, even throwing (but that seems unnecessarily hostile).
To tackle the other topic, not "at the global state", luckily enough we have postMessage accepting an options setup/entry (see "Window: postMessage() method" on MDN).
Without bikeshedding too much, this "registry" could be one of those options: it's a Map with unique key (as string) / value pairs where value must describe how to brand-check the entity, how to transform into something that can travel, and on the other hand the addEventListener("message") could also have options with the same (different realm) registry that indicates how to handle the revival of that event.data.
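To make that contract concrete, here is a user-land sketch under loud assumptions: the `$registry` wrapper key, the `encodeWith` / `decodeWith` helpers and the `Point` class are all hypothetical, chosen only for this example, and the walk deliberately ignores cycles. It is not the proposed native behavior, just the is/to/from flow over plain data:

```javascript
// Assumed wrapper key marking an encoded custom type (illustrative only).
const TAG = '$registry';

// Walk the value, wrapping anything a registry entry recognizes via `is`.
const encodeWith = (value, registry) => {
  for (const [name, { is, to }] of registry)
    if (is(value)) return { [TAG]: name, data: encodeWith(to(value), registry) };
  if (Array.isArray(value)) return value.map(item => encodeWith(item, registry));
  if (value !== null && typeof value === 'object')
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, encodeWith(item, registry)])
    );
  return value;
};

// Walk the received data, reviving wrappers whose name is registered here.
const decodeWith = (value, registry) => {
  if (Array.isArray(value)) return value.map(item => decodeWith(item, registry));
  if (value !== null && typeof value === 'object') {
    if (TAG in value) {
      const entry = registry.get(value[TAG]);
      const data = decodeWith(value.data, registry);
      // Unknown name at the receiver: hand back the plain data instead of throwing.
      return entry ? entry.from(data) : data;
    }
    return Object.fromEntries(
      Object.entries(value).map(([key, item]) => [key, decodeWith(item, registry)])
    );
  }
  return value;
};

class Point {
  constructor({ x, y }) { this.x = x; this.y = y; }
}

const pointRegistry = new Map([
  ['Point', { is: v => v instanceof Point, to: ({ x, y }) => ({ x, y }), from: data => new Point(data) }],
]);

// structuredClone stands in for the trip through postMessage.
const wire = structuredClone(encodeWith({ at: new Point({ x: 1, y: 2 }) }, pointRegistry));
const revived = decodeWith(wire, pointRegistry); // receiver knows "Point"
const ignored = decodeWith(wire, new Map());     // receiver does not
console.log(revived.at instanceof Point, ignored.at);
```

The receiver without the entry still gets usable data, which matches the "pass just an Object" option described above.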
The "identity" concern I am not sure where it comes from or what is about, because structured clone "clones" references, it doesn't preserve identity, but whatever it does already, it should/would do the exact same with these traveling entities.
With this idea:
- there is no global registry to worry about
- there's minimal change to API that already accepts `options` as argument
- it's an opt-in approach like `transfer` is already, or anything that has been introduced incrementally
Would this idea work? I can't see a single issue with this approach, except, of course, a new option to discuss and shape in the best possible way, one that is both fast and not too user-hostile.
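For reference, this is what the existing opt-in options bag already looks like with `transfer`, here using `structuredClone` so the snippet runs without a worker. A `registry` key next to `transfer` is only the idea being proposed and does not exist today:

```javascript
const buffer = new ArrayBuffer(8);

// The second argument is the existing options bag; `transfer` is opt-in.
const clone = structuredClone({ buffer }, { transfer: [buffer] });

// The original buffer is detached after the transfer, the clone owns the memory.
console.log(buffer.byteLength, clone.buffer.byteLength);
```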
A concrete example of how this could work ... no strong feeling around any name or logic I am proposing, but hopefully it won't need to be much different than this:
// the proposed API or anything similar
class StructuredCloneRegistry {
  #registry = new Map;
  constructor(registry) {
    for (const [key, value] of Object.entries(registry))
      this.register(key, value);
  }
  register(key, value) {
    if (this.#registry.has(key)) throw new Error(`${key} already exists`);
    if (typeof value?.is !== 'function') throw new Error('is must be a (value:unknown) => boolean');
    if (typeof value?.to !== 'function') throw new Error('to must be a (value:unknown) => cloneable');
    if (typeof value?.from !== 'function') throw new Error('from must be a (value:cloneable) => unknown');
    this.#registry.set(key, { is: value.is, to: value.to, from: value.from });
  }
  // placeholder for an engine-internal accessor, not a real well-known symbol
  [Symbol.internalAlreadyParsedRegistrySafeAndFast]() {
    return this.#registry;
  }
}
// registry.js
// the module that can be imported anywhere it's needed
// reason this is desirable as native class is that otherwise
// it would need to be parsed and guarded every single time
// the same contract travels back and forward ...
export const registry = new StructuredCloneRegistry({
  symbol: {
    // accept globally registered symbols and well-known ones (Symbol.iterator, ...)
    is: value => typeof value === 'symbol' && (
      Symbol.keyFor(value) !== undefined ||
      Reflect.ownKeys(Symbol).some(key => Symbol[key] === value)
    ),
    // '@' + key for registered symbols, '!' + name for well-known ones
    to: value => Symbol.keyFor(value) !== undefined ?
      `@${Symbol.keyFor(value)}` :
      `!${value.description.slice('Symbol.'.length)}`,
    from: value => value[0] === '@' ? Symbol.for(value.slice(1)) : Symbol[value.slice(1)],
  },
  MyThing: {
    is: value => value instanceof MyThing,
    to: value => value.toJSON(),
    from: value => new MyThing(value),
  }
});
// main
import { registry } from './registry.js';
worker.postMessage(
  [
    Symbol.iterator,
    new MyThing({any: 'thing'}),
  ],
  {
    registry
  },
);
// worker
import { registry } from './registry.js';
addEventListener(
  'message',
  event => {
    // optimistic example
    const [iterator, myThing] = event.data;
    console.assert(iterator === Symbol.iterator);
    console.assert(myThing instanceof MyThing);
  },
  {
    registry
  },
);
// bonus ???
structuredClone(
  [
    Symbol.iterator,
    new MyThing({any: 'thing'}),
  ],
  {
    registry
  },
)[1] instanceof MyThing; // true
Edit: if anyone is wondering what the symbol thing is ... well, a registry should always check its entries before throwing or discarding values, and I've been using traveling Symbols for 3 years now because of FFIs driven remotely (worker to main or server and back) ... so there's that too.
My recollection is that cycles are not that easy to handle with a registry approach since you need the ability to create an object first, keep it in a sort of TDZ, and then initialize it. That likely has an impact on the registry API
I think you missed my point. It's totally fine to implement in user land encodings like that leveraging JSON to solve your use cases, but unless there is a wider effort to specify and standardize that encoding, it does not belong in the languages themselves.
I was trying to point out that your chosen encoding and API is not sufficient to solve our use cases for example (support for opaque reference types that maintain identity on round trip). Also we admittedly have a different theory on custom type support in our encoding and implementation.
my whole idea was to create that standard but let's forget JSON and focus on the structuredClone approach, shall we? JSON used toJSON for "centuries" I don't buy an opt-in primitive would be problematic at all but that's OK, structuredClone is fully opaque ... now ...
structuredClone already handles cycles and recursion; the registry would simply map the ref object to the item returned by `to(ref:unknown):clonable`, because that clone won't be known AOT, it's generated per object. All of this works already in flatted or flatted-view or even my structured-clone polyfill, so I am not sure I understand the issue in there?
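For readers following along, the built-in behavior being referred to can be checked directly with plain objects (no custom types and no user code involved; the object names are illustrative):

```javascript
// A two-node cycle: parent -> child -> parent.
const parent = { name: 'parent', children: [] };
const child = { name: 'child', parent };
parent.children.push(child);

const clone = structuredClone(parent);

// The cycle is rebuilt among the clones, and none of the clones is the original.
console.log(clone.children[0].parent === clone, clone !== parent);
```

What this does not show is a cycle that passes through a custom type, which is the case discussed next.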
The API is opt-in, if it requires slightly lower performance I think it's better than any user-land hand-written and non-standard approach so we should be OK with it, I hope.
A quick Googling shows that JSON-LD might be a possible way to designate types inline, but it still seems to have some context stuff that is in a separate linked document. Not sure if that's mandatory for that spec.
In the same space, TJSON supports more types, but doesn't seem to support custom types.
Of course there is also JSON Schema, but the goal of that specification focuses more on detailing the type of every property of the document.
toJSON is not quite in the same realm of solutions. It's the object itself providing a replacement value, without coordination. It also only works one way. Here you're proposing a registry, which obviously must coordinate how types are wrapped / represented, and parsed.
Sure, but currently the structured clone algorithm doesn't need to trigger user code when reconstructing this object graph.
Imagine a graph like:
Person M {
  name: "Me",
  relatives: {
    daughter: Person D {
      name: "Daughter",
      relatives: {
        parent: Person M,
        sibling: Person S
      }
    },
    son: Person S {
      name: "Son",
      relatives: {
        parent: Person M,
        sibling: Person D
      }
    }
  }
}
This can be serialized just fine by having `to` produce a one-level deep structure that still references other custom type instances, which will in turn have `to` called on them.
On the other side though, you have 3 Person custom type instances to recreate, each of which needs to be given references to some of the other instances (in this case they are all the same type, but there could very well be unrelated types forming the cycle). With an API doing a single-shot "create and initialize" for the instance, it's impossible for `from` to be provided a structure/value containing revived references.
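One way to picture the constraint is a two-phase revival, sketched below under assumptions: the flat `wire` table with `{ ref: id }` pointers and the `create` / `init` hooks are hypothetical, invented for this example, and are not part of either design discussed in this thread. Every instance is allocated first, and references are resolved only once all targets exist:

```javascript
class Person {
  constructor(name) { this.name = name; this.relatives = {}; }
}

// Assumed flat wire format: one entry per instance, references are { ref: id }.
const wire = [
  { id: 'M', name: 'Me', relatives: { daughter: { ref: 'D' }, son: { ref: 'S' } } },
  { id: 'D', name: 'Daughter', relatives: { parent: { ref: 'M' }, sibling: { ref: 'S' } } },
  { id: 'S', name: 'Son', relatives: { parent: { ref: 'M' }, sibling: { ref: 'D' } } },
];

// Hypothetical two-phase hooks standing in for a single-shot `from`.
const personType = {
  create: node => new Person(node.name),
  init: (person, node, resolve) => {
    for (const [role, { ref }] of Object.entries(node.relatives))
      person.relatives[role] = resolve(ref);
  },
};

const revive = (nodes, type) => {
  const instances = new Map();
  // Phase 1: allocate every instance so that each reference has a target.
  for (const node of nodes) instances.set(node.id, type.create(node));
  // Phase 2: wire up references now that every instance exists.
  for (const node of nodes)
    type.init(instances.get(node.id), node, id => instances.get(id));
  return instances;
};

const people = revive(wire, personType);
const me = people.get('M');
console.log(me.relatives.daughter.relatives.parent === me);
```

A single `from(data)` call has no equivalent of phase 2, which is why the shape of the registry API matters once cycles cross custom types.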
I want to be crystal clear about my current intent:
- I don't care about my initial JSON proposal ... you can name a dozen alternatives, I wrote dozens of others, that's the whole point I was trying to make. If we don't want JSON to ever improve, "it is what it is", I am OK with that, until the next primitive that would ever land in JS. Accordingly, should I create a new thread so that nobody will talk about JSON anymore? I'd like to move forward with the structuredClone counter-argument, that pays all my bills, thanks!
- I have already mentioned libraries with 500M+ downloads from npm, such as flatted, and while I am sure you have concerns around the implementation, I can tell you I have already solved every concern with user-land code. Would it help if I spend time implementing the structuredClone registry idea I have in mind as a polyfill anyone can already try and test/use? My time is not infinite, but we have needed the ability to transform and revive custom types (including Proxies) for two years now, and we keep writing and maintaining code for something you yourself mentioned prototyping 8 years ago (and every single CBOR, MessagePack, etc. exists for the very same reason with custom types), so I think in 2026 somebody should be able to `postMessage` a custom type not natively supported, something that will remove burden forever from the engine too, because nobody will ever complain again that something is not working as expected ... is this desirable?
I would personally find a postMessage polyfill interesting (as I mentioned I prototyped something exactly like this 8 years ago), however I'm not sure a polyfill would influence much whether this would get standardized.
First you have a venue problem. While there is an effort to move the structured clone algorithm maintenance to TC39, it currently remains specified by WHATWG. In any case, the postMessage API surface extension would likely remain a concern of WHATWG (or at best WinterTC).
Second, the most important for a proposal, with any standard group, is to formulate the problem statement, and show how common a problem this is for the broader community. A couple libraries implementing similar approaches to solve that problem is usually a good signal, as long as those libraries have sufficiently broad adoption. My own experiments are not sufficient motivation as it was more a personal exploration, but there are likely other projects like Comlink that could benefit from a standard custom type support.
I would however like to re-iterate my concern with your design, that IMO it doesn't support cycles when custom types are involved, and I see nothing in the documentation of flatted to indicate otherwise.
Finally, another concern I have (and one reason I suspended my own experiment) is that, depending on the API shape, authors leveraging this feature may expect the ability to maintain instance identity for custom type objects shared by postMessage. However, this is currently impossible to do in userland without leaking memory (a bug Comlink is still affected by, for example), and browser vendors have clearly expressed they are concerned with anything that would require them to implement any kind of multi-agent garbage collection (even though V8 has implemented a new garbage collector for shared structs, the restrictions are such that it is still a staged GC mechanism and not a fully distributed one, i.e. there is no support for shared-to-local edges, including in WeakMap keys).
TLDR: I think it's an interesting problem to solve, but I'm not a good representation of who you need to convince. And I believe that depending on the use cases you want to solve for (cycle support, identity preservation), it is not as simple a problem to solve as you imagine.
could you write a minimal example of what would not work? just to be sure I understand what it is that is problematic or that I am not seeing myself, thanks.
I can do that via coincident but that's a different topic ... nothing cloned maintains identity so I am not sure why anyone would expect that ... an instance in the browser cannot be the same on the server, only orchestration via proxy and atomics can create that (synchronously), which is what coincident solves, but that's another thing via reflected-ffi, apart from the fact that I still need to parse proxies before these travel because I have no way to intercept the serialization process in structuredClone or postMessage, which is what this idea/proposal would like to solve.
Edit: coincident allows us, from a Worker running Python (Pyodide) or MicroPython, to have 1:1 identity with the main thread APIs and DOM in a synchronous way; still, that's the responsibility of an FFI (a remote one), not of postMessage.
My Person with relatives example above is an example. Another one: Foo F { bar: Bar B { foo: Foo F } } where the Foo and Bar types are defined completely independently. To recreate the instance you need the payload with values already deserialized in the call to `from`, which isn't possible if you have a cycle inside those.
Sure but passing the same object twice in a single message results in the same object received, so it's not completely out of line for some users to want the same object passed over multiple messages to show up as the same received object, or as the same original object when sent back. This is a need that shows up more often when dealing with class instances.
That said, this is likely only needed for a subset of use cases, and can probably be built into each type that needs it. The opaque nature of the postMessage interface is also particularly well suited to facilitate the GC on behalf of users instead of requiring users to deal with WeakRef and FinalizationRegistry, which is guaranteed to leak when you introduce cycles over RPC.
Anyway, that's all an orthogonal problem to handling custom types.
I meant a concrete example, so I can show you it's possible, thanks!
is this currently happening with postMessage ??? I never knew about it but, if that's the case, what's that for or what's the use case around this?
because we don't have any way to detect that at the receiving side of affairs, how is that useful to anyone? I am really trying to understand why that's the case for an API called structured clone, where I'd never expect a clone to be the same one I already "cloned" before. Happy to learn more about this quirk: it's nowhere documented as far as I can remember, an interesting thing I wasn't aware of, and full of possible hidden footguns to me too, because same reference doesn't mean same inner field values with it ... right?
Absolutely, especially because if I never knew about this "feature" I doubt many others out there know about it, or ever needed it.
on the other hand, with a registry whose goal is to convert any value to something else, I guess whoever needs that identity preserved can simply trust those kinds of expectations are preserved, while anything else that traveled converted by the API is restored as expected instead; the "unique identifier" in that case can easily be hidden within any field that's part of the restored instance. Still, I am curious which project or library needed that and why, because that behavior feels more like a shenanigan than a feature to me. I've been postMessaging for 5+ years at this point, and that's absolutely unexpected to me!
I've tried to reproduce the surprise I wasn't expecting and indeed it's not happening ...
var o = {};
var wut = structuredClone({o}).o;
structuredClone({o}).o === wut;
// FALSE
so I don't understand what we're talking about when it comes to preserving identities or stuff ... I hope it's not arguments for the sake of arguments because my idea didn't come from somebody working with browsers or engines ... otherwise I don't understand anymore what this space is about, and just today I had a Mozilla counter-bug mentioning my original bug that was closed because of "reasons" that were all valid; today they are trying to fix those reasons, 1 year after I opened that discussion ... please elaborate why pushing back for things not even in the current state is acceptable in here, thank you!
It falls out of the cycle support algorithm.
const foo = {};
const clone = structuredClone({ bar: foo, baz: foo});
clone.bar === clone.baz; // true
clone.bar !== foo; // obviously true
When sending foo over multiple messages, the question of whether it should result in the same cloned object showing up on the receiving end is really use case driven when creating transparent RPC APIs over postMessage. As I admitted, this is an orthogonal problem, and we don't need to dive deeper into it here.
if that's it, I have fixed that already and I can show code that handles that with ease, I still don't see the problem, honestly ... the ref travels as one and gets resurrected as one, same way it traveled. The first time it's encountered as data to resurrect, it maps that source ref to the resulting data. What is so difficult in there?
I mean ...
import { decode, encode } from 'https://esm.run/flatted-view';
const structureClone = value => decode(encode(value));
const foo = {};
const clone = structureClone({ bar: foo, baz: foo});
console.assert(clone.bar === clone.baz);
console.assert(clone.bar !== foo);
where is the issue?