I'll try to better explain what I am doing and why all these caveats matter to me.
Let's take the most basic code example, but please pay attention to the comments:
```js
// this must be an array
const cpus = require('os').cpus();
// it needs to survive Array.isArray(cpus)
// it must be iterable too
// each CPU must be an object where each property
// might be an object or a primitive
for (const cpu of cpus) console.log({...cpu});
```
There is no surprise in this tiny snippet, which anyone can copy into a Node REPL and see results from, but what you might not expect is that this code runs in a Worker that communicates via Web Socket with either NodeJS or Bun: none of those references exist in the Worker, each is simply a Proxy of the reference created in the NodeJS / Bun interpreter, and it is Garbage Collected once it's no longer needed/used/reached within the Worker.
`cpus` looks like an array and acts like an array, but in the Worker it is actually just a thin Proxy around a unique identifier: `["array", 123]`.
Whenever any operation happens in the Worker code, Atomics asks the main thread to ask, via sockets, the foreign interpreter to execute that Proxy trap operation on the real reference, which is held until GC'd; the same goes for anything this remote array returns per each interaction, but in this case we have objects: `{type: "object", value: {....}}`.
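The whole dance can be sketched, much simplified and fully in-process (no Worker, no sockets, no Atomics), with a registry on the "foreign" side and a wrapping Proxy on the Worker side. All names here (`register`, `remoteGet`, `wrap`) are made up for illustration and are not coincident's actual API:

```js
// foreign side: real references, addressable by id
const heap = new Map();
let uid = 0;
const register = value => {
  const id = uid++;
  heap.set(id, value);
  return [Array.isArray(value) ? 'array' : typeof value, id];
};

// pretend this call crosses a Web Socket: execute the trap
// on the real reference held by the foreign interpreter
const remoteGet = (id, key) => {
  const value = Reflect.get(heap.get(id), key);
  // primitives travel as-is; objects travel as another [type, id] pair
  // (functions and other cases are ignored in this sketch)
  return typeof value === 'object' && value !== null ? register(value) : value;
};

// worker side: a thin Proxy around the [type, id] pair
const wrap = ([type, id]) => new Proxy(type === 'array' ? [] : {}, {
  get: (_, key) => {
    const result = remoteGet(id, key);
    return Array.isArray(result) ? wrap(result) : result;
  }
});

// the Worker only ever holds identifiers, never the real data
const cpus = wrap(register([{model: 'fake', speed: 1000}]));
console.log(cpus[0].model); // "fake", after two simulated roundtrips
```

The real implementation forwards every trap, not just `get`, and blocks the Worker with `Atomics.wait` until the serialized result comes back over the socket.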
```
┏━━━━━━━━━━━━━━┓
┃ ◂━ Worker ◂━╋┓
┗┳━━━━━━━━━━━━━┛┣ parse
┣ Proxy trap ┃
┣ Atomics wait ┣ notify
┏┻━━━━━━━━━━━━━┳┛
┃ Main ┣◂┓
┗┳━━━━━━━━━━━━━┛ ┃
┣ Web Socket ┃
┏┻━━━━━━━━━━━━━┓ ┃
┃ NodeJS / Bun ┃ ┃
┗┳━━━━━━━━━━━━━┛ ┃
┣ Apply trap ┃
┣ stringify ┃
┗ WS Result ━━━━┛
```
Please note I've used node and bun as examples because that's easier to understand from a JS developer's point of view, but you can replace Web Sockets and the server with Pyodide or MicroPython interpreters, with their own FFI, and the dance is basically still the same ... or that's just how fully driving the real DOM from a Worker happens: dropping the 3rd indirection, the dance is still the same.
Also note this is not hypothetical: this is how polyscript works, and polyscript fuels PyScript.
Constraints
- all objects must behave exactly like objects
- all arrays must behave exactly like arrays
- all methods and functions and classes must behave the same too
- all primitives must be able to travel, and so must buffers
The last point means that, in an ideal world where engines would be so kind as to expose their structuredClone serializer/deserializer utilities, I could use just those primitives and be done; but as reality kicks in, I need to use the `@ungap/structured-clone/json` `parse` and `stringify` utilities to survive types not compatible with JSON. On top of that, I want objects to fully reflect their source nature, meaning that if `{notYet: undefined}` is referenced, in the Worker the `notYet` key must be present and the `undefined` value carried along.
Add `bigint` and well-known symbol survival (`Symbol.iterator`, to name one) and you see that this architecture is screaming for a common way to define type / value pairs that survive all sorts of indirections and the common issues with JSON or structuredClone.
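The kind of tagged encoding this architecture needs can be sketched as follows; `encode` and `decode` are hypothetical names, and this sketch only covers the primitive cases discussed above (objects and arrays would travel as references, as described earlier):

```js
// types JSON cannot carry travel as tagged {t, v} pairs instead
const encode = value => {
  switch (typeof value) {
    case 'undefined':
      // JSON would drop the key entirely; the tag preserves it
      return {t: 'undefined', v: null};
    case 'bigint':
      // JSON.stringify throws on bigint; strings travel fine
      return {t: 'bigint', v: value.toString()};
    case 'symbol': {
      // only well-known symbols can be re-created on the other side
      const known = Object.getOwnPropertyNames(Symbol)
        .find(name => Symbol[name] === value);
      return {t: 'symbol', v: known};
    }
    default:
      return {t: typeof value, v: value};
  }
};

const decode = ({t, v}) => {
  switch (t) {
    case 'undefined': return void 0;
    case 'bigint': return BigInt(v);
    case 'symbol': return Symbol[v];
    default: return v;
  }
};

// undefined survives a full JSON roundtrip, unlike with plain JSON
decode(JSON.parse(JSON.stringify(encode(void 0)))); // back to undefined
```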
Previously ...
The initial implementation of coincident (which makes polyscript possible, hence PyScript too) used the `[type, value]` convention (for non-function cases) to describe the desired type and behave accordingly with traps and/or the deserialization of arguments or returned values.
This revealed the `Array.isArray` issue first, which I monkey-patched within the Worker (hence not nearly as bad as a global main polyfill, yet still ugly). But then we had an example where a main-thread library, one we don't control and couldn't fix for workers, was failing hard with `["object", {...}]` references: traps such as `ownKeys` and the one around descriptors wanted a non-configurable `length` property, so that any object out there with an actual `length` property, for whatever reason, would've failed the roundtrip dance at the Proxy level.
The TL;DR is that patching `Array.isArray` in each Worker wasn't good enough, so we had to disambiguate between the Array, the Object, and the Function case, which are the main 3 types the Proxy handles and drills arbitrarily, with extra caveats for the `apply` VS `construct` dance; but those errors can be lazy too, so it's less of an issue.
Current state
I wrote yet another "shouldn't be necessary, but here we are" library called proxy-target, which resolves everything I've encountered to date by providing utilities that create `[value]` when the foreign reference is an Array, `{t, v}` pairs to describe every other type, including `null` and `undefined`, and `Ctx.bind(value)` where `Ctx` is a function like this:
```js
function Ctx() {
  'use strict';
  return this;
}
```
We can now bind remote function references to integers or strings (the `'use strict'` above is not accidental) and intercept all function traps without ever breaking for objects or arrays; we removed the need for an `Array.isArray` patch, objects now act 100% as objects, and no extra check or operation is ever needed on our side.
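A minimal sketch of why `'use strict'` matters here: in strict mode `this` is exactly the bound value, even a primitive, so the remote identifier can be recovered from inside any function trap without boxing. The identifier `123` is, of course, just a stand-in:

```js
function Ctx() {
  'use strict';
  // in sloppy mode a primitive `this` would be boxed into an object;
  // strict mode hands back the bound value untouched
  return this;
}

const remoteFn = Ctx.bind(123); // 123 plays the role of a remote id
remoteFn(); // 123, still a plain number

// a Proxy around the bound function answers every function trap,
// and the trap can recover the remote id by invoking the target
const proxied = new Proxy(remoteFn, {
  apply: target => `calling remote function #${target()}`
});
typeof proxied; // "function"
proxied();      // "calling remote function #123"
```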
In summary
I am confident almost nobody will read any of this lengthy explanation about why I've been filing issues, and why all these things are just unnecessary friction that, most of the time, it is undesirable to be forced to solve in user-land ... but while I am sure nothing will ever change in the Proxy space, at least anyone interested in the story around why proxy-target, coincident and polyscript needed a better specification, based solely on trap intents and not on target drilling surprises, can have an answer.
Regards
P.S. if anyone is interested in "what could you do with such a complex stack?", this video should easily answer that: https://twitter.com/WebReflection/status/1678762388538155013 it's a Worker using synchronous NodeJS APIs to orchestrate, via the DOM, client / Raspberry Pi Zero 2 W results.