Thanks for sharing the example - I want to understand it correctly rather than dismiss it, so let me explain what I see and where I might be wrong.
The zero-copy I'm describing is specifically this: a JSON string value like "ok" would, in the ideal case, be a JS String object containing only an offset and a length into the existing binary payload, with no bytes copied at all. This is what proposal-immutable-arraybuffer makes theoretically possible, and what the TC39 delegate Mark Miller, in the connected thread, assumed engines are unlikely to implement. That is the zero-copy I claimed is impossible today. If BABLR achieves something different, like offset-based token references within an already-decoded string, I'd like to understand it, because those are different layers of the same problem.
On the code itself: the decode(bytes) call at the top holds a reference to bytes for the entire duration of parsing. Discarding characters from the iterator doesn't free the underlying buffer; it only makes the individual characters unreachable, so they become eligible for GC but are not freed immediately.
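To make the offset-and-length idea concrete, here is a minimal user-land sketch. It is only an illustration under assumptions: `findStringSlice` is a hypothetical helper, it handles only a simple unescaped value, and a real zero-copy string would have to live inside the engine rather than in JS code.

```javascript
// Hypothetical helper: locate the first JSON string value that follows `key`
// and describe it as { offset, length } into the payload, copying no bytes.
// Assumes a simple, unescaped value; this is an illustration, not a parser.
function findStringSlice(bytes, key) {
  const needle = new TextEncoder().encode(`"${key}":"`);
  outer: for (let i = 0; i + needle.length <= bytes.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (bytes[i + j] !== needle[j]) continue outer;
    }
    const offset = i + needle.length;
    const end = bytes.indexOf(0x22, offset); // 0x22 is the closing quote
    return end === -1 ? null : { offset, length: end - offset };
  }
  return null;
}

const payload = new TextEncoder().encode('{"status":"ok","id":7}');
const slice = findStringSlice(payload, "status");

// A view over the same memory: no bytes are copied here.
const view = payload.subarray(slice.offset, slice.offset + slice.length);

// Materializing a JS string is the step where the bytes get copied today.
const value = new TextDecoder().decode(view);
```

The `view` shares its ArrayBuffer with `payload`; only the final `decode` call produces a separate string, which is the copy the ideal case would avoid.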
The spacetime iteration model makes sense when data arrives over time from the network: it minimizes idle time and puts the "chef" to work incrementally. But let's take into account Node.js's single-threaded nature, the event loop, and HTTP clients that may be waiting. Say we have a "MAIN" client that sends 30 identical chunks of payload to a Node.js server, which spends X time parsing each chunk.
When the MAIN client is the only one on the wire, or when other HTTP requests don't pile up into a large queue, the spacetime model makes better use of the CPU, which leads to a faster response for that request.
But there is another case: 30 chunks, each separated by a request from another HTTP client. Let's assume an "OTHER client" takes 0 time to process. Visually:
MAIN chunk 1 -> OTHER client -> MAIN chunk 2 -> OTHER client ...
In the end we have: -> MAIN chunk 30 -> LAST client
There are two ways to handle this: streaming (spacetime) or "accumulate and parse in one go".
With the spacetime approach, after chunk 1 all subsequent clients are postponed by X because of parsing that chunk, plus some time for switching in the event loop. After the second chunk, all subsequent clients are postponed by 2X, and so on. The MAIN client waits 30X (plus the other clients' time) for its final response, and LAST waits 30X as well.
Because of streaming, the parsing state is kept in memory (I'm referring to __streamParseJSON(bytes)); in the best case this leads to a maximum of 31 chunks in memory (29 parsed chunks, 1 chunk that just arrived, and 1 chunk parsed afterwards).
The worst case is one giant key-value pair that accumulates in the parsing state the whole time. It is either parsed incrementally, and then has to be reallocated as soon as another chunk arrives, or it accumulates until it is parsed "in one go", which is no different from the second approach. So we potentially save memory, but we may still reallocate many times OR gain nothing over parsing "in one go".
"In one go" means the following: on the first chunk, the MAIN client allocates virtual memory (Buffer.allocUnsafeSlow) for all 30 chunks, and that memory is only committed as it is touched (hence "virtual"; I have verified this in my ArrayBuffer.prototype.detach benchmarks). Each subsequent chunk is copied into the buffer at its own offset (just keep a counter for the duration of the request), which takes nanoseconds and is negligible. The "OTHER" clients are barely postponed at all.
When the MAIN client delivers its final chunk, parsing happens (memory peaks at roughly 2 × 30 chunks, so about 60 chunks at most), and then the 30-chunk buffer is released immediately by ArrayBuffer.prototype.detach. The CPU time spent before the "LAST" client is served does not change at all, and all earlier requests are served as they would have been anyway.
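The accumulate-then-parse flow described above can be sketched roughly as follows. This is a sketch under assumptions: `createAccumulator` is a hypothetical helper, the total size is assumed to be known up front (for example from Content-Length), and because JSON.parseBinary and detach are only the proposals under discussion, plain JSON.parse on a decoded string stands in for the final parse and no detach call is made.

```javascript
// Hypothetical helper: collect chunks into one preallocated buffer.
function createAccumulator(totalLength) {
  // allocUnsafeSlow returns a non-pooled, uninitialized buffer; every byte is
  // overwritten by the copies below before it is ever read.
  const buffer = Buffer.allocUnsafeSlow(totalLength);
  let offset = 0; // running write position, kept for the whole request

  return {
    push(chunk) {
      if (offset + chunk.length > totalLength) {
        throw new RangeError("chunk exceeds the announced total length");
      }
      buffer.set(chunk, offset); // one copy per chunk, no parsing yet
      offset += chunk.length;
    },
    finish() {
      if (offset !== totalLength) throw new RangeError("payload is incomplete");
      // Stand-in for the proposed JSON.parseBinary(buffer): today this still
      // materializes an intermediate string before parsing.
      return JSON.parse(buffer.toString("utf8"));
    },
  };
}

const chunks = ['{"items":[1,', '2,3],"sta', 'tus":"ok"}'].map((s) => Buffer.from(s));
const total = chunks.reduce((sum, c) => sum + c.length, 0);

const acc = createAccumulator(total);
for (const chunk of chunks) acc.push(chunk); // cheap work per event loop turn
const parsed = acc.finish(); // the single blocking parse
```

In a real server each `push` would run in its own data event, so the only long synchronous step is the `finish` call on the last chunk.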
The "one go" option is easier to implement, hurts fewer clients, and costs almost no extra CPU time compared with streaming. Streaming clearly wins on an "idle" server, but that is rarely the case.
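The queueing argument can be written out as a small model. It is a simplification under the same assumptions as the scenario above: parsing one chunk costs X, OTHER clients cost 0, copying a chunk is treated as free, and event loop switching overhead is ignored. The function name and the strategy labels are made up for illustration.

```javascript
// Delay seen by the client that arrives right after MAIN chunk k (k = 1..n),
// in units of X (the time to parse one chunk). The client after chunk n is LAST.
function delaysAfterEachChunk(chunkCount, strategy) {
  const delays = [];
  let busy = 0; // parsing time spent so far
  for (let k = 1; k <= chunkCount; k++) {
    if (strategy === "streaming") {
      busy += 1; // parse every chunk on arrival
    } else if (k === chunkCount) {
      busy += chunkCount; // "one go": parse everything on the last chunk
    }
    delays.push(busy);
  }
  return delays;
}

const streaming = delaysAfterEachChunk(30, "streaming");
const oneGo = delaysAfterEachChunk(30, "one-go");
const sum = (xs) => xs.reduce((a, b) => a + b, 0);

console.log(streaming[29], oneGo[29]); // LAST waits 30X either way
console.log(sum(streaming), sum(oneGo)); // total delay across clients: 465X vs 30X
```

Under this model the LAST client is no worse off with "one go", while the clients in between accumulate 1X, 2X, ... 29X of delay only in the streaming case.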
|  | Streaming (spacetime) | JSON.parseBinary + detach |
| --- | --- | --- |
| Total CPU time | 30X (identical) + keeping state | 30X (identical) + copying |
| Event loop hits | 30, many small | 1 blockage |
| Busy server | 30 interruptions, all clients accumulate scheduling overhead | 1 interruption, fewer affected |
| Idle server | ✓ better, no idle waiting | neutral |
| Memory: chunks | 1-31× chunk (sliding window) | 30× chunk (full buffer) |
| Memory: parse spike | variable: 1× best case, 60× worst case (spanning str) | ~60× chunk at parse moment (buffer 30× + extracted vals 30×) |
| Memory: after parse | parser state until stream end | ~30× chunk (object only) |
| Memory: after detach | N/A, no detach | ~0× + object (buffer released) |
| Worst case memory | identical to "one go" with added streaming overhead | 60× spike, immediately drops to object size on detach() |
| Parser state | kept alive across all ticks | none, stateless call |
| API change | async result for chunked input | synchronous, drop-in replacement |
| Implementation | state machine + StreamGenerator | copy chunks + one parse call |
| Caller code change | required (Promise handling) | none |
Here we are choosing between a "little benefit" and triviality. If you meant streaming where pairs are processed as soon as they are parsed, that adds overhead from strings, and JSON key order is not guaranteed when values depend on each other; I have already mentioned that.
The more interesting observation is about what happens on the invalid-payload path. V8's JSON.parse uses property_stack_ (source), a C++ buffer that accumulates JsonProperty handles as the parser scans. BuildJsonObject is only called when a closing } is found, at which point V8 allocates the JS object with exactly the right amount of space for its named properties. If parsing fails before } is reached, those handles are simply discarded; no JS object has been allocated yet for that incomplete container. This means that on the failure path, V8's existing parser already avoids allocating the JS objects. What it cannot avoid, and what JSON.parseBinary targets, is the full intermediate string that was decoded before parsing even began, and the SyntaxError with its stack trace, which is constructed regardless.
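For reference, the failure path as it looks from user code today might be sketched like this. `parseBytes` is a hypothetical wrapper, not part of any API; it only shows where the intermediate string and the SyntaxError appear, and says nothing about V8's internal allocations.

```javascript
// Hypothetical wrapper around today's decode-then-parse path.
function parseBytes(bytes) {
  // The whole payload is decoded into a JS string before parsing starts,
  // even if the payload later turns out to be invalid.
  const text = new TextDecoder().decode(bytes);
  try {
    return { ok: true, value: JSON.parse(text), decodedLength: text.length };
  } catch (error) {
    // JSON.parse reports failure by throwing a SyntaxError, which carries
    // a captured stack trace.
    return { ok: false, error, decodedLength: text.length };
  }
}

// A truncated payload: the inner object is never closed.
const truncated = new TextEncoder().encode('{"a":1,"b":{"c":2');
const result = parseBytes(truncated);
```

Here `result.ok` is false, yet the full 17-character string was still decoded and an error object with a stack was still created, which are the two costs named above.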
A character-by-character streaming approach does reduce the memory ceiling on the valid path by allowing early chunks to be released. In fact, "for await ... of" is the standard way to do this; that is totally understandable, as is the cucumber example. But it still allocates per-character strings while accumulating string values unless the engine provides special support. JSON.parseBinary is a language-level synchronous primitive that eliminates the intermediate string for the already-buffered case, which is the dominant pattern in HTTP servers. Neither my approach nor yours, Conrad, makes the other unnecessary.
Feel free to correct me if I forgot to mention something or misunderstood you.