(Champion needed) Improve handling of network data (2)

To continue the discussion from (Champion needed) Improve handling of network data:

Proposals: JSON.parseBinary and ArrayBuffer.prototype.detach

I should mention that I posted them in the Proposals category, while Ideas was the right one.
So you can look at these proposals, and probably the initial discussion as well, and ask any questions. Currently I am looking for TC39 delegates who can champion these ideas. If you are one of them, you can reach out to me here, on GitHub, or by email.

The discussion is stuck, or rather still empty. Hello everyone :waving_hand:. My proposals are still seeking champions.

Not my area of expertise but how does this compare to using the web Response API?

That returns a promise to support streaming, whereas parseBinary is synchronous I assume. But apart from that would it be similar to doing:

new Response(arrayBuffer).json();

Thanks for asking, no matter the expertise.
There is no public documentation of how this is implemented (or I can’t find any), and there is no evidence that a ready-made binary parser exists. However, the docs can be read as follows:

In Response: json() method - Web APIs | MDN it is mentioned that the “json” method waits until the stream ends and parses the body text into JS structures. “Body text” most likely means a JS string; otherwise it would say “accumulated binary”. So the asynchronous part here is just the accumulation of the payload. The parsing itself is synchronous.

This API is implemented like a polyfill in node-fetch, so underneath it uses JSON.parse + Buffer.toString and does not parse the binary directly: node-fetch/src/body.js at 8b3320d2a7c07bce4afc6b2bf6c3bbddda85b01f · node-fetch/node-fetch · GitHub
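
For reference, the decode-then-parse path described above can be sketched like this. It only illustrates the status quo with standard globals (TextEncoder, TextDecoder, JSON.parse); `parseViaString` is a hypothetical helper name, and the proposed JSON.parseBinary is not used because it does not exist yet.

```javascript
// Hypothetical helper illustrating today's two-step path:
// bytes -> intermediate JS string -> parsed value.
function parseViaString(bytes) {
  const text = new TextDecoder('utf-8').decode(bytes); // intermediate string
  return JSON.parse(text);                             // actual parsing
}

// Simulate a payload received from the network as raw bytes.
const payload = new TextEncoder().encode('{"msg":"ok","n":[1,2,3]}');
const parsed = parseViaString(payload);
console.log(parsed.msg, parsed.n.length);
```

The intermediate `text` string is the allocation the proposal aims to skip.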

While the polyfill may use JSON.parse and utf-8 decoding internally, that doesn't mean an actual implementation in browser or WinterTC environments wouldn't be optimized (and they likely are).

I could see other environments needing to optimize JSON parsing from binary data that isn't fetched, but the use cases that care would possibly be able to natively implement an optimization as well.

Do you have any examples you could point at where such an API would be needed?

Sure! All examples are located here:

And even IF the browser already has an internal binary parser, that is a big plus for JSON.parseBinary, because there is no need to implement it from scratch. But I can’t find one anywhere. And if it did exist, why would the “import json files” proposal mention decoding text and using JSON.parse? Wouldn’t a binary parser be the perfect fit here? In the spec, look for 1.2.5 (bottom of the page) JSON modules

Those examples contain both JSON.parseBinary and ArrayBuffer.prototype.detach.
However there are benchmark tests in these folders

Their running results are described in READMEs as well as in GitHub Actions (peculiar commits, haha)

JSON.parseBinary

ArrayBuffer.prototype.detach

You are right that JSON parsing is optimised, but the same has not been said about TextDecoder or the failure path. Even if they are optimised, my benchmarks show that those optimisations are moot when the work can be omitted completely (SyntaxError construction, GC of intermediate strings).

Simple comparison: when someone asks you to raise a hand, you don’t jump before raising it, because why would you?
TextDecoder discussion about its low performance

Error stack trace overhead. In the case of JSON.parseBinary - omitted

In addition to everything above, this idea has received support from Fastify itself

Given all this, I am looking for a delegate willing to (co-)champion these ideas, or at least present them at a future meeting of the technical committee. Would anyone like to make this happen?

You lay out the problems fairly well, but there's a big one left that your design doesn't solve: you still are stuck needing (more or less) 2x the memory to get a parse result.

The moment when you have the full 2x memory use is just before the end of the parse: one complete copy of the JSON structure is in the buffer, and one complete copy is in the object you're about to return as a result.

In addition to the memory use you have a lot of dead time where you're receiving data into a buffer but you can't do any other work until you've received it all. You've got work that needs to be done and resources sitting idle, but in your design you're unable to use the resources to do the work.

I strongly, strongly recommend that you implement your proposed JSON parser using stream iterators as the input type, which solve ALL the problems you listed and ALL the problems I mentioned: the 2x memory usage and wasted wall clock time

And I have already answered before that streaming brings more drawbacks. Furthermore, you can use websockets to send several chunks that have to be processed one at a time.

Consider this case: a report that is 20 MB large is sent to a server. Indeed, this is not a case where it has to be put into memory as one buffer. But when this report is streamed, it is difficult to do anything before it fully arrives. For example, to determine whether this report is indeed a report, we need to verify that it has all the keys and values, that is, validate the JSON. There is no guarantee that what we get from the wire is valid, and it may contain syntax errors, whose cost I try to avoid by not throwing SyntaxError. Also, there are cases where several key-value pairs are interdependent, and JSON gives no guarantees about the order of pairs.
The solution is simple: create a websocket-like connection that sends those dependent pairs as separate JSON payloads. They get validated and processed one by one, the server sends “MORE” to the client, and this procedure continues until everyone is satisfied. By the way, JSON.parseBinary and ArrayBuffer.prototype.detach still fit here, because the data comes in separate complete payloads, each of which can easily be parsed with “JSON.parseBinary”, after which the initial buffer can be detached.

The case above describes a scenario with multiple key-value pairs. You might have thought about the “one huuuuge key-value” scenario, to which this can’t be applied. Streaming should return “key+value” pairs together, but here this would mean parsing the whole payload. No benefit over JSON.parseBinary. And note that the intermediate string does not get created in the proposed variant, so the performance gains are literally everywhere.

That answer from before is still valid. JS has to keep the parsing state somewhere. I will paste it here.

chunk 1: { “some string”: “its contents, that are not full
chunk 2: ; this is still that text :grinning_face:, not full
chunk 3: ; finally end” }
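
The same problem shows up one layer below the parser: a multi-byte UTF-8 character can be split across chunks, so even the decoder has to carry state between them. A minimal sketch using the standard TextDecoder `stream` option (the payload and the chunk boundary are made up for illustration):

```javascript
// '😀' is 4 bytes in UTF-8; split the payload in the middle of it.
const bytes = new TextEncoder().encode('{"s":"a😀b"}');
const cut = bytes.indexOf(0xf0) + 2; // 0xF0 is the first byte of the emoji
const chunk1 = bytes.subarray(0, cut);
const chunk2 = bytes.subarray(cut);

// Decoding each chunk independently corrupts the split character.
const naive = new TextDecoder().decode(chunk1) + new TextDecoder().decode(chunk2);

// A single decoder with { stream: true } buffers the incomplete bytes.
const decoder = new TextDecoder();
const streamed = decoder.decode(chunk1, { stream: true }) + decoder.decode(chunk2);

console.log(naive === '{"s":"a😀b"}');    // false
console.log(streamed === '{"s":"a😀b"}'); // true
```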

One more nuance. Each callback involves creating another closure and makes the CPU perform extra jumps into that callback. It is just like branching in C++, which (as every forum will tell you) is best avoided. And passing callbacks here would still require “if-else” branching on those keys. So callbacks/iterators don’t optimise this.

A conversation like this about streaming has already been held

According to the results, there was quite a lot of work to do before reaching the “championed” state. So much that it never got there.
Deno was mentioned there because it already provides some streaming for json - @std/json - JSR
1. It mandates TextDecoderStream as the first step. Every example in the docs starts with .pipeThrough(new TextDecoderStream()). This is an intermediate string. @std/json does not parse binary - it parses strings that have already been decoded. That is the first problem I try to solve.
2. It is designed for NDJSON/JSON Lines, not single-document JSON. JsonParseStream treats each chunk as a complete, independent JSON document. ConcatenatedJsonParseStream handles back-to-back documents, not a single document split across chunks. Not an option for this debate.

3. It still throws SyntaxError. Under the hood each chunk is passed to JSON.parse(). Invalid input throws, and every SyntaxError costs the same 500× overhead that my errors.mjs benchmark shows. Nothing changes for the failure path.

4. Infrastructure overhead on top of the existing problem. Each TransformStream has its own internal queue, backpressure machinery, and microtask scheduling overhead. For a request body that is already fully buffered in memory - which is the normal HTTP case - piping through three transform streams is strictly slower and more memory-intensive than a single synchronous call.
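
To make point 3 concrete, here is a small sketch of the failure path as it works today: JSON.parse signals invalid input by constructing and throwing a SyntaxError, stack trace included. `tryParse` is a hypothetical wrapper, not part of either proposal, and the sketch measures nothing.

```javascript
// Hypothetical wrapper around the status-quo failure path.
function tryParse(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // By the time we get here, the engine has already built the error
    // object (and its stack), even though we only need a yes/no answer.
    return { ok: false, error: err };
  }
}

const good = tryParse('{"a":1}');
const bad = tryParse('{"a":'); // truncated payload

console.log(good.ok, good.value.a);
console.log(bad.ok, bad.error instanceof SyntaxError, typeof bad.error.stack);
```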

Conclusion - existing streaming engines still don’t solve the problem, were cumbersome to implement and are cumbersome to propose.
“I have JSON from network, I want to parse it without overhead” - this is what I try to solve. Streaming topic is more about architecture choices (websockets) and not actual parsing of payload (but my ideas complement streaming well).

Lastly, the big problem that my design does not solve: we are stuck with 2x memory to parse the input.
Perfect case: 1x memory, a zero-copy way of doing things. Is this possible in JS - no. If we receive an ArrayBuffer, we can't make a "string view" out of it. As was mentioned here, (Champion needed) Improve handling of network data - #15 by markm, this could be possible with immutable ArrayBuffers, but it remains a dubious decision.
Even if I did create a streaming parser, it would still have to copy. The only change is that input parts are garbage collected a bit sooner (no guarantees though). JSON.parseBinary is a move away from the status quo, and a more performant one than the alternatives. Pursuing the zero-copy path is not possible today, which is why my design solves everything it should.

it is difficult to do anything before it fully arrives

Well, as I mentioned, decoding and parsing are work that can be done incrementally.

Streaming should return “key+value” pairs together, but here this would mean parsing the whole payload.

If you are saying that you now understand the problems with idle resources and 2x memory use, then I think it should be obvious to you that it will be necessary to allow the chunks of data to be fed to the parser individually, right?

let result = binaryJSONParser.feedChunk();
if (isNeedMoreData(result)) {
  // do more parsing
} else {
  return result;
}

Is this possible in JS - no.

I'm actually doing it. Not just theory, I'm actually doing the thing you say is impossible.

You seem to be accusing me directly of being a liar. Are you so completely sure that it's impossible that you're going to spit at someone who has literally done the thing you claim cannot POSSIBLY be done?

I have no intention of accusing anyone. If you have such a technical solution, then please provide more details on it. If you describe it more clearly, I may change my opinion.
For example, give a code example, and describe the lifetime of these strings and their immutability.

I'm not super eager to go to work for you given that you've been so dismissive.

Here's half of what you want though: enough that I think a clever person such as yourself would be more than capable of seeing how the impossible problem is solved and could take it the rest of the way from here

import { StreamGenerator } from '@bablr/stream-iterator';
import { decodeUTF8, readFile } from '@bablr/fs';

let maybeWait = (maybePromise, callback) => {
  let isPromise = maybePromise instanceof Promise
  return isPromise ? maybePromise.then(callback) : callback(maybePromise);
}

function* __streamParseJSON(bytes) {
  // turn stream of bytes into stream of utf8 code points
  let iter = decodeUTF8(bytes)[Symbol.for('@@streamIterator')]();
  let step;
  let results = [];
  let topType = null; // kind of the value currently being built

  for (;;) {
    // you can see that we don't hold any past data
    // each character evolves the parser state and then is discarded
    step = iter.next();
    // this line allows us to wait for new chunks when needed
    if (step instanceof Promise) step = yield wait(step);
    if (step.done) break;

    let chr = step.value;

    switch (chr) {
      case '{':
        results.push({});
        topType = 'object';
        break;
      case '[':
        results.push([]);
        topType = 'array';
        break;
      case ']':
        results.pop();
        break;
      case '"':
      case "'":
        results.push('');
        topType = 'string';
        break;
      case 'n':
        topType = 'null';
        break;
      case '+':
      case '-':
      case 'I':
        topType = 'number';

        break;
      default:
        // handle chr: add it to the top array or object
    }
  }

  return results[0];
}

const streamParseJSON = (input) => {
  let iter = new StreamGenerator(__streamParseJSON(input));

  return maybeWait(iter.next(), (step) => step.value);
}

// when the input is one chunk, you can get the result right away
let result = streamParseJSON('["ok"]'); // ['ok']

// when the input is in many chunks, the result is a promise
streamParseJSON(readFile('fixture')); // Promise

This solution is built on stream iterators, which I also call "spacetime iterators".

A sync iterator is a space iterator. It lets you iterate through a sequence of things so long as you have them all in front of you right now so that you could conduct the iteration by pointing to each item in turn.

Space iteration is fast, but it breaks down when there's a time element -- that is to say when there exists no one time at which every item in the sequence is in front of you.

Next up we have async iteration, which is time iteration. Each thing that comes from a time iterator is a different moment in time. Where space iteration was like pointing to different items in front of you, time iteration is sort of like your pointing arm is fixed in position, but different items can be moved in front of it so that at different moments in time you will be pointing to them.

But time iteration is slow compared to space iteration. Most of the time the pointing finger isn't pointing at anything but air as it's just waiting for the next object to show up in front of it.

Spacetime iteration is the ideal: the best of both worlds.

Think of it like you're trying to chop up cucumbers. You have a chef and an assistant. The chef has space in which to work, a cutting board large enough for one cucumber at a time. The chef can make 10 slices to one cucumber in one second, and the assistant can provide a new vegetable once a second.

With space-only iteration the amount of work the chef can do is constrained by the size of the cutting board.

With time-only iteration the chef can only make one cut before the assistant must take the 1 second penalty to get a new cucumber.

With spacetime iteration, the chef cuts one cucumber up in a second, then spends a second waiting for the next one, cuts that up in a second... With the marriage of time and space comes efficiency in stream processing : )
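
As a rough illustration of the idea (not the @bablr/stream-iterator implementation, whose API is not reproduced here), a stream iterator can be modeled as an object whose next() returns a step when data is already buffered and a Promise only at chunk boundaries. `fromChunks` and `sumAll` are made-up names for this sketch.

```javascript
// Hypothetical source: items inside a chunk are available synchronously,
// while crossing into the next chunk requires waiting (modeled by a Promise).
function fromChunks(chunks) {
  let c = 0; // current chunk
  let i = 0; // position inside the current chunk
  return {
    next() {
      if (i < chunks[c]?.length) return { done: false, value: chunks[c][i++] };
      c++;
      i = 0;
      if (c >= chunks.length) return { done: true, value: undefined };
      return Promise.resolve().then(() => this.next());
    },
  };
}

// Hypothetical consumer: stays synchronous for as long as the data allows
// and only switches to a Promise when it actually has to wait.
function sumAll(iter, total = 0) {
  for (;;) {
    const step = iter.next();
    if (step instanceof Promise) {
      return step.then((s) => (s.done ? total : sumAll(iter, total + s.value)));
    }
    if (step.done) return total;
    total += step.value;
  }
}

const oneChunk = sumAll(fromChunks([[1, 2, 3]]));     // a plain number
const manyChunks = sumAll(fromChunks([[1, 2], [3]])); // a Promise
console.log(oneChunk);
manyChunks.then((total) => console.log(total));
```

The chef only pays the waiting penalty once per cucumber, not once per slice.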

Thanks for sharing the example - I want to understand it correctly rather than dismiss it, so let me explain what I see and where I might be wrong.

The zero-copy I'm describing is specifically this: a JSON string value like "ok" would, in the ideal case, be a JS String object containing only an offset and length into the existing binary payload - no bytes copied at all. This is what proposal-immutable-arraybuffer makes theoretically possible, and what the TC39 delegate Mark Miller, in the connected thread, considered unlikely to be implemented by engines. That is the zero-copy I claimed was impossible today. If BABLR achieves something different, like offset-based token references within an already-decoded string, I'd like to understand it, because those are different layers of the same problem.

On the code itself: the decodeUTF8(bytes) call at the top holds a reference to bytes for the entire duration of parsing. Discarding characters from the iterator doesn't free the underlying buffer. It only makes individual characters unreachable, so they are eligible for GC but not immediately freed.

The spacetime iteration model makes sense when data arrives over time from the network, minimizing idle time and putting the "chef" to work incrementally. But let's take into account Node.js's single-threaded nature, the event loop, and potentially waiting HTTP clients. Let's say that we have a "MAIN" client that sends 30 identical chunks of payload to a Node.js server, which spends X time parsing each chunk.
When the MAIN client is the only one on the wire, or when other HTTP requests don't accumulate into a large queue, the spacetime model utilizes the CPU more, leading to a faster response to the request.
But there is a case where we have 30 chunks, each separated by another HTTP client. Let's assume that an "OTHER" client takes 0 time to be processed. Visually:

MAIN chunk 1 -> OTHER client -> MAIN chunk 2 -> OTHER client ...
In the end we have: -> MAIN chunk 30 -> LAST client

We have 2 ways of handling it: streaming (spacetime) or "accumulate and parse in one go".
The spacetime way means that after chunk 1, all clients are postponed by X time because of parsing that chunk, plus some time for the event loop switch. After the second chunk, all subsequent clients are postponed by 2X time. The MAIN client waits 30X time to get its final response, and the LAST client waits 30X time as well.
Due to streaming, parsing state is kept in memory (I refer to __streamParseJSON(bytes)), with the best case leading to a maximum memory of 31 chunks (29 parsed chunks, 1 chunk arrived + 1 chunk parsed afterwards).
The worst case is 1 giant key-value pair, which is accumulated in the parsing state the whole time. It is either parsed incrementally, but has to reallocate as soon as another chunk arrives, or it accumulates until it is parsed "in one go", which is no different from the second approach. So we potentially save memory, but can still reallocate many times OR gain nothing by parsing "in one go".
"In one go" means the following: the MAIN client allocates virtual memory (Buffer.allocUnsafeSlow) for 30 chunks when the first chunk arrives, and the pages are only backed by physical memory as they are written (hence it is virtual; I have shown this in the ArrayBuffer.prototype.detach benchmarks). Each subsequent chunk gets copied into the buffer at its own offset (just keep a counter throughout the request), which takes nanoseconds and is negligible. "OTHER" clients are almost not postponed at all.
When the MAIN client has the final chunk, parsing happens (30 chunks × ~2 = ~60 chunks ceiling) and then the 30 chunks of buffer are immediately released by ArrayBuffer.prototype.detach. CPU time for the LAST client does not change at all, and all previous requests are served as they were supposed to be.
The "one go" choice is easier to implement, hurts fewer clients, and differs from streaming by almost no CPU time. Streaming clearly wins on an "idle" server, but that is rarely the case.
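
The "in one go" accumulation can be sketched roughly as follows. This is an illustration under stated assumptions, not the benchmark code: it assumes the total body size is known up front (for example from Content-Length), uses a plain Uint8Array instead of Buffer.allocUnsafeSlow to stay portable, and falls back to TextDecoder + JSON.parse because JSON.parseBinary and ArrayBuffer.prototype.detach are only proposals. `createAccumulator` is a hypothetical name.

```javascript
// Hypothetical accumulator: one allocation, chunks copied to a running offset.
function createAccumulator(totalLength) {
  const buffer = new Uint8Array(totalLength);
  let offset = 0; // the counter kept throughout the request
  return {
    push(chunk) {
      buffer.set(chunk, offset);     // copy the chunk into its place
      offset += chunk.length;
      return offset === totalLength; // true once the body is complete
    },
    parse() {
      // Stand-in for the proposed JSON.parseBinary(buffer); today this
      // still goes through an intermediate string.
      const value = JSON.parse(new TextDecoder().decode(buffer.subarray(0, offset)));
      // The proposed detach() would release the buffer right here.
      return value;
    },
  };
}

const body = new TextEncoder().encode('{"report":[1,2,3],"done":true}');
const acc = createAccumulator(body.length);
let complete = false;
for (let i = 0; i < body.length; i += 8) {
  complete = acc.push(body.subarray(i, i + 8)); // 8-byte "network" chunks
}
const report = complete ? acc.parse() : undefined;
console.log(report.done, report.report.length);
```

Each push() is a short memcpy-style copy, and only the single parse() call blocks the event loop for a noticeable time.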

| | Streaming (spacetime) | JSON.parseBinary + detach |
|---|---|---|
| Total CPU time | 30X (identical) + keeping state | 30X (identical) + copying |
| Event loop hits | 30, many small | 1 blockage |
| Busy server | 30 interruptions, all clients accumulate scheduling overhead | 1 interruption, fewer affected |
| Idle server | ✓ better, no idle waiting | neutral |
| Memory: chunks | 1–31× chunk (sliding window) | 30× chunk (full buffer) |
| Memory: parse spike | variable: 1× best case, 60× worst case (spanning str) | ~60× chunk at parse moment (buffer 30× + extracted vals 30×) |
| Memory: after parse | parser state until stream end | ~30× chunk (object only) |
| Memory: after detach | N/A, no detach | ~0× + object (buffer released) |
| Worst case memory | identical to "one go" with added streaming overhead | 60× spike, immediately drops to object size on detach() |
| Parser state | kept alive across all ticks | none, stateless call |
| API change | async result for chunked input | synchronous, drop-in replacement |
| Implementation | state machine + StreamGenerator | copy chunks + one parse call |
| Caller code change | required (Promise handling) | none |

Here we decide between "little benefit" and triviality. If you meant streaming where pairs are processed as soon as they are parsed, that adds overhead from strings, plus the problem of unguaranteed JSON key order with interdependent values. I have already mentioned that.

The more interesting observation is about what happens on the invalid-payload path. V8's JSON.parse uses property_stack_ (source) - a C++ buffer that accumulates JsonProperty handles as the parser scans. BuildJsonObject is only called when a closing } is found, at which point V8 allocates the JS object with exactly the right amount of space for its named properties. If parsing fails before } is reached, those handles are simply discarded - no JS objects were allocated yet for that incomplete container. This means that on the failure path, V8's existing parser already avoids allocating the JS objects. What it cannot avoid - and what JSON.parseBinary targets - is the full intermediate string that was decoded before parsing even began, and the SyntaxError with its stack trace that is constructed regardless.

A character-by-character streaming approach does reduce the memory ceiling for the valid path by allowing early chunks to be released. In fact, "for await ... of" is the standard for this, which is totally understandable, as is the cucumber example. But it still allocates per-character strings during string value accumulation unless the engine provides special support. JSON.parseBinary is a language-level synchronous primitive that eliminates the intermediate string for the already-buffered case, which is the dominant pattern in HTTP servers. Neither my approach nor yours, Conrad, makes the other unnecessary.

You are free to correct me, if I forgot to mention something or didn't understand you.

I hope you don't mind if I take what you're saying a little at a time cause there's a lot here.

The zero-copy I'm describing is specifically this: a JSON string value like "ok" would, in the ideal case, be a JS String object containing only an offset and length into the existing binary payload - no bytes copied at all.

The binary payload is unencoded. It could be utf8 or it could be utf16 or it could be Windows 1251 or really anything. I don't see how you're going to be able to present that un-encoded value as a primitive string.
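
A quick illustration of why the encoding matters, using only the standard TextDecoder: the same four bytes produce different strings depending on which encoding is assumed. The byte values are just the UTF-8 form of the "ok" example, picked for illustration.

```javascript
// The four bytes of the UTF-8 text "ok", including its JSON quotes.
const raw = new Uint8Array([0x22, 0x6f, 0x6b, 0x22]);

const asUtf8 = new TextDecoder('utf-8').decode(raw);
const asUtf16 = new TextDecoder('utf-16le').decode(raw);

console.log(asUtf8);             // "ok" (with the quotes)
console.log(asUtf8 === asUtf16); // false
console.log(asUtf16.length);     // 2
```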

From everything I've heard you say the result of your ideal zero-copy parse would (have to) look like this:

JSON.parseBinary('"ok"'); // ArrayBuffer

JSON.parseBinary('{ "msg": "ok" }'); // { msg: ArrayBuffer }

Let's start with this much: is this what you are proposing or not? If not, how are we dealing with encoding?