Shared Array Buffer Pool

Hi! To make multithreading easier, I want to propose a new data structure.
Instead of proposing something complex, this structure is kept as simple as possible.

So, the main idea is to have an array pool that contains SharedArrayBuffers (SABP from here on).

The whole pool can be shared as-is (by reference, without cloning) with other contexts (e.g. Workers, iframes, etc.)

Each method of the proposed object works the same way Atomics does,
so all the methods are guaranteed to execute consistently in every worker/context/etc.

Overall SharedArrayBufferPool API:

interface SharedArrayBufferPool {
  // Number of allocated bytes by this object
  allocatedBytes: number;

  /**
   * Allocates a new buffer at a free pool index.
   * Returns that index (uint32 or uint64, depending on the OS or implementation).
   */
  allocate(byteLength: number): number;

  /**
   * Reallocates the buffer at the provided index (in place).
   * If a buffer does not exist at the provided index, creates a new one.
   * If newSizeInBytes is larger than the old buffer size, copies the old data and fills the new space with 0.
   * If newSizeInBytes is smaller than the old buffer size, truncates the old buffer data.
   */
  reallocate(index: number, newSizeInBytes: number): number;

  /**
   * Deletes the SharedArrayBuffer at the given index.
   * Returns true if deleted successfully, false otherwise.
   */
  delete(index: number): boolean;

  // Returns the buffer at an index, or undefined if the buffer does not exist.
  get(index: number): SharedArrayBuffer | undefined;

  // ... other helper methods
}
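
A minimal usage sketch of the proposed API (everything about SharedArrayBufferPool here is assumed behaviour, since the object does not exist yet; the worker file is made up):

const pool = new SharedArrayBufferPool();
const worker = new Worker('worker.js'); // hypothetical worker for the example

// Allocate 40 bytes and get back a stable index into the pool.
const index = pool.allocate(10 * Float32Array.BYTES_PER_ELEMENT);

// Any context holding the pool can resolve the index to the same SharedArrayBuffer.
const view = new Float32Array(pool.get(index) as SharedArrayBuffer);
view[0] = 3.14;

// The pool itself is shared by reference, not cloned.
worker.postMessage({ pool, index });

// Later, the memory can be released from any thread.
pool.delete(index);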

What does it solve?

The main idea is to minimize worker synchronization,
so all the workers can rely on this data structure as a shared memory container.

Since it returns stable indices for the buffers, those indices can be used as pointers (instead of exposing real pointers into RAM).
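
For example, an index handed out by the pool can be stored inside another shared buffer and "dereferenced" from any thread (again, a sketch over the assumed SABP behaviour):

const pool = new SharedArrayBufferPool();

// A tiny "pointer table": one uint32 slot that will hold the index of another buffer.
const tableIndex = pool.allocate(Uint32Array.BYTES_PER_ELEMENT);
const table = new Uint32Array(pool.get(tableIndex) as SharedArrayBuffer);

// Allocate a payload buffer and store its index as if it were a pointer.
const payloadIndex = pool.allocate(256);
table[0] = payloadIndex;

// Any other thread that has the pool can follow the stored index.
const payload = pool.get(table[0]); // the same SharedArrayBuffer in every context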

Here is a simplified example of a problem that SABP can solve:

Code
// Worker.ts
interface TestData {
  signal?: SharedArrayBuffer;

  // Assume, the next properties are dynamically allocated

  // {x: float32, y: float32, radius: float32}
  circles?: SharedArrayBuffer;

  // {x: float32, y: float32, width: float32, height: float32}
  rectangles?: SharedArrayBuffer;

  otherRuntime3DNodes?: SharedArrayBuffer;
}

const data: TestData = {}
const gl = offscreenCanvas.getContext("webgl2") as WebGL2RenderingContext

const loop = async () => {
  const signal = new Int32Array(data.signal as SharedArrayBuffer)

  // Wait while the signal is 0 (UI); resumes once the main thread sets the Render signal (1) and notifies
  await Atomics.waitAsync(signal, 0, 0).value

  if (data.circles) {
    const arr = new Float32Array(data.circles)
    drawAllCircles(gl, arr)
  }

  if (data.rectangles) {
    const arr = new Float32Array(data.rectangles)
    drawAllRects(gl, arr)
  }

  Atomics.exchange(signal, 0, 0); // Signal 0, e.g. UI signal
  Atomics.notify(signal, 0); // Notify other

  loop()
}

self.addEventListener('message', (e) => {
  for (const key in e.data) {
    data[key] = e.data[key]
  }

  if (data.signal) loop();
})





// Main.ts
import MyWorker from 'workers'

const worker = new MyWorker()

const data = {
  signal: new SharedArrayBuffer(4) // 4 bytes, so it can back an Int32Array for Atomics
}

worker.postMessage(data)

const signal = new Int32Array(data.signal)

const loop = () => {
  if (Atomics.load(signal, 0) === 0) { // Signal 0, e.g. UI signal
    updateSomethingLikePhysics();

    Atomics.exchange(signal, 0, 1); // Signal 1, e.g. Render signal
    Atomics.notify(signal, 0); // Notify other
  }
  requestAnimationFrame(loop)
}

loop()


UI1.onClick = () => {
  if (!data.circles) {
    data.circles = new SharedArrayBuffer(12 * 10); // 10 circles

    // SYNC HERE!
    worker.postMessage(data)
  }
}

UI2.onClick = () => {
  if (!data.rectangles) {
    data.rectangles = new SharedArrayBuffer(16 * 10); // 10 rectangles

    // SYNC HERE!
    worker.postMessage(data)
  }
}

As you can see from the code, to pass a new buffer to a worker we need to synchronize through postMessage.

SABP allows us to avoid these redundant synchronizations and makes the code more straightforward.

Here is an example:

Code
// Worker.ts

interface IMemory {
  sabp: SharedArrayBufferPool;
  signalBufferIndex: number;
  objectsBufferIndex: number;
}

let mem: IMemory = {} as IMemory;
const gl = offscreenCanvas.getContext("webgl2") as WebGL2RenderingContext

const loop = async () => {
  const signalBuffer = mem.sabp.get(mem.signalBufferIndex);
  const signal = new Int32Array(signalBuffer as SharedArrayBuffer)

  // Wait while the signal is 0 (UI); resumes once the main thread sets the Render signal (1) and notifies
  await Atomics.waitAsync(signal, 0, 0).value;

  const objBuffer = mem.sabp.get(mem.objectsBufferIndex);
  const buffers = new Uint32Array(objBuffer as SharedArrayBuffer); // or BigUint64Array for 64-bit indices

  for (let i = 0; i < buffers.length; i += 2) {
    const bufIndex = buffers[i];
    const geomType = buffers[i + 1];
    if (geomType === 0) continue; // empty slot

    const geomBuffer = mem.sabp.get(bufIndex);
    if (!geomBuffer) continue;

    const data = new Float32Array(geomBuffer);
    if (geomType === 1) drawAllCircles(gl, data);
    if (geomType === 2) drawAllRects(gl, data);
  }

  Atomics.exchange(signal, 0, 0); // Signal 0, e.g. UI signal
  Atomics.notify(signal, 0); // Notify other

  loop()
}

self.addEventListener('message', (e) => {
  mem = e.data as IMemory;
  loop(); // Start rendering once the pool and buffer indices arrive
})




// Main.ts
import MyWorker from 'workers'

const worker = new MyWorker()

const mem = new SharedArrayBufferPool();
const signalBufferIndex = mem.allocate(4); // 4 bytes, so the signal can back an Int32Array for Atomics
// 4 byte (32 bit) * 2 values * 16 elements
const objectsBufferIndex = mem.allocate(4 * 2 * 16);

let circlesIndex = -1;
let rectsIndex = -1;

let shouldInsertCircles = false;
let shouldInsertRects = false;

const data = {
  sabp: mem,
  signalBufferIndex,
  objectsBufferIndex
}

worker.postMessage(data)

const signal = new Int32Array(mem.get(signalBufferIndex) as SharedArrayBuffer)

const loop = () => {
  if (Atomics.load(signal, 0) === 0) { // Signal 0, e.g. UI signal
    if (shouldInsertCircles && circlesIndex === -1) {
      circlesIndex = mem.allocate(12 * 10) // memory for 10 circles allocated, a buffer is available in each worker already.
      const objects = mem.get(objectsBufferIndex)

      const buf = new Uint32Array(objects as SharedArrayBuffer);
      buf[0] = circlesIndex;
      buf[1] = 1; // e.g. circles

      shouldInsertCircles = false;
    }

    if (shouldInsertRects && rectsIndex === -1) {
      rectsIndex = mem.allocate(16 * 10); // memory for 10 rectangles allocated, a buffer is available in each worker already.

      const objects = mem.get(objectsBufferIndex)

      const buf = new Uint32Array(objects as SharedArrayBuffer);
      buf[2] = rectsIndex;
      buf[3] = 2; // e.g. rectangles

      shouldInsertRects = false;
    }

    Atomics.exchange(signal, 0, 1); // Signal 1, e.g. Render signal
    Atomics.notify(signal, 0); // Notify other
  }
  requestAnimationFrame(loop)
}

loop()


UI1.onClick = () => {
  shouldInsertCircles = true;
}

UI2.onClick = () => {
  shouldInsertRects = true;
}

This way, the UI thread doesn't need to send a postMessage to all the workers to notify them about a new piece of memory:
each newly allocated/reallocated buffer is already available inside every worker.

This makes it easier to build real-time applications in a performant way.

Real app example:

Example

3D multithreaded application.
UI (IO also) updates through the main thread.
Graphics renders through the 2nd thread.
Physics updates through the 3rd thread.

The UI thread captures all the events, persists them in a SharedArrayBuffer (a.k.a. the Event Pool), and notifies the Render thread (to start the rendering process).
The physics thread runs as a fixed-timestep process.
The render thread wakes up on a signal and updates all the 3D nodes based on an IO buffer and renders an image.

As rendering is done in a separate thread, there is a full 1/60 s (≈16.7 ms at 60 FPS) budget to render a frame,
and the UI thread will not block if a render takes too long.

And the coolest part is that all the threads have access to the memory.
So all the threads can add/remove/resize/modify buffers without synchronizing through postMessage.
For example, UI can create an object through the creation of a new buffer.
Render/Physics threads will process that object on the next frame.
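
A rough sketch of how such an app could be wired up with the proposed pool (the worker file names and the Event Pool layout are made up for illustration, and the SABP behaviour is assumed):

// Main.ts (UI/IO thread)
const pool = new SharedArrayBufferPool();

const signalIndex = pool.allocate(4);           // Int32 UI/Render signal
const eventPoolIndex = pool.allocate(4 * 256);  // Event Pool: up to 256 uint32 event records

// Hypothetical worker files for this example
const renderWorker = new Worker('render.worker.js');
const physicsWorker = new Worker('physics.worker.js');

// The pool is posted once; every buffer allocated later is visible to both workers immediately.
renderWorker.postMessage({ pool, signalIndex, eventPoolIndex });
physicsWorker.postMessage({ pool, signalIndex, eventPoolIndex });

const signal = new Int32Array(pool.get(signalIndex) as SharedArrayBuffer);
const events = new Uint32Array(pool.get(eventPoolIndex) as SharedArrayBuffer);

window.addEventListener('pointerdown', (e) => {
  // Persist the event in shared memory and wake the render thread.
  events[0] = 1;               // e.g. event type "pointerdown"
  events[1] = e.clientX;
  events[2] = e.clientY;
  Atomics.store(signal, 0, 1); // Render signal
  Atomics.notify(signal, 0);
});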

Thank you.

Hi @SergiiSharpov!

Have you looked into GitHub - tc39/proposal-resizablearraybuffer: Proposal for resizable array buffers? This seems like it would allow that example to work without needing to keep sending a new buffer over postMessage?
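
For reference, a minimal sketch of the growable SharedArrayBuffer from that proposal (the constructor option and grow() come from the proposal; the worker setup is made up):

// A SharedArrayBuffer created with a maximum size can later grow in place.
const worker = new Worker('worker.js');
const sab = new SharedArrayBuffer(12 * 10, { maxByteLength: 12 * 1000 }); // 10 circles now, up to 1000

// Post it to the worker once...
worker.postMessage({ circles: sab });

// ...and later grow it without another postMessage; length-tracking views see the new size.
sab.grow(12 * 20); // room for 20 circles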

Hi @aclaymore,

I do not think that this proposal solves the problem.

It's about making buffers resizable in place.
That's good, but it's not the point of my proposal. (I'll be happy to test it, though.)

As for ArrayBuffer resizing, it does not help with multithreading because an ArrayBuffer is not shared: after transferring it through postMessage to a worker, it becomes unavailable in the previous context.

SharedArrayBuffer is also mentioned on that page, but I do not see any contextual changes from what we have right now, except that resizing a buffer automatically updates all related TypedArrays:

When a TypedArray is backed by a resizable buffer, its byte offset length may automatically change if the backing buffer is resized.

Also, I liked this piece of the text:

// Growable SharedArrayBuffers cannot shrink because it is real scary to
// allow for shared memory.

This is true, but the user could be given the choice.
If the user can be sure that a buffer is used only by the current thread, then they could be allowed to resize it down to reduce memory.

As an example, if a buffer represents an array of {x: float32, y: float32} elements, then you might want to grow it on element insertion.
But if you have a buffer of thousands of elements and you null out half of it, then you will probably want to release that unused memory (already allocated by grow).
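
A sketch of that pattern with the proposed pool (the reallocate semantics are the ones assumed in the interface above):

const pool = new SharedArrayBufferPool();
const STRIDE = 2 * Float32Array.BYTES_PER_ELEMENT; // {x: float32, y: float32}

// Start with room for 1,000 points.
const pointsIndex = pool.allocate(1000 * STRIDE);

// Grow on insertion: old data is copied, the new space is zero-filled.
pool.reallocate(pointsIndex, 2000 * STRIDE);

// Later, after removing half of the elements, shrink to release the unused memory.
pool.reallocate(pointsIndex, 1000 * STRIDE);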

My initial message could work together with the proposal you sent in a way that looks something like this (sketched in code after the list):

  • Deleting a buffer should set length = 0 on all related TypedArrays
  • Reallocating down/up should set length = newLength on all related TypedArrays
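
A sketch of those assumed semantics (nothing here is specified anywhere yet; it only illustrates how the two proposals could combine):

const pool = new SharedArrayBufferPool();
const index = pool.allocate(8 * Float32Array.BYTES_PER_ELEMENT);

// A length-tracking view over the pooled buffer (assumed to be possible in this combination).
const view = new Float32Array(pool.get(index) as SharedArrayBuffer);

pool.reallocate(index, 16 * Float32Array.BYTES_PER_ELEMENT);
// view.length would now report 16

pool.delete(index);
// view.length would now report 0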

Overall, my idea helps to manage memory dynamically:

  • Instead of having one huge buffer, you might have 100,000 small buffers that live in a single thread-synced container.
  • At any point in time, you are able to delete any of these buffers so memory usage is reduced, while a growable SharedArrayBuffer can only grow until it reaches the limit provided in the constructor.
  • Each allocated buffer has a stable index in the pool that never changes, which lets you use that index as a pointer and build complex apps.

If you want memory to also be able to shrink, and not only grow, you might be interested in GitHub - tc39/proposal-structs: JavaScript Structs: Fixed Layout Objects. You can set a shared struct to reference a SharedArray of more shared structs, one for each shape.

There is a document on how to test the experimental implementation in Chrome: proposal-structs/CHROMIUM-DEV-TRIAL.md at 2fac8de9f47136352c7585e0d911e3a519c6a78c · tc39/proposal-structs · GitHub, and the proposal is at Stage 1, so there is still lots of time to give feedback.


Thank you!
That looks interesting, I'll check it out.
