Liveness barriers and finalization

That would have been vacuous.

During normal execution, the [[WeakRefTarget]] slot is either empty or not; if it is empty, the barrier cannot revive anything that was already collected, so it has no effect. (If it somehow did, that would make weak references equivalent to strong ones.) If it’s filled, the barrier is asserting what we already know: that the [[WeakRefTarget]] slot is filled. Either way, it’s a tautology (or worse).

Plus, liveness barriers only matter for analysis passes that determine liveness. WeakRef-oblivious executions are how the specification defines which such passes are valid; liveness barriers are meant to define ‘observing the identity of the object’ under such an execution and have no effect otherwise. If a liveness barrier is supposed to be ignored by WeakRef-oblivious executions, it might as well not be there at all.


In hindsight, I can see how saying that wr.deref() === this is equivalent to wr.deref() === new Object() was probably misleading rather than helpful. A more realistic example is the one I described before that: the result of wr.deref() === this can be established without directly referring to this, by way of data-flow analysis and a simple application of disjunctive syllogism, and this obviates the need to keep this alive. The recurring question in my examples is: does an execution constitute ‘observing an object’s identity’ if all its effects can be obtained indirectly, without involving the object’s identity? If not, this transformation is valid. If so, it would be good to know which step actually observes the object’s identity.

One can also consider:

const a = {}, b = {};
const w = new WeakRef(Math.random() < 0.5 ? a : b);
// because of how w has been constructed,
// we know that (w.deref() === a || w.deref() === b);
await 0;
// we know that (w.deref() === a || w.deref() === b || w.deref() === void 0);
X0: console.log(w.deref() === a || w.deref() === b);
X1: console.log(w.deref() !== void 0);
Y0: console.log(w.deref() === a);
Y1: console.log(w.deref() !== b && w.deref() !== void 0);

X0 produces the same value as X1, and Y0 the same value as Y1. Is it permissible to convert between one and the other? If not, there has to be some way in which they differ, that optimisations are bound to preserve. Right now I see two possibilities for such a thing:

  • One may place a liveness barrier in SameValue etc.; this would allow rewriting w.deref() === a || w.deref() === b at most into (%tmp = (w.deref() !== void 0), %LivenessBarrier(a), %LivenessBarrier(b), %tmp) (%LivenessBarrier(w.deref()) is a no-op, per above). In other words, the implementation can simplify the expression to a non-emptiness check, but may only collect a and b after the check is done.
  • One may forbid any data flow analysis on WeakRef.prototype.deref across a suspension point. Engines must then treat deref() after suspension as a complete black box, which may return any value at all.

Maybe a tweaked definition of liveness could also resolve this, but I am not sure I can come up with a robust one.
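For what it’s worth, the four expressions can be checked to agree in a plain run. This is a minimal baseline only, with no suspension point: a and b remain strongly reachable throughout, so no collection (and hence no interesting optimisation) can occur.

```javascript
const a = {}, b = {};
const w = new WeakRef(Math.random() < 0.5 ? a : b);

// While a and b are both strongly reachable, the engine cannot clear
// w, so w.deref() returns exactly one of the two objects:
const x0 = (w.deref() === a || w.deref() === b);
const x1 = (w.deref() !== void 0);
const y0 = (w.deref() === a);
const y1 = (w.deref() !== b && w.deref() !== void 0);

console.log(x0 === x1); // true
console.log(y0 === y1); // true
```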

Sorry for saying something stupid. I guess I still don't fully understand liveness barriers.

I am not sure how or what can be clarified, but my expectation as a programmer is that wr.deref() === this cannot be optimized or substituted, unless whatever comes after it has no observable effects.

A liveness barrier (or fence) is basically a designated operation that is said to observe the identity of its operand when reached, thus keeping it live until that point, even though it produces no effect when executed. The object is guaranteed to be live during the execution of all code before such a barrier. An optimiser is obligated to preserve a liveness barrier like a side effect, and cannot remove it or move it earlier. Such a concept basically allows one to reason about code reachability instead of object reachability.
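A sketch of how such a barrier would read in code. Note that livenessBarrier is hypothetical: the stub below is an ordinary no-op so the example runs, but only a genuine engine primitive would actually bind the optimiser.

```javascript
// `livenessBarrier` is hypothetical — stubbed here as a plain no-op
// so the sketch runs. A real barrier would be an engine primitive the
// optimiser is forbidden to remove or hoist earlier.
function livenessBarrier(obj) { /* no effect when executed */ }

let resource = { id: 1 };
const wr = new WeakRef(resource);

// Between construction and the barrier, `resource` is guaranteed
// live, so deref() must return it even if no other code mentions
// `resource` directly:
console.log(wr.deref().id); // 1

livenessBarrier(resource); // `resource` is kept live up to this point
```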

I think it may be possible to define liveness in a way that preserves exactly that expectation and little more than that: namely, we say that an object’s identity is observed when, independently of obliviousness (i.e. of whether or not its finalisers had run and WeakRefs had been cleared), it becomes an input to an operation whose effect is dependent on obliviousness.

Analysis

This should allow any optimisation to be performed, up to optimising an object out of existence… except when that would have been possible to observe by means of WeakRef and/or FinalizationRegistry. Of course, that just shifts the burden of definition onto the term ‘effect’. But with a relatively common-sense definition encompassing I/O and termination, it seems to work pretty well:

  • void wr.deref() === this does not observe identity of this, because it produces no effect (the result is ignored).
  • console.log(wr.deref() === wr.deref()) does not observe the identity of this, because all inputs are obliviousness-dependent.
  • console.log(this === this) does not observe the identity of this, because the effect is obliviousness-independent.
  • console.log(wr.deref() === this) observes the identity of this, because the effect of that operation depends on obliviousness wrt this, and the result of evaluating the RHS of strict equality is independent of obliviousness.
  • paradoxical registration: the effect of L (and arguably R2) is dependent on obliviousness wrt delia, and delia is obliviousness-independently an input to R2; as such delia should be considered observed by the combination of L and R2.
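The first four cases can be restated as runnable code, with a local self standing in for this. Every deref() here deterministically returns the target, because self stays strongly reachable to the end.

```javascript
const self = {};
const wr = new WeakRef(self);

void (wr.deref() === self);             // result discarded: no effect,
                                        // so no identity observed
console.log(wr.deref() === wr.deref()); // true; all inputs come through
                                        // deref(), obliviousness-dependent
console.log(self === self);             // true; the effect is
                                        // obliviousness-independent
console.log(wr.deref() === self);       // true; per the definition above,
                                        // this is the one case that
                                        // observes the identity of `self`
```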

That said, I would not be very satisfied with such a definition, since it would still leave me with no way to express a liveness barrier in user code, which is what I sought starting this topic.

If you're talking about keeping the targets of weak references alive, I would think this would be sufficient:

const requireLive = Object

let value = weakRef.deref()
// do things with the dereferenced value
requireLive(value)

Conveniently, this could be used even in function closures - the whole point is just to keep a strong reference to it independently of the weak ref, and everything else just works.

The issue is that a sufficiently smart compiler would know that, as long as Object is not re-assigned (maybe it can see that it won’t be between now and when that line runs), passing the value to Object is a no-op, and could optimise the call out and collect the object.

In practice this level of sophistication is beyond current JS implementations, as it would be prohibitively expensive versus the likely gain.
But when dealing with just the spec, we are in an abstract world: if the spec allows it, then technically it can happen.

A liveness barrier would be something that is formally in the spec as something that must not be optimised out, so it would be portable to every spec-compliant engine, including future not-yet-implemented ones.


can you give ux-scenarios where this is desirable rather than surprising (especially for threaded, message-passing-tasks)? and if not, maybe it should be explicitly banned by the language (that closures could be collected before falling out-of-scope)?

i have a moderately-complex, real-world-example using a cached/weakref'd pool of sqlite-connections for threaded-operations, which depends on this never happening, shown below.

p.s. are there even real-world-ux-scenarios where liveness-barriers are more ergonomic to use than a closure-reference (for the purpose of preventing gc)?

#!/bin/sh

node --input-type=module --eval '
/*jslint node*/
import sqlmath from "./sqlmath.mjs";
let {
    dbExecAsync,
    dbOpenAsync
} = sqlmath;

(async function () {
    let result;

// create closure preventing gc of cached/weakrefd sqlite connection-handles

    let threadedSqliteConnectionPool = await dbOpenAsync({
        filename: "file::memory:?cache=shared",
        threadCount: 4
    });

    await dbExecAsync({
        db: threadedSqliteConnectionPool,
        sql: (`
CREATE TABLE IF NOT EXISTS mytable AS
    SELECT value
    FROM generate_series(0, 399);
        `)
    });

// execute sqlite-queries in 4 parallel-threads

    result = await Promise.all([
        0, 100, 200, 300
    ].map(async function (range) {
        return await dbExecAsync({

// we expect the closure to never collect while threaded-queries
// are being executed

            db: threadedSqliteConnectionPool,
            sql: (`
SELECT kthpercentile(value, 0.95) AS result
    FROM mytable
    WHERE ${range} <= rowid AND rowid < ${range + 100}
            `)
        });
    }));
    result.forEach(function (queryResult, ii) {
        console.error(
            `95th-percentile in range ${100 * ii}...${100 * ii + 100}`
            + ` is ${queryResult[0][0].result}`
        );
    });
}());
'

For a JIT compiler/interpreter, perhaps. For an AOT, offline-optimised implementation (à la JVM or even C), I think transformations like discussed in this thread could be much more attractive.

An explicit liveness fence would look pretty much like a function call anyway — except it would be to a function the optimiser knows not to remove, e.g. FinalizationRegistry.fence(obj). Syntactically there’s little difference between this and Object(obj), it’s just that the former would be known to work.

As for your example, it seems to mention neither WeakRef nor FinalizationRegistry, so I fail to see how it is of relevance.

sorry for not better-documenting the imported database-module in previous example ^^;;; the code for weakrefing database-handles/c-pointers is here [1], [2], [3], [4] and shown below. closure-references are created by calling function dbDeref().

  • i didn't fully read this thread when posting, and just now discovered FinalizationRegistry, which eliminates need to implement cleanup-code in c in last snippet below.

  • found bug in my code where closure-references fall-out-of-scope once sql-queries finish executing -- so possible database-handles could be collected in-between api-calls to run sql-queries. nodejs/v8 is apparently not very aggressive at gc'ing these weakrefs, as i haven't run across this bug yet in production-code.

// sqlmath.mjs

// 1. weakmap of database-handles
// https://github.com/sqlmath/sqlmath/blob/b474be3ddf6522b0d19eff72066c4d0db23aafd3/sqlmath.mjs#L61
    let dbDict = new WeakMap();


// 2. store database-handle in weakmap
// https://github.com/sqlmath/sqlmath/blob/b474be3ddf6522b0d19eff72066c4d0db23aafd3/sqlmath.mjs#L318
        dbDict.set(db, {
            busy: 0,
            connPool, // list of c-pointers / database-handles
            ii: 0,
            ptr: 0n
        });
        return db;


// 3. get raw database-handle from weakmap
// https://github.com/sqlmath/sqlmath/blob/b474be3ddf6522b0d19eff72066c4d0db23aafd3/sqlmath.mjs#L158
    function dbDeref(db) {
// this function will get private-object mapped to <db>
        let __db = dbDict.get(db);
...
        return __db;
    }
// sqlmath_base.c

// 4. FinalizationRegistry-like cleanup for database-handles in weakmap implemented in c
// https://github.com/sqlmath/sqlmath/blob/b474be3ddf6522b0d19eff72066c4d0db23aafd3/sqlmath_base.c#L1734
static napi_value __dbFinalizerCreate(
    napi_env env,
    napi_callback_info info
) {
// this function will create empty finalizer for db
...
    errcode = napi_create_external_arraybuffer(env,     // napi_env env,
        (void *) aDb,           // void* external_data,
        sizeof(int64_t),        // size_t byte_length,
        __dbFinalizer,          // napi_finalize finalize_cb,
        NULL,                   // void* finalize_hint,
        &val);                  // napi_value* result
    ASSERT_NAPI_OK(env, errcode);
    return val;
}

I see now. Yes, since you attach the finaliser to the ArrayBuffer in which you hold the pointer, there seems to be a narrow window in dbCallAsync, after obtaining the pointer but before actually passing it to the FFI call, in which collecting the ArrayBuffer would trigger a use-after-free. This is pretty much exactly the case I presented in my original post, except using Node-specific primitives instead of standard ECMA-262 features. (The fact that you happen to use WeakMap is less important.)

Adding a fence to correct this would look something like:

function dbDeref(db) {
	const __db = dbDict.get(db);
	assertOrThrow(__db?.connPool[0] > 0, "invalid or closed db");
	assertOrThrow(__db.busy >= 0, "invalid db.busy " + __db.busy);
	__db.ii = (__db.ii + 1) % __db.connPool.length;
	assertOrThrow(__db.ptr > 0n, "invalid or closed db");
	return [__db, __db.connPool[__db.ii]];
}

async function dbCallAsync(func, db_, argList) {
	const [db, ptr] = dbDeref(db_);
	db.busy += 1;
	try {
		return await cCall(func, [ptr[0]].concat(argList));
	} finally {
		fence(ptr);
		db.busy -= 1;
		assertOrThrow(db.busy >= 0, "invalid db.busy " + db.busy);
	}
}

Similarly in dbCloseAsync:

await Promise.all(__db.connPool.map(async function (ptr) {
	try {
		const val = ptr[0];
		ptr[0] = 0n;
		await cCall("__dbCloseAsync", [val]);
	} finally {
		fence(ptr);
	}
}));

This one may actually be unnecessary (I would expect assignment to ptr[0] to act in itself as a fence on ptr, and you don’t actually need a fence after that point), but better safe than sorry.

Now, while we don’t have a standard, portable fence, if you have access to native API, you should be able to implement it yourself, by exporting to user code a native function that takes an object argument… and does nothing with it. Since the engine (almost certainly) has no way to analyse how an object could be possibly used in native calls and must consider them black boxes, it must therefore keep the object live until the call. This is basically the same trick that Hans Boehm once recommended for Java.
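The user-facing half of that trick might look like the sketch below. nativeNoop is hypothetical and is stubbed in plain JS only so the snippet is self-contained; a JS stub gives no guarantee whatsoever — the opacity has to come from a genuinely native export (e.g. via N-API).

```javascript
// In a real build, nativeNoop would be loaded from a native addon,
// e.g.: const { nativeNoop } = require("./build/Release/fence.node");
// It is stubbed here so the sketch runs; this stub is NOT opaque.
const nativeNoop = function (obj) { /* opaque native call in reality */ };

// The engine cannot prove a native call ignores its argument, so the
// argument must be considered live until the call executes.
function fence(obj) {
    nativeNoop(obj);
}

const handle = { ptr: 42n };
fence(handle); // `handle` kept live until here (with a real native noop)
console.log("fenced");
```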

May I ask if this is being worked on, or would the committee rather wait until entrenched implementation practice emerges to basically solve the problem for them in a haphazard, ad-hoc manner?

I was under the impression that program specific liveness barriers could be expressed, if we assume a reasonable interpretation of what is considered observable.

I doubt there will be much interest from delegates in spending time on further specifying what observable means, or on a general liveness-barrier expression, given that the problem concerns a hypothetical future implementation that might perform so many optimizations that it breaks the observability assumptions the program had. Even then, I assume the process would be a bug report filed against the implementation, which itself would then come to the committee for clarification on observability.