Does JavaScript really need stateful and advanced features?

This thread is meant to be a central location to discuss a "message passing" concept that @kaizhu256 has been bringing up in a number of threads. The goal is to centralize the discussion around this topic into one location, instead of spreading it across the forums. To see a bit of past discussion, one only needs to search for "message passing" to find a number of places where it's been brought up in the past, but here's a small sampling.

If I were to try and group together some of the themes I've seen from the linked comments, I would probably do so as follows. Feel free to correct me @kaizhu256 if I misrepresent any of your ideas.

  1. Private member variables are pretty much useless in JavaScript (reference)
  2. State is pretty much useless in JavaScript (reference #1 and reference #2)
  3. Most JavaScript code is non-reusable logic, except for small utility functions. (reference)

Feel free to discuss these items below.

And, I'll go first :)

I know lots of people have contradicted these general themes, and for good reason too. JavaScript is used for all sorts of purposes, from thin clients, to thick clients, servers, native mobile apps, native desktop apps, games, libraries, plugins (including plugins for desktop environments, like GNOME), etc. Its usage is ubiquitous, which is why the committee needs to ensure they design features to help all of these parties.

So, let me try and contradict some of these themes with a problem I personally ran into at work. I'll try to explain it at a high level.

I work a lot on our node API layer. Whenever a request comes in, we first grab some information about the user making the request and some session information. This information is used for a variety of reasons, like logging and authorization. Doing a fresh fetch of this data at the start of each request would be time-consuming, so we instead cache the data in-memory for a short while. Caches always add a fair amount of complexity to a system, because you now have to deal with invalidating the cache, for example, when a user logs out. To handle this situation, I built a simple event-emitter class. It went something like this:

class EventEmitter {
  #listeners = []
  subscribe(fn) {
    this.#listeners.push(fn)
  }

  trigger(...args) {
    for (const fn of this.#listeners) {
      fn(...args)
    }
  }
}

export const logoutEvent = new EventEmitter()

Now the logout handler can simply call logoutEvent.trigger(userId), and anyone who cares about when a user logs out can react. Likewise, this caching logic that happens at the beginning of a request can listen for this logoutEvent, and correctly invalidate cache entries when the logout event fires. It would be inappropriate to have the logout handler directly load the module corresponding to the request-initializing logic, and ask it to invalidate its cache entry. That would create unnecessary coupling between two completely unrelated parts of the server. We're already headed towards a big pile of spaghetti and I'm not keen on accelerating that process.
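As a rough sketch of the wiring described above: the names sessionCache, getCachedSession, and handleLogout are hypothetical and only illustrate the shape of the idea, not the actual server code. The EventEmitter class is repeated so the snippet runs on its own.

```javascript
// Same EventEmitter as above, repeated so this sketch is self-contained.
class EventEmitter {
  #listeners = []
  subscribe(fn) {
    this.#listeners.push(fn)
  }

  trigger(...args) {
    for (const fn of this.#listeners) {
      fn(...args)
    }
  }
}

const logoutEvent = new EventEmitter()

// --- request-initialization side (hypothetical names) ---
const sessionCache = new Map() // userId -> cached session info

function getCachedSession(userId, loadSession) {
  // Only call the (expensive) loader on a cache miss.
  if (!sessionCache.has(userId)) {
    sessionCache.set(userId, loadSession(userId))
  }
  return sessionCache.get(userId)
}

// The cache reacts to logouts without the logout handler knowing it exists.
logoutEvent.subscribe(userId => {
  sessionCache.delete(userId)
})

// --- logout side (hypothetical name) ---
function handleLogout(userId) {
  // ...whatever else logging out involves would go here...
  logoutEvent.trigger(userId)
}
```

The only thing the two sides share is logoutEvent; neither imports the other.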

That's just one of many concrete examples I have. Are there private properties? Yep, there's no need to be publicly exposing the listeners property, no one else needs to know who else is listening. Is there state? Yep. The listeners array will grow over time. Is there reusable logic? As a matter of fact, I have used this same EventEmitter class for other purposes now, to help keep unrelated parts of the server apart.

why not cache the session to in-memory sqlite, and periodically re-ify its memory-image to disk for persistence? i would argue for most moderately-complex stateful problems, sqlite (or wasm-sqlite in frontend-cases) is the hammer of choice rather than javascript.

We probably will eventually store session information in Redis. But, even if we do that, we'd still need to let Redis know when it's time to invalidate a cache entry, which happens when a user logs out (and at other times). Where the request initialization logic decides to store its cache is an internal detail of the request-initialization logic, and shouldn't actually affect anything unrelated to it, which includes the way the logout logic and the event emitter were implemented.
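To illustrate why the storage choice can stay an internal detail, here is a minimal sketch. The createSessionCache function and its store interface (get, set, delete) are assumptions made up for this example. A Map satisfies the interface directly. A Redis-backed store would be asynchronous in practice, which this sketch glosses over.

```javascript
// Hypothetical store interface: get(key), set(key, value), delete(key).
// A Map already satisfies it; some other backing store exposing the same
// three methods could be swapped in without touching the callers.
function createSessionCache(store) {
  return {
    read: userId => store.get(userId),
    write: (userId, session) => store.set(userId, session),
    invalidate: userId => store.delete(userId),
  }
}

const sessionCache = createSessionCache(new Map())

// This is the function that would be passed to logoutEvent.subscribe().
// It looks the same no matter where the cache actually lives.
const onLogout = userId => sessionCache.invalidate(userId)
```

The logout code and the event emitter never see the store, so replacing the Map later only touches the line that constructs the cache.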

This is also why I made my "state" case around the EventEmitter class instead of our server cache, because it's true that the ideal solution would offload server state to external, in-memory database solutions. However, I wouldn't necessarily call it wrong for smaller server software to keep stuff around in memory either; it could be over-engineering to start grabbing in-memory databases before you really need them.

Of course, it's pretty easy to come up with other scenarios that require state. Like, any game that's implemented with JavaScript. I was almost going to share a sample tic-tac-toe game I threw together, and use that for my opening argument, but decided a work experience might work better.

i typically wouldn't bother with the "reusable" class abstraction you gave, and go directly into implementing the non-reusable part -- with the expectation it'll frequently get rewritten over-and-over again every 3-6 months.

how many instances of the logout-subscription-service would you be running per process? probably just one, in which case a non-reusable global instance would do just fine.

#!/bin/sh

node --eval '
/*jslint beta, node*/
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let sqliteDb1;

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

// init server and sqlite-datastore
(async function () {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    sqliteDb1.exec(`
CREATE TABLE IF NOT EXISTS cached_session (
    sessionid TEXT PRIMARY KEY NOT NULL,
    userid INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS server_log (
    timestamp TEXT NOT NULL,
    message TEXT NOT NULL
);
-- test session
INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);

    // subscribe listener to log user-logout
    globalLogoutListenerList1.push(function (req) {
        req.urlParsed.search.replace((
            /[?&]sessionid=([^&]*)/
        ), async function (ignore, sessionid) {
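            let data = await new Promise(function (resolve, reject) {
                sqliteDb1.all(`
                     SELECT * FROM cached_session WHERE sessionid = ?
                `, [
                    sessionid
                ], function (err, data) {
                    // reject instead of throwing inside the callback,
                    // so the awaiting caller sees the error
                    if (err) {
                        reject(err);
                        return;
                    }
                    resolve(data);
                });
            });
            // .all() yields an array of rows; it is empty for unknown sessions
            if (!data.length) {
                return "";
            }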
            data = {
                message: `userid ${data[0].userid} logout`,
                timestamp: new Date().toISOString()
            };
            console.log("inserting to sqlite-log: ", data);
            sqliteDb1.all(`
                 INSERT INTO server_log VALUES(?, ?);
            `, [
                data.timestamp, data.message
            ]);
        });
    });

    // subscribe listener to invalidate sqlite-cache on logout
    globalLogoutListenerList1.push(function (req) {
        req.urlParsed.search.replace((
            /[?&]sessionid=([^&]*)/
        ), function (ignore, sessionid) {
            console.log(`invalidating sessionid ${sessionid}`);
            sqliteDb1.run(`
                DELETE FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ]);
        });
    });

    await new Promise(function (resolve) {
        require("http").createServer(function (req, res) {
            req.urlParsed = require("url").parse(req.url);
            if (req.urlParsed.pathname === "/logout") {
                middlewareLogoutHandler1(req, res);
            }
            res.end();
        }).listen(8080, resolve);
    });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    require("http").get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

// stdout:
/*
server listening on port 8080
invalidating sessionid session1
inserting to sqlite-log:  {
  message: 'userid user1 logout',
  timestamp: '2021-11-12T07:00:50.731Z'
}
*/
'

May I ask you this: You normally talk about how JavaScript is just a glue language, and we shouldn't try to do anything too complicated with it.

So, could you give me an example where, in another language, you would find it acceptable to write reusable code? Or code with state? Or code with private properties? When's a time that you have done it?

(I'll also get to responding to your code snippet in a moment)

So, could you give me an example where, in another language, you would find it acceptable to write:

  1. code with private properties?
  • i've come across zero use-cases warranting private-properties in other languages (note i don't do much crypto/security programming however).
  • in c, i use the static keyword to prevent name-collision of globally-scoped static-functions
  2. reusable code?
  • in c i've implemented reusable base64encoder and other serializers, which are unnecessary in javascript
  • in c# i've implemented reusable dynamic dictionaries to parse incoming dynamic json-payloads from the web, but again, it's unnecessary in javascript
  3. code with state?
  • simple dictionaries for storing json-config from filesystems
  • simple dictionaries for caching oauth-tokens / cookies for reuse in http-requests
  • undo-history in text-editors (yes, this is an exception that's moderately complex).
  • for most other things, it's either sqlite / wasm-sqlite for moderate state-complexity or a full-blown database for heavylifting.

a recurring theme i've encountered for userland-reusable-code in other languages is they commonly deal with serializing / deserializing message-passed web-data -- features already baked into javascript over its 20+ year evolution in commercial web-space.

You seem to misunderstand the purpose of private properties. It has nothing to do with crypto or security. In fact, many languages that offer private properties also offer some form of reflection, making private properties unsuitable for security purposes. I'll talk more about the value of encapsulation further down.

It sounds like you were simply implementing features you needed which were missing from the standard library. JavaScript happens to come with support for JSON and base64, but there's plenty of features that don't ship with JavaScript, like, an ini parser. We store some of our configuration in ini files, and so we've installed a reusable library to help us parse these config files (I prefer installing libraries over hand-rolling my own parsers).

These all sound like very common tasks that would be performed within node as well. And undo-history is something that webapps commonly need. The common theme I'm seeing between all of these is that they're not tasks that are somehow excluded from JavaScript. People using JavaScript need to do tasks like these all of the time as well - it is, after all, a general-purpose language that's used for all sorts of reasons.

C#, as a language, is often used for creating server software, but it's also sometimes used for other purposes such as UIs. These use cases all overlap with how JavaScript gets used. And, it sounds like you've used it for server-side development. So, my next questions: if C# shipped with good, native support for serializing and deserializing data, would it just become a "message passing" language as well? Do you only use C# to message pass between clients and databases? What makes node so special that it's only used for "message passing" while other server-side languages are not? Are they not both used for the same reasons?

Man, you're working on a fast-changing product. We've got code that's many years old all over the place. And, I expect a good portion of the new code that I write to eventually age the same way. Sure there will be parts that will be frequently updated, but other parts will simply get old, passed on to other developers, and left for them to figure out for themselves and understand when it finally comes time for them to update it.

The logout event isn't the only event emitter I've created. I've actually reused the same class to create a login event as well later on, which is used for an entirely different purpose. And, I'll likely use it for more reasons in the future. If I ever need to add an event emitter in the future that has different requirements, or if an existing one needs new features, I can just make a different EventEmitter class for it, so I'm not too worried about changing requirements (I don't publicly export the EventEmitter class anyways, I "encapsulate" the logic so it's easy to change implementation details).

Also, because the eventEmitter logic doesn't apply to either the logout code or the request-initialization logic, I put it in its own separate location, an "events.js" file.

The entire contents of the file would pretty much be the example code snippet I posted before, but with a login event as well.

class EventEmitter {
  #listeners = []
  subscribe(fn) {
    this.#listeners.push(fn)
  }

  trigger(...args) {
    for (const fn of this.#listeners) {
      fn(...args)
    }
  }
}

export const logoutEvent = new EventEmitter()
export const loginEvent = new EventEmitter()

It sounds like you think it's better to write it like this?

export const logoutEventListeners = []

export const logoutEvent = {
  subscribe(fn) {
    logoutEventListeners.push(fn)
  },
  trigger(...args) {
    for (const fn of logoutEventListeners) {
      fn(...args)
    }
  }
}

export const loginEventListeners = []

export const loginEvent = {
  subscribe(fn) {
    loginEventListeners.push(fn)
  },
  trigger(...args) {
    for (const fn of loginEventListeners) {
      fn(...args)
    }
  }
}

Why is this better? Now people have to read twice as much stuff just to realize it's the exact same thing going on. Or three or four times as much, as more event emitters get added.

There's a lot of code in the server, so it's important to organize it in ways that make it easy for people to jump around and find what they're looking for. I really don't want to force people to read the details of how I implemented the event emitter logic. If it's written sprawled out, as I did in the second example, then it would be very easy to expect that special treatment is happening to the event emitter that's logout specific, or login specific. If it's written as a reusable, general-purpose class, then a reader can just expect it to behave as a general-purpose class, and they shouldn't feel the need to look into the details of how it was written, in order to find what they were looking for. For this reason, I would write this EventEmitter logic as reusable-ready, even if I only use it once, because it takes just as much effort to do so, it creates just as much code, and it shows that there's nothing magical going on - it's just a general-purpose utility.

Also, the second example is exporting the arrays of event listeners, but is there any reason to do this? This is harkening back to the topic of encapsulation (public/private data), which I promised I would come back to. Why should I publicly expose the internal data of this event emitter? "Just in case" someone needs to access it? If we follow this philosophy, and publicly expose everything "just in case", then that also means we must feel free to reach across the codebase and access any data we want to. After all, this philosophy does seem to assert that it must not be wrong for anyone to grab or modify this data if there's a case to do so. This is the very definition of spaghetti code. All code in the entire codebase can freely access all data and logic contained within the codebase, free of restriction. I wouldn't ever want to work on a moderately-sized codebase like that - I would be afraid to change anything! Because I don't know who might be using it - any piece of code across the entire codebase could be referencing the logic I need to update, and I would have no clue! This is why encapsulation is a thing. If we can't think of any logical reason to let others depend on a piece of data, or on an internal utility function, etc, then all we have to do is mark it as private. Then, when requirements change, we'll find refactoring to be much easier, because we can safely know who's using what. In other words, placing restrictions on your code liberates you to refactor it more freely, because you can trust in those restrictions. And, if someone really needs access to that private member, all they have to do is switch it to public, no big deal. They really shouldn't, because there really is no reason to need access to it directly, but nothing is stopping anyone from turning individual properties public when it's needed.
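A small sketch of the difference, using only the two versions shown earlier. The names openLogoutEvent, sealedLogoutEvent, and calls are made up for this illustration, and the "distant module" is simulated by a single line.

```javascript
// Version with a publicly reachable listeners array.
const logoutEventListeners = []
const openLogoutEvent = {
  subscribe(fn) {
    logoutEventListeners.push(fn)
  },
  trigger(...args) {
    for (const fn of logoutEventListeners) {
      fn(...args)
    }
  }
}

// Version with a private field (same class as earlier in the thread).
class EventEmitter {
  #listeners = []
  subscribe(fn) {
    this.#listeners.push(fn)
  }

  trigger(...args) {
    for (const fn of this.#listeners) {
      fn(...args)
    }
  }
}
const sealedLogoutEvent = new EventEmitter()

const calls = []
openLogoutEvent.subscribe(id => calls.push(`open:${id}`))
sealedLogoutEvent.subscribe(id => calls.push(`sealed:${id}`))

// Some distant module reaches in and clears the exposed array...
logoutEventListeners.length = 0

openLogoutEvent.trigger(1) // nobody is listening any more
sealedLogoutEvent.trigger(1) // still reaches its subscriber

console.log(calls) // [ 'sealed:1' ]
// There is no public "listeners" property to reach for, and writing
// sealedLogoutEvent.#listeners outside the class is a syntax error.
console.log(sealedLogoutEvent.listeners) // undefined
```

With the private field, the only way to affect the listeners is through subscribe and trigger, so a search for those two names finds every caller.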

Of course, this is a moving slider. Too much encapsulation and organization makes the codebase rigid, and difficult to change, because you have to change how a bunch of stuff was organized. But, on the flip side, too little organization and encapsulation also makes the codebase difficult to read or change, but for a different reason. Now you have no guarantee in the safety of the changes you are making, unless you authored the entire codebase (and thus have a relatively good mental image of how it all works).

Now, I said I was going to address your example code snippet, and I haven't forgotten to do that. The code snippet you shared actually suffers a lot from being difficult to scan. Someone looking at it with fresh eyes has to read almost every line of code to figure out what is going on. The only signposts that exist are the explicit comments you laid out. If we put just a little bit of effort into organizing some of this logic, you'll find that it's much easier to scan over the code and find what you're looking for. So, if you don't mind, I'm going to do a little peer-reviewing on this piece of code.

Let's start with this chunk:

// init server and sqlite-datastore
(async function () {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    sqliteDb1.exec(`
CREATE TABLE IF NOT EXISTS cached_session (
    sessionid TEXT PRIMARY KEY NOT NULL,
    userid INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS server_log (
    timestamp TEXT NOT NULL,
    message TEXT NOT NULL
);
-- test session
INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);

    ...
}());

I'm going to make a very simple change. I'm going to take out your comment explaining what's going on, and instead just add a function, whose name explains precisely what the snippet does. The end result is about just as much code, but now we've explicitly grouped together related logic and we've decluttered the long IIFE a bit.

function createInitialTables() {
    sqliteDb1.exec(`
        CREATE TABLE IF NOT EXISTS cached_session (
            sessionid TEXT PRIMARY KEY NOT NULL,
            userid INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS server_log (
            timestamp TEXT NOT NULL,
            message TEXT NOT NULL
        );
        -- test session
        INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);
}

(async function init () {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    createInitialTables();

    ...
}());

Is this any worse than before? It shouldn't be. All we did was move it to a function. Is it any better? Let's rinse and repeat along the whole module and find out.

Full example:
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let http = require("http");
let sqliteDb1;

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

function createAndPopulateInitialTables(sqliteDb) {
    sqliteDb.exec(`
        CREATE TABLE IF NOT EXISTS cached_session (
            sessionid TEXT PRIMARY KEY NOT NULL,
            userid INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS server_log (
            timestamp TEXT NOT NULL,
            message TEXT NOT NULL
        );
        -- test session
        INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);
}

function logLogoutEvent(req) {
    req.urlParsed.search.replace((
        /[?&]sessionid=([^&]*)/
    ), async function (ignore, sessionid) {
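        let data = await new Promise(function (resolve, reject) {
            sqliteDb1.all(`
                SELECT * FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ], function (err, data) {
                // reject instead of throwing inside the callback,
                // so the awaiting caller sees the error
                if (err) {
                    reject(err);
                    return;
                }
                resolve(data);
            });
        });
        // .all() yields an array of rows; it is empty for unknown sessions
        if (!data.length) {
            return "";
        }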
        data = {
            message: `userid ${data[0].userid} logout`,
            timestamp: new Date().toISOString()
        };
        console.log("inserting to sqlite-log: ", data);
        sqliteDb1.all(`
            INSERT INTO server_log VALUES(?, ?);
        `, [
            data.timestamp, data.message
        ]);
    });
}

function invalidateSqlCacheOnLogout(req) {
    req.urlParsed.search.replace((
        /[?&]sessionid=([^&]*)/
    ), function (ignore, sessionid) {
        console.log(`invalidating sessionid ${sessionid}`);
        sqliteDb1.run(`
            DELETE FROM cached_session WHERE sessionid = ?
        `, [
            sessionid
        ]);
    });
}

function handleRequests(req, res) {
    req.urlParsed = require("url").parse(req.url);
    if (req.urlParsed.pathname === "/logout") {
        middlewareLogoutHandler1(req, res);
    }
    res.end();
}

async function startHttpServer(requestHandler, { port }) {
    await new Promise(resolve => {
        http.createServer(requestHandler).listen(port, resolve);
    });
}

(async function() {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    createAndPopulateInitialTables(sqliteDb1);

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

Pay attention to how much easier it is to look over this module and figure out what is going on. When I read the original source code, I pretty much had to read every single line to figure out what was happening. Now, you can quickly glance over the first part, noting that it's just defining a bunch of functions, then you can see the small IIFE and quickly see an overview of everything going on.

(async function() {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    createAndPopulateInitialTables(sqliteDb1);

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

Let's see: it creates a database instance, initializes some tables, adds some logout event listeners, starts an HTTP server, sends a test request, then exits. That's what this program does. Now, if I want to know more about what happens when the cache gets invalidated, I just need to dig into the invalidateSqlCacheOnLogout function. I don't need to understand the entire program.

Before I go further with this review, I want to check if we're good thus far. Have I done anything that just ruins the quality of this code in your eyes? Is it bad to follow conventional wisdom and organize the code into separate functions like this? I plan to take some further baby steps with this piece of code afterwards, and keep pushing it, step by step, until I figure out what steps in the refactoring process you're ok with, and what steps you're uncomfortable with, and why.