Does JavaScript really need stateful and advance features?

This thread is meant to be a central location to discuss a "message passing" concept that @kaizhu256 has been bringing up in a number of threads. The goal is to centralize the discussion around this topic into one location, instead of spreading it across the forms. To see a bit of past discussion, one only needs to search for "message passing" to find a number of places where it's been brought up in the past, but here's a small sampling.

If I were to try and group together some of the themes I've seen from the linked comments, I would probably do so as follows. Feel free to correct me @kaizhu256 if I misrepresent any of your ideas.

  1. Private member variables are pretty much useless in JavaScript (reference)
  2. State is pretty much useless in JavaScript (reference #1 and reference #2)
  3. Most of JavaScript code is non-reusable logic, except for small utility functions. (reference)

Feel free to discuss these items below.

1 Like

And, I'll go first :)

I know lots of people have contradicted these general themes, and for good reason too. JavaScript is used for all sorts of purposes, from thin clients, to thick clients, servers, native mobile apps, native desktop apps, games, libraries, plugins (including plugins for desktop environments, like GNOME), etc. Its usage is ubiquitous, which is why the committee needs to ensure they design features to help all of these parties.

So, let me try and contradict some of these themes with a problem I personally ran into at work. I'll try to explain it at a high level.

I work a lot on our node API layer. Whenever a request comes in, we first grab some information about the user doing the request and some session information. This information is used for a variety of reasons, like logging and authorization. Doing a fresh fetch of this data at the start of each request would be time-consuming, so we instead cache the data in-memory for a short while. Caches always add a fair amount of complexity to a system, because you now have to deal with invalidating the cache, for example, when a user logs out. To handle this situation, I built a simple event-emmitter class. It went something like this:

class EventEmitter {
  #listeners = []
  subscribe(fn) {
    this.#listeners.push(fn)
  }

  trigger(...args) {
    for (const fn of this.#listeners) {
      fn(...args)
    }
  }
}

export const logoutEvent = new EventEmitter()

Now the logout handler can simply call logoutEvent.trigger(userId), and anyone who cares about when a user logs out can react. Likewise, this caching logic that happens at the beginning of a request can listen for this logoutEvent, and correctly invalidate cache entries when the logout event fires. It would be inappropriate to have the logout handler directly load the module corresponding to the request-initializing logic, and ask it to invalidate its cache entry. That would create unnecessary coupling between two completely unrelated parts of the server. We're already headed towards a big pile of spaghetti and I'm not keen on accelerating that process.

That's just one of many concrete examples I have. Are there private properties? Yep, there's no need to be publicly exposing the listeners property, no one else needs to know who else is listening. Is there state? Yep. The listeners array will grow over time. Is there reusable logic? As a matter of fact, I have used this same EventEmitter class for other purposes now, to help keep unrelated parts of the server apart.

why not cache the session to in-memory sqlite, and periodically re-ify its memory-image to disk for persistence? i would argue for most moderately-complex stateful problems, sqlite (or wasm-sqlite in frontend-cases) is the hammer of choice rather than javascript.

1 Like

We probably will eventually store session information in Redis. But, even if we do that, we'd still need to let Redis know when it's time to invalidate a cache entry, which happens when a user logs out (and at other times). Where the request initialization logic decides to store its cache is an internal detail of the request-initialization logic, and shouldn't actually affect anything unrelated to it, which includes the way the logout logic and the event emitter were implemented.

This is also why I made my "state" case around the EventEmitter class instead of our server cache, because it's true that the ideal solution would offload server state to external, in-memory database solutions. However, I wouldn't necessarily call it wrong for smaller server software to keep around stuff in memory either, it could be over-engineering to start grabbing in-memory databases before you really need them.

Of course, it's pretty easy to come up with other scenarios that require state. Like, any game that's implemented with JavaScript. I was almost going to share a sample tic-tac-toe game I threw together, and use that for my opening argument, but decided a work experience might work better.

i typically wouldn't bother with the "reusable" class abstraction you gave, and go directly into implementing the non-reusable part -- with the expectation it'll frequent get rewritten over-and-over again every 3-6 months.

how many instances of the logout-subcription-service would you be running per process? probably just one, in which case a non-reusable global instance would do just fine.

#!/usr/bin/env node

node --eval '
/*jslint beta, node*/
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let sqliteDb1;

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

// init server and sqlite-datastore
(async function () {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    sqliteDb1.exec(`
CREATE TABLE IF NOT EXISTS cached_session (
    sessionid TEXT PRIMARY KEY NOT NULL,
    userid INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS server_log (
    timestamp TEXT NOT NULL,
    message TEXT NOT NULL
);
-- test session
INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);

    // subscribe listener to log user-logout
    globalLogoutListenerList1.push(function (req) {
        req.urlParsed.search.replace((
            /[?&]sessionid=([^&]*)/
        ), async function (ignore, sessionid) {
            let data = await new Promise(function (resolve) {
                sqliteDb1.all(`
                     SELECT * FROM cached_session WHERE sessionid = ?
                `, [
                    sessionid
                ], function (err, data) {
                    if (err) {
                        throw err;
                    }
                    resolve(data);
                });
            });
            if (!data) {
                return "";
            }
            data = {
                message: `userid ${data[0].userid} logout`,
                timestamp: new Date().toISOString()
            };
            console.log("inserting to sqlite-log: ", data);
            sqliteDb1.all(`
                 INSERT INTO server_log VALUES(?, ?);
            `, [
                data.timestamp, data.message
            ]);
        });
    });

    // subscribe listener to invalidate sqlite-cache on logout
    globalLogoutListenerList1.push(function (req) {
        req.urlParsed.search.replace((
            /[?&]sessionid=([^&]*)/
        ), function (ignore, sessionid) {
            console.log(`invalidating sessionid ${sessionid}`);
            sqliteDb1.run(`
                DELETE FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ]);
        });
    });

    await new Promise(function (resolve) {
        require("http").createServer(function (req, res) {
            req.urlParsed = require("url").parse(req.url);
            if (req.urlParsed.pathname === "/logout") {
                middlewareLogoutHandler1(req, res);
            }
            res.end();
        }).listen(8080, resolve);
    });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    require("http").get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

// stdout:
/*
server listening on port 8080
invalidating sessionid session1
inserting to sqlite-log:  {
  message: 'userid user1 logout',
  timestamp: '2021-11-12T07:00:50.731Z'
}
*/
'

May I ask you this: You normally talk about how JavaScript is just a glue language, and we shouldn't try to do anything too complicated with it.

So, could you give me an example where, in another language, you would find it acceptable to write reuseable code? Or code with state? Or code with private properties? When's a time that you have done it?

(I'll also get to responding to your code snippet in a momennt)

So, could you give me an example where, in another language, you would find it acceptable to write:

  1. code with private properties?
  • i've come across zero use-case warranting private-properties in other languages (note i don't do much crypto/security programming however).
  • in c, i use static keyword to prevent name-collision of globally-scoped static-functions
  1. reuseable code?
  • in c i've implemented reusable base64encoder and other serializers, which are unnecessary in javascript
  • in c# i've implemented reusable dynamic dictionaries to parse incoming dynamic, json-payloads from the web, but again, its unnecessary in javascript
  1. Or code with state?
  • simple dictionaries for storing json-config from filesystems
  • simple dictionaries for caching oauth-tokens / cookies for reuse in http-requests
  • undo-history in text-editors (yes this, is an exception that's moderately complex).
  • for most other things, its either sqlite / wasm-sqlite for moderate state-complexity or a full-blown database for heavylifting.

a recurring theme i've encountered for userland-reusable-code in other languages is they commonly deal with serializing / deserializing message-passed web-data -- features already baked into javascript over its 20+ year evolution in commercial web-space.

You seem to misunderstand the purpose of private properties. It has nothing to do with crypto or security, in fact, many languages that offer private properties also offer some form of reflection, making private properties unsuitable for security purposes. I'll talk more about the value of encapsulation further down.

It sounds like you were simply implementing features you needed which were missing from the standard library. JavaScript happens to come with support for JSON and base64, but there's plenty of features that don't ship with JavaScript, like, an ini parser. We store some of our configuration in ini files, and so we've installed a reusable library to help us parse these config files (I prefer installing libraries over hand-rolling my own parsers).

These all sound like very common tasks that would be performed within node as well. And undo-history is something that webapps commonly need. The common theme I'm seeing between all of these is that they're not tasks that are somehow excluded from JavaScript, people using JavaScript need to do tasks like these all of the time as well - it is, after all, a general-purpose language that's used all sorts of reasons.

C#, as a language is often used for creating server software, but it's also sometimes used for other purposes such as UIs. These use cases all overlap with how JavaScript gets used. And, it sounds like you've used it for server-side development. So, my next questions: if C# shipped with good, native support for serializing and deserializing data, would it just become a "message passing" language as well? You only use C# to message pass between clients and databases? What makes node so special that it's only used for "message passing" while other server-side language are not? Are they not both used for the same reasons?

Man, you're working on a fast-changing product. We've got code that's many years old all over the place. And, I expect a good portion of the new code that I write to eventually age the same way. Sure there will be parts that will be frequently updated, but other parts will simply get old, passed on to other developers, and left for them to figure out for themselves and understand when it finally comes time for them to update it.

The logout event isn't the only event emitter I've created. I've actually reused the same class to create a login event as well later on, that's used for an entirely different purpose. And, I'll likely use it for more reasons in the future. If I ever need to add an eventEmmitter in the future that has different requirements, or if an existing one needs new features, I can just make a different EventEmmitter class for it, so I'm not too worried about changing requirements (I don't publicly export the EventEmitter class anyways, I "encapsulate" the logic so it's easy to change implementation details).

Also, because the eventEmitter logic doesn't apply to either the logout code or the request-initialization logic, I put it in its own separate location, an "events.js" file.

The entire contents of the file would pretty much be the example code snippet I posted before, but with a login event as well.

class EventEmitter {
  #listeners = []
  subscribe(fn) {
    this.#listeners.push(fn)
  }

  trigger(...args) {
    for (const fn of this.#listeners) {
      fn(...args)
    }
  }
}

export const logoutEvent = new EventEmitter()
export const loginEvent = new EventEmitter()

It sounds like you think it's better to write it like this?

export const logoutEventListeners = []

export const logoutEvent = {
  subscribe(fn) {
    logoutEventListeners.push(fn)
  },
  trigger(...args) {
    for (const fn of logoutEventListeners) {
      fn(...args)
    }
  }
}

export const loginEventListeners = []

export const loginEvent = {
  subscribe(fn) {
    loginEventListeners.push(fn)
  },
  trigger(...args) {
    for (const fn of loginEventListeners) {
      fn(...args)
    }
  }
}

Why is this better? Now people have to read twice as much stuff just to realize it's the exact same thing going on. Or three or four times as more event emitter's get added.

There's a lot of code in the server, so it's important to organize it in ways that make it easy for people to jump around and find what they're looking for. I really don't want to force people to read the details of how I implemented the event emitter logic. If it's written sprawled out, as I did in the second example, then it would be very easy to expect that special treatment is happening to the event emitter that's logout specific, or login specific. If it's written as a reusable, general-purpose class, then a reader can just expect it to behave as a general-purpose class, and they shouldn't feel the need to look into the details of how it was written, in order to find what they were looking for. For this reason, I would write this EventEmitter logic as reusable-ready, even if I only use it once, because it takes just as much effort to do so, it creates just as much code, and it shows that there's nothing magical going on - it's just a general-purpose utility.

also, the second example is exporting the arrays of event listeners, but is there any reason to do this? This is harkening back to the topic of encapsulation (public/private data), which I promised I would come back to. Why should I publicly expose the internal data of this event emitter? "Just in case" someone needs to access it? If we follow this philosophy, and publicly expose everything "just in case", then that also means we must feel free to reach across the codebase and access any data we want to, after all, this philosophy does seem to assert that it must not be wrong for anyone to grab or modify this data if there's a case to do so. This, is the very definition of spaghetti code. All code in the entire codebase can freely access all data and logic contained within the codebase, free of restriction. I wouldn't ever want to work on a moderately-sized codebase like that - I would be afraid to change anything! Because I don't know who might be using it - any piece of code across the entire codebase could be referencing the logic I need to update, and I would have no clue! This is why encapsulation is a thing. If we can't think of any logical reason to let others depend on a piece of data, or on an internal utility function, etc, then all we have to do is mark it as private. Then, when requirements change, we'll find refactoring to be much easier, because we can safely know who's using what. In other words, placing restrictions on your code liberates you to refactor it more freely, because you can trust in those restrictions. And, if someone really needs access to that private member, all they have to do is switch it to public, no big deal. They really shouldn't, because there really is no reason to need access to it directly, but nothing is stopping anyone from turning individual properties public when it's needed.

Of course, this is a moving slider. Too much encapsulation and organization makes the codebase rigid, and difficult to change, because you have to change how a bunch of stuff was organized. But, on the flip side, too little organization and encapsulation also makes the codebase difficult to read or change, but for a different reason. Now you have no guarantee in the safety of the changes you are making, unless you authored the entire codebase (and thus have a relatively good mental image of how it all works).

Now, I said I was going to address your example code snippet, and I haven't forgotten to do that. The code snippet you shared actually suffers a lot from being difficult to scan. Someone looking at it with fresh eyes has to read almost every line of code to figure out what is going on. The only signposts that exist are the explicit comments you laid out. If we put just a little bit of effort into organizing some of this logic, you'll find that it's much easier to scan over the code and find what you're looking for. So, if you don't mind, I'm going to do a little peer-reviewing on this piece of code.

Let's start with this chunk:

// init server and sqlite-datastore
(async function () {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    sqliteDb1.exec(`
CREATE TABLE IF NOT EXISTS cached_session (
    sessionid TEXT PRIMARY KEY NOT NULL,
    userid INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS server_log (
    timestamp TEXT NOT NULL,
    message TEXT NOT NULL
);
-- test session
INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);

    ...
}());

I'm going to make a very simple change. I'm going to take out your comment explaining what's going on, and instead just add a function, who's name explains precisely what the snippet does. The end result is about just as much code, but now we're explicitly grouped together related logic and we've decluttered the long IIFE a bit.

function createInitialTables() {
    sqliteDb1.exec(`
        CREATE TABLE IF NOT EXISTS cached_session (
            sessionid TEXT PRIMARY KEY NOT NULL,
            userid INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS server_log (
            timestamp TEXT NOT NULL,
            message TEXT NOT NULL
        );
        -- test session
        INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);
}

(async function init () {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    createInitialTables();

    ...
}());

Is this any worse than before? It shouldn't be. All we did was move it to a function. Is it any better? Let's rinse and repeat along the whole module and find out.

full example
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let http = require("http");
let sqliteDb1;

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

function createAndPopulateInitialTables(sqliteDb) {
    sqliteDb.exec(`
        CREATE TABLE IF NOT EXISTS cached_session (
            sessionid TEXT PRIMARY KEY NOT NULL,
            userid INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS server_log (
            timestamp TEXT NOT NULL,
            message TEXT NOT NULL
        );
        -- test session
        INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `);
}

function logLogoutEvent(req) {
    req.urlParsed.search.replace((
        /[?&]sessionid=([^&]*)/
    ), async function (ignore, sessionid) {
        let data = await new Promise(function (resolve) {
            sqliteDb1.all(`
                SELECT * FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ], function (err, data) {
                if (err) {
                    throw err;
                }
                resolve(data);
            });
        });
        if (!data) {
            return "";
        }
        data = {
            message: `userid ${data[0].userid} logout`,
            timestamp: new Date().toISOString()
        };
        console.log("inserting to sqlite-log: ", data);
        sqliteDb1.all(`
            INSERT INTO server_log VALUES(?, ?);
        `, [
            data.timestamp, data.message
        ]);
    });
}

function invalidateSqlCacheOnLogout(req) {
    req.urlParsed.search.replace((
        /[?&]sessionid=([^&]*)/
    ), function (ignore, sessionid) {
        console.log(`invalidating sessionid ${sessionid}`);
        sqliteDb1.run(`
            DELETE FROM cached_session WHERE sessionid = ?
        `, [
            sessionid
        ]);
    });
}

function handleRequests(req, res) {
    req.urlParsed = require("url").parse(req.url);
    if (req.urlParsed.pathname === "/logout") {
        middlewareLogoutHandler1(req, res);
    }
    res.end();
}

async function startHttpServer(requestHandler, { port }) {
    await new Promise(resolve => {
        http.createServer(requestHandler).listen(port, resolve);
    });
}

(async function() {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    createAndPopulateInitialTables(sqliteDb1);

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

Pay attention to how much easier it is to look over this module and figure out what is going on. When I read the original source code, I pretty much had to read every single line to figure out what was happening. Now, you can quickly glance over the first part, noting that it's just defining a bunch of functions, then you can see the small IIFE and quickly see an overview of everything going on.

(async function() {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    createAndPopulateInitialTables(sqliteDb1);

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

Lets see - it created a database instance, initializes some tables, adds some logout event listeners, starts an https server, sends a test request, then exits. That's what this program does. Now, if I want to know more about what happens when the cache gets invalidated, I just need to dig into the invalidateSqlCacheOnLogout function. I don't need to understand the entire program.

Before I go further with this review, I want to check if we're good thus far. Have I done anything that just ruins the quality of this code in your eyes? Is it bad to follow conventional wisdom and organize the code into separate functions like this? I plan to take some further baby steps with this piece of code afterwards, and keep pushing it, step by step, until I figure out what steps in the refactoring process you're ok with, and what steps you're uncomfortable with, and why.

Well, I had spent a while going over that code example, and trying to figure out what I didn't like about it and why, and wrote up this whole review on it that expressed my thoughts, just to find out that I ended up writing a really long monster, which is why I simply started with that first half. But, as I already have a second half written, and I don't want that work to go to waste, I'll go ahead and post the rest of my thoughts.

Keep in mind that I'm totally cool if you don't agree with these thoughts or my conclusion, but it should hopefully at least express my point of view on the matter. I'm also able to be swayed in my stance given a sufficiently convincing argument - someone has recently convinced me that it's ok (and maybe even encouraged) to write longer functions, as long as you follow certain techniques to organize the logic within the function - looking at what I had written with fresh eyes and the perspective this person gave me, I see that that have I probably subdivided your code snippet into too many functions, but I'll just leave the review the way I had written it. In the end, I think it just comes down to how flexible we want our codebases to be, and as we both work in very different settings, we're going to have very different needs related to flexibility (a point that I addressed near the end).

So, here's the second half of my review on your code snippet.



Each of those function definitions are doing a lot of low-level database logic, which makes it difficult to read any particular function. You have to read the entire thing, step by step, to understand what's going on. Perhaps we can make this code easier to read over and understand at a glance with just a little bit of refactoring. Let's move all of the low-level database accessing logic together, into its own place.

I'm also going to toss your regex logic. You seem to be abusing a replacer function parameter in a pretty unconventional and unnecessary way. I'll just simplify that to use URLSearchParams() to parse the query parameters. This change is mostly for me, to help me work with your code, but I wanted to explicitly call out why I did it.

Full code
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let http = require("http");
let sqliteDb1;

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

const databaseActions = {
    initializeTables: sqlliteDb => sqlliteDb.exec(`
        CREATE TABLE IF NOT EXISTS cached_session (
            sessionid TEXT PRIMARY KEY NOT NULL,
            userid INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS server_log (
            timestamp TEXT NOT NULL,
            message TEXT NOT NULL
        );
    `),
    insertDummyTestSession: sqlliteDb => sqlliteDb.exec(`
        INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `),
    async getSession(sqliteDb, sessionid) {
        return new Promise((resolve, reject) => {
            sqliteDb.all(`
                SELECT * FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ], function (err, data) {
                if (err) return reject(err);
                resolve(data);
            });
        });
    },
    addLogEntry(sqliteDb, { timestamp, message }) {
        sqliteDb.all(`
            INSERT INTO server_log VALUES(?, ?);
        `, [
            timestamp, message
        ]);
    },
    removedCachedSession(sqliteDb, sessionid) {
        sqliteDb.run(`
            DELETE FROM cached_session WHERE sessionid = ?
        `, [
            sessionid
        ]);
    },
};

async function logLogoutEvent(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    let data = await databaseActions.getSession(sqliteDb1, sessionid);
    if (!data) {
        return "";
    }
    data = {
        message: `userid ${data[0].userid} logout`,
        timestamp: new Date().toISOString()
    };
    console.log("inserting to sqlite-log: ", data);
    databaseActions.addLogEntry(sqliteDb1, data);
}

function invalidateSqlCacheOnLogout(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    console.log(`invalidating sessionid ${sessionid}`);
    databaseActions.removedCachedSession(sqliteDb1, sessionid);
}

function handleRequests(req, res) {
    req.urlParsed = require("url").parse(req.url);
    if (req.urlParsed.pathname === "/logout") {
        middlewareLogoutHandler1(req, res);
    }
    res.end();
}

async function startHttpServer(requestHandler, { port }) {
    await new Promise(resolve => {
        http.createServer(requestHandler).listen(port, resolve);
    });
}

(async function() {
    sqliteDb1 = new moduleSqlite.Database(":memory:");
    databaseActions.initializeTables(sqliteDb1);
    databaseActions.insertDummyTestSession(sqliteDb1);

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

There we go! Now we don't have a bunch of low-level database access logic cluttering our higher-level functions. Plus, the database access logic is all centralized in a single place, so if we ever need to update the shape of any table (a semi-common need), we just have to edit a single point, instead of refactoring our entire codebase.

Before:

function logLogoutEvent(req) {
    req.urlParsed.search.replace((
        /[?&]sessionid=([^&]*)/
    ), async function (ignore, sessionid) {
        let data = await new Promise(function (resolve) {
            sqliteDb1.all(`
                SELECT * FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ], function (err, data) {
                if (err) {
                    throw err;
                }
                resolve(data);
            });
        });
        if (!data) {
            return "";
        }
        data = {
            message: `userid ${data[0].userid} logout`,
            timestamp: new Date().toISOString()
        };
        console.log("inserting to sqlite-log: ", data);
        sqliteDb1.all(`
            INSERT INTO server_log VALUES(?, ?);
        `, [
            data.timestamp, data.message
        ]);
    });
}

After:

const databaseActions = {
    ...
    addLogEntry(sqliteDb, { timestamp, message }) {
        sqliteDb.all(`
            INSERT INTO server_log VALUES(?, ?);
        `, [
            timestamp, message
        ]);
    },
    ...
}

async function logLogoutEvent(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    let data = await databaseActions.getSession(sqliteDb1, sessionid);
    if (!data) {
        return "";
    }
    data = {
        message: `userid ${data[0].userid} logout`,
        timestamp: new Date().toISOString()
    };
    console.log("inserting to sqlite-log: ", data);
    databaseActions.addLogEntry(sqliteDb1, data);
}

That really cleans it up.

But, there's one more thing. I mentioned before that one of the valuable reasons for this refactor is that it allows you to look in one place to see all of the database actions. It becomes extremely easy to update the shape of a table, because you know exactly where to find all of the code that uses that table. Almost. There's actually nothing stopping a programmer from using that table directly, and there's no way to know if someone is doing direct calls to the database without searching through the whole codebase. Unless, we introduce some encapsulation, to provide some guarantees to future readers that database access only happens within that section of the codebase. There's many ways to achieve this - it could be private module-level variables that you don't export, or a factory function. It's really all the same. I'm just going to use a class for it.

Full Example
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let http = require("http");
let datastore

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

class DataStore {
    #sqliteDb;
    constructor() {
        this.#sqliteDb = new moduleSqlite.Database(":memory:");
        Object.freeze(this);
    }

    initializeTables = () => this.#sqliteDb.exec(`
        CREATE TABLE IF NOT EXISTS cached_session (
            sessionid TEXT PRIMARY KEY NOT NULL,
            userid INTEGER NOT NULL
        );
        CREATE TABLE IF NOT EXISTS server_log (
            timestamp TEXT NOT NULL,
            message TEXT NOT NULL
        );
    `)

    insertDummyTestSession = () => this.#sqliteDb.exec(`
        INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
    `)

    async getSession(sessionid) {
        return new Promise((resolve, reject) => {
            this.#sqliteDb.all(`
                SELECT * FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ], function (err, data) {
                if (err) return reject(err);
                resolve(data);
            });
        });
    }

    addLogEntry({ timestamp, message }) {
        this.#sqliteDb.all(`
            INSERT INTO server_log VALUES(?, ?);
        `, [
            timestamp, message
        ]);
    }

    removedCachedSession(sessionid) {
        this.#sqliteDb.run(`
            DELETE FROM cached_session WHERE sessionid = ?
        `, [
            sessionid
        ]);
    }
}

async function logLogoutEvent(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    let data = await datastore.getSession(sessionid);
    if (!data) {
        return "";
    }
    data = {
        message: `userid ${data[0].userid} logout`,
        timestamp: new Date().toISOString()
    };
    console.log("inserting to sqlite-log: ", data);
    datastore.addLogEntry(data);
}

function invalidateSqlCacheOnLogout(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    console.log(`invalidating sessionid ${sessionid}`);
    datastore.removedCachedSession(sessionid);
}

function handleRequests(req, res) {
    req.urlParsed = require("url").parse(req.url);
    if (req.urlParsed.pathname === "/logout") {
        middlewareLogoutHandler1(req, res);
    }
    res.end();
}

async function startHttpServer(requestHandler, { port }) {
    await new Promise(resolve => {
        http.createServer(requestHandler).listen(port, resolve);
    });
}

(async function() {
    datastore = new DataStore();
    datastore.initializeTables();
    datastore.insertDummyTestSession();

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

Perfect! Overall, we didn't really add that much bloat to the code. All we did was organize it a bit, using coding practices that are pretty standard for any language.

I get the feeling that you like to write your JavaScript code so that it has optimum flexibility, i.e. you don't like putting "arbitrary" restrictions on your code. - I hope you realize that this comes at a huge cost - it's very much not scalable. A repository that's created for maximum flexibility is one with absolutely no organization. Say you keep all of your data public, just in case someone, somewhere needs to use or edit it. And, say you keep all of your methods public for the same reason. And, if you're ok opening up permissions on your functions and data to allow anyone to access it freely, you must also be ok freely accessing functions and data from anywhere in the codebase. This, is the very definition of spaghetti code. Code that has no rules or constraints. Code that's free to access whatever from wherever it wants to. We place restrictions on our codebase in order to add predictability to it. Sure it's possible that some future developer might want to do a direct database query, and if they really want to do so, they can simply flip the private "#sqliteDb" property to public and do it. Nothing's stopping that future developer from making stuff public. However, as long as it's private, any future readers can be certain that requests to that database instance are only coming out of that single class, no where else. And it's exactly because of that kind of restriction that this sort of code becomes much more readable and maintainable. You can rely on things being true, without having to assume. You don't have to read the whole codebase in order to simply add a column to a table, you only need to read the part that matters, because you can see that the restrictions that are in place are preventing any other part of the application from touching the database instance.

Of course, this is a moving slider. Too much organization is also a bad thing. The more you organize your code, the more rigid you make it, and the more refactoring you have to do if changes come along that go against the way it's been organized. So, there's a balance. Everything discussed thus far is true for any programming language out there, it's nothing unique to JavaScript. People end up with large codebases in all of the major general purpose languages out there, and they have to figure out the right amount of organization to put into it - how much to break apart the functions, what kinds of abstractions to put in, etc.

1 Like

for this scenario, do you think javascript really needs stateful and advanced features, when from employer's perspective, you and and i were hired to write a one-off, dumb/stateless message-passing program between browser <-> sqlite?

[non-database] scalability is the exception rather than rule for most web-projects. there's still too many scalability-minded features in your program when there's no need for them.

for example, why do you need a DataStore class abstracting sqlite-queries?

  • when i'm troubleshooting an end-to-end communication issue between browser <-> server, i want to directly inspect the sql-queries run against the requests
  • preferably with both request and sql-queries self-contained in the same function so i can read it top-down, and you've added a roadblock to that process.
  • and if the storage-engine were changed to redis or such, it makes more sense to rewrite the entire middleware for redis, than to reuse a class originally designed for sqlite.

yes i probably wouldn't work well in projects with large teams (> 5 devs), or writing reusable modules, which is fine. most project work fall below that size, and most "modules" never get reused outside of initial projects they were written for.

Indeed, if your employer is only hiring JavaScript programmers for that very limited purpose, your employer wouldn't find value here. Most employers, however, hire for a vastly more complex range of JavaScript expertise, and that's who we should be optimizing for.

(I also suspect "sql-lite" isn't anywhere near as common a use case as "sql", but that's not really relevant)

2 Likes

Well...

  1. An auth-system doesn't sound "one-off", it sounds like an important piece of the product that'll stick around for a while. If you're really building a one-off auth piece that'll last a few months then be thrown away, then sure, it can be written however you want, since no ones going to look at it again. But most login systems I know of last years before they're rewritten, and often get periodically updated throughout their lifetime, as they're updated to match evolving security standards. (At least, that's how the auth system in the company I work at works - our old one is years old, and I have constantly been doing fixes, performance improvements, and touch-ups to it)
  2. I wouldn't really call functions or classes "advance", and the code I wrote there isn't that stateful, but if you find these to be overly advance, then it could be dumbed down further. I did mention that classes weren't the only way to go to achieve this. Here's a quick and dirty untested example that has zero state and no classes, it's just functions, so no advance features. But, conceptually, this really isn't that different from the earlier example, it's just build using different language constructs.
Full Example
let globalLogoutListenerList1 = [];
let moduleSqlite = require("sqlite3");
let http = require("http");

function middlewareLogoutHandler1(req, res) {
    globalLogoutListenerList1.forEach(function (listener) {
        listener(req, res);
    });
}

let datastore;
{
    const sqliteDb = new moduleSqlite.Database(":memory:");
    datastore = {
        initializeTables: () => this.#sqliteDb.exec(`
            CREATE TABLE IF NOT EXISTS cached_session (
                sessionid TEXT PRIMARY KEY NOT NULL,
                userid INTEGER NOT NULL
            );
            CREATE TABLE IF NOT EXISTS server_log (
                timestamp TEXT NOT NULL,
                message TEXT NOT NULL
            );
        `),

        insertDummyTestSession: () => this.#sqliteDb.exec(`
            INSERT INTO cached_session VALUES (\u0027session1\u0027, \u0027user1\u0027);
        `),

        getSession: async (sessionid) => {
            return new Promise((resolve, reject) => {
                this.#sqliteDb.all(`
                    SELECT * FROM cached_session WHERE sessionid = ?
                `, [
                    sessionid
                ], function (err, data) {
                    if (err) return reject(err);
                    resolve(data);
                });
            });
        },

        addLogEntry: ({ timestamp, message }) => {
            this.#sqliteDb.all(`
                INSERT INTO server_log VALUES(?, ?);
            `, [
                timestamp, message
            ]);
        },

        removedCachedSession: (sessionid) => {
            this.#sqliteDb.run(`
                DELETE FROM cached_session WHERE sessionid = ?
            `, [
                sessionid
            ]);
        },
    };
}

async function logLogoutEvent(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    let data = await datastore.getSession(sessionid);
    if (!data) {
        return "";
    }
    data = {
        message: `userid ${data[0].userid} logout`,
        timestamp: new Date().toISOString()
    };
    console.log("inserting to sqlite-log: ", data);
    datastore.addLogEntry(data);
}

function invalidateSqlCacheOnLogout(req) {
    let sessionid = new URLSearchParams(req.urlParsed.search).get("sessionid");
    console.log(`invalidating sessionid ${sessionid}`);
    datastore.removedCachedSession(sessionid);
}

function handleRequests(req, res) {
    req.urlParsed = require("url").parse(req.url);
    if (req.urlParsed.pathname === "/logout") {
        middlewareLogoutHandler1(req, res);
    }
    res.end();
}

async function startHttpServer(requestHandler, { port }) {
    await new Promise(resolve => {
        http.createServer(requestHandler).listen(port, resolve);
    });
}

(async function() {
    datastore.initializeTables();
    datastore.insertDummyTestSession();

    globalLogoutListenerList1.push(logLogoutEvent, invalidateSqlCacheOnLogout);

    await startHttpServer(handleRequests, { port: 8080 });
    console.log("\n\nserver listening on port 8080");

    // test logout api
    http.get("http://localhost:8080/logout?sessionid=session1");

    // exit process after 500 ms
    setTimeout(process.exit, 500);
}());

You're not the only one who's going to work on this piece of code. If you are, then sure, write it however you want - no one knows a chunk of code as well as the original author, so it's much easier for the original author to navigate and fix things in it. But if you work with a team of any size, or if a future engineer will eventually inherit this code, then you want to write it in a way that's easy to scan and understand, without forcing them to understand the whole entire thing. That's the point of dividing up the code this way, so when a future developer is trying to fix a bug or add a feature, they can see something like addLogEntry(), quickly know that that's not what they want, and move on until they find what they're looking for. If the entire codebase is just a bunch of low-level code, then the only way to achieve a high-level understanding of the codebase is by understanding the low-level pieces and piecing it all together yourself. (the original author, you, would already have this kind of high level understanding of the code because you wrote it, which makes it easy to forget that others coming into the codebase may have a much more difficult time getting a foothold of how it all works together).

One of your major arguments against this sort of abstraction seems to be that you lose the sequential order of how these instructions run. I can sympathize with that concern, and if you wanted to inline a couple of those non-database-related functions then I wouldn't blame you. But, I wouldn't want ever want to inline those ugly database transactions into the higher-level logic. You're criticizing the code because I've prematurely refactored it for speculative purposes that one day we may need to change how we store the data (I cited changing the schema as an example, which is a pretty common thing to do - I hope you're not wanting to rewrite the entire middleware every time you update the schema to support something new). This is a useful purpose for this sort of refactoring, but honestly, for me, it's not the primary purpose. The primary reason why I like the code split up this way is simply that it's a little more organized. It's easier to find a fork when you need it, if all you need to do is look in a utensil draw, it's much harder when the forks are simply left in the kitchen in places where the last person found it convenient to leave them. (but organization can be overdone, after all, where are you going to place a spork if you have separate drawers for forks and spoons).

Perhaps let's look at it from this perspective:

Each abstraction we introduce to our codebase has a cost. Some cost a lot more than others, and only pay off under certain circumstances that don't apply to everyone. It's always a balance to decide which abstractions to use and when. From the way you talk, you seem to think that all JavaScript programming (whether you're working on the front-end, back-end, writing reusable npm libraries, etc), all of it is one-off coding, where the author writes it once and never touches it again. People only write new JavaScript and they never modify existing JavaScript, especially code that other developers have written. If you're in an organization that operates this way, then you should use very few abstractions, because they aren't going to help your code very much, still though, there are some very cheap abstractions you can always use that cost almost nothing, for example, even if you're just throwing together a thousand lines of code at top speed, it can still be helpful to, at a minimum, organize larger chunks of your code into some functions. At the same time, many other good practices just aren't worth it if all you're doing is throwing something small together, like, there might not be a need to bother with type-safety and what-not.

So, perhaps the real question isn't "is your code 'good' or is my code 'good'", rather, it's "is your code "good" according to the purpose it was written for. Were you trying to make a login system that the company was planning on tossing in a couple of months? If so, then your code would work just fine. Were you trying to make a login system that would sit around for decades, and would need the occasional update from developers who aren't that familiar with this code? Then perhaps my solution would be a little friendlier to the developers unfamiliar with the code.

Now, the original topic of this thread was about trying to ask if JavaScript needed more advance language features. For the answer to this question to be "yes", we only need a subgroup of JavaScript developers who need these sorts of features to write stable code. If other developers are overabstracting their code when they don't need to, that's their problem. But, I do believe that it's important that we provide these language features for all of those people out there who actually need them. For those companies who have login screens that last more than a couple of months, and for those companies who write UI pages that they periodically update, and for the people who make NPM packages who don't want you abusing the internal details of the package, so they can release changes without fear of breaking your code. I think a large portion of the community has a need for at least some of these features but I have no data to back that up. You seem to think it's a lot less, which is fine. What's important is that there's a sizeable portion of the community that uses these features, whether that's 90% or 1% (1% is still a whole lot of people).

the "never touched again" part is off. i fully expect my code to be 100% rewritten in order to meet new ux-needs. rewritten so much in fact, that i don't bother making it abstractable because even the abstractable parts are expected to be rewritten over a short time-span.

javascript code that is "never touched", are typically integration-level tests permanently kept to prevent high-level ux-regressions from all the endless rewriting.

but yes, i do think all of javascript programming as one-off coding. and there will eventually come a day when software-engineering becomes a niche profession -- where lay-people figure out how to accomplish ux-tasks themselves with "one-off" scripts (that are continually rewritten) thanks to javascript.

Well, that's fine if you're coding in that sort of scenario, but I don't think most people are. When a security flaw is found in a library, I don't think they're rewriting the whole thing to fix it.

In fact, I'm going to go ahead and try and bring a little concrete evidence into this. I'm going to look at the top npm packages to find out how old their JavaScript code gets. Specifically, I'll pick random files to do a git blame, to see if it has a bunch of minor changes over the years, or giant sweeping, rewrite-style changes.

#1 is lodash, which surprisingly isn't being updated that often, however, it still does receive the occasional update, the latest of which was last April. Looking at the git history, most updates are minor clean-up items (a performance fix here, a jsdoc update there). Certainly, there's not anything that warrants an entire rewrite. Some of its source code has been recently updated, but a lot of the code hasn't been touched in years. It seems that about five years ago, they underwent a giant refactor that moved code around and reformatted it. Before that point, all of the code existed in a single, 17,000-line long file. Doing a blame on that showed all sorts of active development that spanned years, where lots of pieces of functions were being updated. Some parts were as old as 10 years, without getting touched. So, during the entire history of lodash, it seems it's gone through one major refactor (perhaps it was even just a reorganization, I didn't look closely to see if they actually changed any of the functions, or if they just moved them). It does not seem like it's been rewritten entirely from scratch.

#2 is chalk. It doesn't seem to have a ton of source code to it, but looking at the blame for the index.js file alone, I can see that it's receiving minor updates over the course of a long history. Some bits are 7 years old, other bits are only a few months. This single file shows that there's been a lot of incremental improvements to get to where it is today. Certainly not a rewrite.

#3 is the request library, which appears to now be deprecated. I could look into it to see what it holds, but for the sake of time, decided to skip it.

Now, perhaps it's unfair to simply look at the biggest packages out there, because bigger packages develop differently than smaller ones. So, I looked at NPM's "Recently updated packages" to see some examples from random packages.

The first on that was listed that had a github repo and wasn't brand new was a package by the name of tcp, which clocked in at a meer three downloads per week. Its entire 121 lines of source code fit in a single file, and from the looks of the blame, it contained updated as recent as, well, minutes ago, and as old as two years. Once again, it's a bunch of small little updates to a codebase over years, not giant, sweeping refactors.

Next up was core-for-zowe-sdk, which clocked in at 1,000 downloads per week. I viewed some random files, and indeed, I did run into some files that were created a bit ago, and barely got updated since then - these files seemed to support your theory of code only being written once, then never touched (but they were still relatively new). I also found other files that were much more active, with a number of minor modifications being made over time.

I could go on, but I think I'll stop there. Feel free to look at the git blames for any of these packages to see for yourself how often rewrites happen compared to minor updates, or if you want, you can also look at newer npm packages, or even newer JavaScript github repos (I didn't venture there, but I'm sure I could, and I'm sure I would see a lot of the same). But, I think that should be sufficient evidence to support the fact that a good portion of the JavaScript community does, in fact, need some abstraction tools to help them as they make minor modifications to their code over time.

2 Likes

I can agree that lot of code written by me commercially can't be easily reused outside of the original project, except for integration with third-party providers. But mere existence of heavily dependent upon libraries with active development proves that experience isn't universal.

1 Like

if you want to ship ux-features using software-engineering rather than one-off-coding / dynamic-scripting / hacks, you shouldn't have chosen a scripting-language in the first-place.

javascript's popularity is because it allow non-software-engineers to ship useful web-products via simple and intuitive hacks and glue-code.

making javascript less-intuitive to lay-people by adding needlessly complicated features so software-engineers can stay relevant (and employed) is not in best interest of majority of its users -- and yes, this is a biased political opinion from a layperson such as myself.

oooh, interesting. So perhaps this is the real heart of the issue. So, what's really going on is you feel the moment someone starts to have a need for anything moderately complex from the JavaScript language, then they've picked the wrong tool for the job, and they should have chosen to use a different language. Basically, in your opinion, the tools like NPM, node, the front-end frameworks like React and Angular, TypeScript, etc, all of these should not exist, because people should just keep their JavaScript code super simple and hacky, and should use other languages to solve problems that are more complex.

But, this doesn't sound to always be feasible. May I ask you what you feel you're supposed to do if you're developing a larger-scale application like Google Docs? It's a thick client that has to worry about all sorts of complex rendering and syncing logic, and I'm sure there's a lot of complexity to its underlying codebase. It's got to be every bit as complex as the software we build in any other language. What are the developers supposed to do in this case, if we went your route, and didn't offer any tools in the language to help people build large-scale applications in JavaScript? Do you believe that webapps like Google Drive just shouldn't exist, and people should instead download installable software?

Are you also against the general concept of anyone ever trying to build a thick client? Should we go back to the days when all webpages are server-side renders, and all actions you do on a webpage requires a round-trip to the server to be handled? Any moderately complex feature, like "saving what you write as you write, as a draft" should not be included in a webpage, because that's just too much complexity on the front-end.

1 Like

yes that is mostly correct, with the exception of npm and node (which made javascript more accessible to non-programmers). if you're writing something super-complicated in your javascript message-passing-program (instead of in webassembly, c/rust addon, or sql-queries), you're probably doing something wrong.

May I ask you what you feel you're supposed to do if you're developing a larger-scale application like Google Docs?

  • except i don't develop larger-scale application like google docs, and neither does majority of javascript-users
  • tc39 shouldn't focus on optimizing for this niche use-case.
  • tc39 should focus instead on making javascript and web-programming more accessible to the billions of untapped users with zero sofware-engineering knowlege (knowledge which quite frankly unnecessary in modern web-development)

Ok, so here's my thoughts around this.

Firstly, I wouldn't exactly call Google Drive a niche use case, considering the gazillion webapps I've used in the past. The majority of computer software that can be made into a webapp, has been made into one. So, even if it's a minority that develop these things, it's still a very large crowd. And these developers do need JavaScript to be able to support features that let them write scalable software.

With that in mind, there seem to be two goals going on here:

  1. Provide a feature set that's easy for newcomers to learn and use
  2. Make it so JavaScript can support the development of large, scalable projects

My question is, why do these objectives have to be mutually exclusive? Can't we provide language features for those who need them, while still letting those who don't need them continue to do what they're doing? It's not like the introduction of class syntax is forcing everyone to use classes (many people refuse to use them, and that's ok), nor does the introduction of pattern matching force everyone to use pattern-matching. For those who are still learning to use JavaScript, they can simply learn the basics and use them, without ever diving into "more advance features". Nothings stopping them from getting by with the basics.

In fact, I would go as far as to say that, if you really believe in this cause, perhaps it could be worth looking into trying to make a subset of the JavaScript language that forces developers to only use the most basic features. This could be done with a full-blown compiler, but it could also easily be done by simply creating a set of linter rules that restrict developers from using more advanced language features. Lets call it "miniScript", just to give it a name. Now, developers who only need the most basic language features can use MiniScript for all of their needs. If they have a question they want to ask on StackOverflow, they can tag it under MiniScript, so that those who are answering will know to make sure they only provide answers in syntax that's supported by MiniScript. MiniScript can become the developer-friendly, easy-to-hack language. And, if there really is a majority that program like this, then this sort of language should take off like wild-fire. All the while, TC39 can continue to provide support for both basic language features that MiniScript needs, and advance language features that web-app developers need.

2 Likes