Object.at(), Object.size()

Idea

Object.at( O, index ); //returns O's index'th entry
Object.size( O ); //returns O's count of enumerable properties

Why?

Say you want to pick a random object entry. (Or the last inserted object entry.)
You have to:

  • iterate over all the entries with for...in
  • or, allocate an entire new array with Object.entries().

This is an unnecessary problem, because JS engines already have an addressable index of an object's entries.

You might think: Just use an Array instead of an Object.
But no! Picking a random entry from an array is easy. But now what if you want to pick an entry by a specific key? Arrays are expensive to search for key-value pairs, so you end up with the same inefficiency problem. :-/

Solution!

Object.at() and Object.size(), implemented as a built-in-functions, provide efficient non-linear, blind access, and efficient key-value searching.

This would be amazing! It combines the best of Object with the best of Array. Currently, Object provides fast key-searching (when you know what you want, but not where it is), and Array provides non-linear, blind access (when you don't know what you want, but you know where it is).

Adding at() and size() would marry the best of both worlds by allowing random, indexed access into a fast-searchable key/value structure. :-)


Polyfill

Note: I've included polyfill code below, but this polyfill code is not what I am proposing! Please do not hijack this thread to point out typos or best code practices!! I am proposing a built-in function that makes the code below unnecessary.

Here are two ways these functions' polyfills could be implemented, neither of which is practical or efficient compared to a built-in function.

Polyfill via iteration:


Object.at = ( O, index ) => {
    let i = 0, entry = undefined;
    for( const key in O ) {
        if( i === index ) {
            entry = [ key, O[ key ] ];
            break;
        }
        ++ i;
    }
    return entry;
}

Object.size = ( O ) => {
    let i = 0;
    for( const key in O ) {
        ++ i;
    }
    return i;
}

Polyfill via array-allocation:


Object.at = ( O, index ) => Object.entries( O )[ index ];

Object.size = ( O ) => Object.entries( O ).length;

These functions would benefit enormously from being built-ins, since JS engines already have indexed search into object entries, only unexposed.

(I would also like to add a symbol for iterators to allow the user to implement iterator-specific at() and size() functions, but lets keep it simple here.)

Couple of questions and concerns.

  1. If you're trying to pick a random entry from an object, would it, perhaps, be better to store this data in a map, and pick a random entry from a map?
  2. Introducing indexing for the purpose of getting a random entry seems like a bad idea. Indexing implies there's an ordering that people should purposely depend on, and I can see this leading to all sorts of bad problems. Yes, the order of object entries are stable, but that ordering shouldn't be depended on. I see this opening up the floodgates for other proposals like "please let us change the index of an object entry", or "let us insert an object entry at a particular index", with everyone pointing to Object.at() as the reason for having these features.

An alternative solution I can think of, is that some languages have an OrderedMap data structure, that allow you to treat a map as an array as well. This could be used to solve this specific problem, and many more.

This is, unfortunately, a fairly large solution for a fairly small problem. Perhaps there's better options that I'm not thinking of.

1 Like

Object.size discussed here: Object.size()

2 Likes

Please excuse my all caps.

In JavaScript, there is ABSOLUTELY NO WAY (including Maps) to have both:

  • key-search, and
  • random selection of elements.

This functionality exists in every JS engine. It should be exposed.

I would wager that over 90% of games in existence require random selection of key-searchable objects. Everything from Poker cards to Asteroids power-ups. This is not a niche-feature. It is universally used, universally inefficient in JS, and easily remedied.

It could be implemented via a new Random() global object that would also solve a host of other problems. It could be implemented via Object.at() / Object.size() as I proposed here.

If you know of another implementation strategy, suggest it. I assure you, there are no "better options I'm not thinking of".

(Also, I don't mean to be rude, but are you perhaps misremembering how Maps work? There is no way to access a random element of a Map. You would have to allocate an array in memory, copy all the entries of the Map to that array, select a random element, then discard the entire array, because the Map can (and should) change. Do you know how ridiculous it is to write a game-messaging server in Node? Because there's no way to search usernames and do random match-ups without maintaining two separate, almost-identical structures in memory?)

How would you propose solving this problem, if adding anything to Object is a "non-starter"? Would Map.at() and Map.size() be more acceptable?

Maps already have a size accessor.

1 Like

Which is not really related to the topic of this thread? (Object does not have at() or size(). Map has size(), which does not help, but Map is also a much slower search algorithm, so it is also suboptimal, although I'll take what I can get.)

Please don't mis-understand. I wasn't trying to suggest that Maps solve the problem. My first point about maps being a better data structure than objects was merely meant to suggest that if we were to try and add new methods to solve this issue, those new methods should probably be added to the Map data structure, not to Object, as this kind of data should be stored in a Map, not in an object. e.g. map.at(index) and map.size() would be better than Object.at(obj, index) and Object.size(). I would still be opposed to these, because of the second point I laid out earlier, but I think it's a step up from putting these on objects. I'd rather have something like map.randomEntry(), as that wouldn't temp people to abuse indexing.

That's also why I was suggesting that, perhaps, a new data structure, OrderedMap, would be able to solve this sort of issue. An OrderedMap would be like a Map, but it has a particular ordering, which means indexing (the way you do in arrays) would be an appropriate addition to it. Adding indexing to Map, or to objects, would be to try and treat these non-ordered collections as if they were ordered, which shouldn't happen.

2 Likes

Where did you get that from? I'm pretty sure a map's search algorithm is roughly as fast as an object's?

Since the get() function for Maps has to compare not only strings and symbols like the [ "" ] operator, but objects and other primitives, the code path is always longer. But they are so similar that I would not mind at all. (It would also gain more use cases, which has always been an advantage of Map over Object.)

That’s not accurate; the spec requires lookups be sublinear. For object keys I’d expect it to always be O(1).

I'll include this lookup-time comparison. (It demonstrates how difficult random selection is.)

Look how much you would stand to gain! Combine the 10ms key-searching of Objects and Maps with the 2ms random lookup for Arrays!

Right now, that is impossible specifically because the standard says it should be impossible.

Your data can only be accessed in one of two ways: For searching, or for non-linear selection. If you want both, you have to replicate the data in memory. Doesn't that sound absurd? In my example below, I have 10000 entries in memory, but particle systems and world tile states can easily have two orders of magnitude more.

JS forces you to choose between efficient searching and random selection. Not because the engines require it, but because the standard requires it.


Times for 100000 lookups with various methods in Firefox and Chrome on PC:

Object[k], Map.get(k), Array[i], Array.findIndex(k)

Firefox:
11, 16, 3, 12312
11, 16, 2, 11836
10, 16, 1, 11859

Chrome:
18, 16, 2, 5860
19, 15, 2, 5852
18, 16, 2, 5804

(Different codepaths will have different execution times, no matter what the spec says, so Object vs. Map will always have some difference.)

2ms. An order of magnitude faster.
That's how fast random lookups could be with a built-in function.

Objects' and Maps' blazing-fast key-searches are useless for random access, and that multi-second time for findIndex is why Arrays are useless for key-searching.


Comparison code:


const randomInt = n => Math.floor( Math.random()*n ),
    selectRandom = a => a[ Math.floor( Math.random() * a.length ) ],
    randomChar = () => String.fromCharCode( 33 + randomInt(26) ),
    randomString = n => new Array(n).fill(0).map( randomChar ).join(""),
    randomThing = () => selectRandom( [
        ()=>({}), //object
        ()=>randomInt(1000000), //int
        ()=>Math.random()>0.5, //bool
        ()=>Math.random(), //float
        ()=>randomString(randomInt(20)+1), //string
        ()=>Symbol(), //symbol
        ()=>(()=>{}), //function
        ()=>BigInt(randomInt(1000000)), //bigint
    ] )();

const keyCount = 10000;

const randomObject = {};
const randomMap = new Map();
const randomArray = new Array( keyCount );

for( let i=0; i<keyCount; i++ ) {
    randomObject[ randomString( randomInt(20)+1 ) ] = randomThing();
    randomMap.set( randomThing(), randomThing() );
    randomArray[ i ] = randomThing();
}

const objectKeys = Object.keys( randomObject );
const mapKeys = Array.from( randomMap.keys() );

const lookupIterations = 100000;

let overwriteMe = null;

const mapLookupInitialTime = performance.now();

for( let i=0; i<lookupIterations; i++ ) {
    const keyIndex = selectRandom( mapKeys );
    overwriteMe = randomMap.get( mapKeys[ keyIndex ] );
}

const mapLookupFinalTime = performance.now();

overwriteMe = null;

const objectLookupInitialTime = performance.now();

for( let i=0; i<lookupIterations; i++ ) {
    const keyIndex = selectRandom( objectKeys );
    overwriteMe = randomObject[ objectKeys[ keyIndex ] ];
}

const objectLookupFinalTime = performance.now();

overwriteMe = null;

const arrayLookupInitialTime = performance.now();

for( let i=0; i<lookupIterations; i++ ) {
    const keyIndex = randomInt( keyCount );
    overwriteMe = randomArray[ keyIndex ];
}

const arrayLookupFinalTime = performance.now();

const arrayEntriesToLookup = new Array( lookupIterations );

for( let i=0; i<lookupIterations; i++ ) {
    const keyIndex = randomInt( keyCount );
    arrayEntriesToLookup[ i ] = randomArray[ keyIndex ];
}
const arrayFindIndexInitialTime = performance.now();

for( let i=0; i<lookupIterations; i++ ) {
    overwriteMe = randomArray.findIndex( e => e === arrayEntriesToLookup[ i ] );
}

const arrayFindIndexFinalTime = performance.now();

console.log( "Object lookup total time: ", objectLookupFinalTime - objectLookupInitialTime );

console.log( "Map lookup total time: ", mapLookupFinalTime - mapLookupInitialTime );

console.log( "Array lookup total time: ", arrayLookupFinalTime - arrayLookupInitialTime );

console.log( "Array findIndex total time: ", arrayFindIndexFinalTime - arrayFindIndexInitialTime );

FWIW the hidden-class optimisation in many JS engines also have a limit of how many keys they will support before the object de-opts to being a hash map.

Firefox's property maps have 8 slots, with up to 127 linked maps (~1000 props), then an object will be converted to a dictionary property map. Dictionaries link until out-of-memory (which is as close to infinite as you can get).

But I've been reading SpiderMonkey / V8's source, and implementing this isn't straightforward. Properties are indexed, but not continuous. You need a pointer table. :-/
Implementing a pointer table in JS doesn't replicate data or discard arrays. Key-search is linear time with Map.get() and random entry lookup is linear with array[*].

Slower than native, but faster than any approach I've used before.
Definitely faster than trying to get the JS spec changed. :-)


Random Lookup Time Comparisons (1 million keys, 1 million lookups):

Object, Map, Array, Lexicon

Firefox:
233, 317, 9, 31
227, 319, 8, 30
227, 379, 8, 31
225, 377, 8, 31

Chrome:
335, 342, 12, 16
322, 328, 13, 16
329, 328, 12, 16
334, 329, 13, 16

Native Key Search with Random Lookup:


const Lexicon = (
    m = new Map(),
    a = new Array(),
    i = 0,
    deleted = Symbol(),
    self = {
        get: ( k, v = a[ m.get( k ) ] ) => v === deleted ? undefined : v,
        set: ( k, v ) => ( ++i, a[ i ] = v, m.set( k, i ), self ),
        delete: k => ( a[ m.get( k ) ] = deleted, self ),
        has: k => m.has( k ) && a[ m.get( k ) ] !== deleted,
        random: ( v = a[ Math.floor( Math.random() * a.length ) ] ) =>
            v === deleted ? self.random() : v,
    }
) => self;

1 Like

@0X-JonMichaelGalindo Your Lexicon implementation seems like a nice solution to this problem. Are there any popular packages that include a similar data structure? That would be useful evidence about how common this issue is. Personally, I can't remember ever needing something like this.

I'll tack on to this, since it seems most related to an issue I'm facing. I'm trying to implement a custom HTMLSelectElement. Specifically it's a listbox. I can replicate all of HTMLSelectElement's properties by use of [Symbol.iterator] for iterating the collection. JS language allows me to do this.

What I cannot do is implement HTMLSelectElement[index]. In spec, HTMLSelectElement[0] means the first child HTMLOptionElement. Everything works with .item() fine in my Listbox, but the objective is to basically be able to use any library that expects HTMLSelectElement and use it with my CustomListboxElement. If there's a library that uses selectElement[0], then I can't implement that.

Getting to the point, if we can have Symbol.Iterator, can we have Symbol.Indexer (we can bikeshed the name later). Essentially that covers my use cases and half of what's covered here (performance issue aside). I see we have made iteration generic, so index and size may also be done the same way.

Related: More formal definition of array index · Issue #900 · tc39/ecma262 · GitHub

Edit: As a side note, proxying doesn't work with customized HTMLElement.

For that, you’d use a Proxy.

I'm assuming you mean reshape the constructor to return a proxy of this with the indexes. That doesn't work with Custom Elements (I've tried). It's also seems rather, uh..., inelegant if that's direction we want to go even if it did work.

The point is collections lose their "magic" with Symbol.Iterator, but Array indexes don't have such a mechanism.

I’m confused why it doesn’t work with custom elements. I agree it’s inelegant, but it’s often inelegant to do exotic things like try to self-host a native type.

Array indexes are just property accesses - the entire language relies on that mechanism, and making that a protocol would both break tons of code and also likely slow the language to a permanent crawl.

Original poster here. After working with Espruino's low-RAM interpreter, I can confirm JS's property lookup powers are probably incompatible with the ordered behavior we expect from arrays.
At least, I cannot imagine any way to efficiently implement what I originally envisioned other than proxies (which are inherently non-performant).
Doesn't help here, I know.