Builtin Ord / Compare method for primitives

MaxGraey · April 5, 2021, 9:12pm

string, object, or NaN.

NaN is a Number. This means it should sort properly alongside other numbers, but a - b returns NaN if either a or b is NaN, and NaN then behaves as false in any comparison that follows. Float64Array#sort handles this properly, but that behavior can't be exposed to an ordinary Array of numbers, or to an Array of objects that should be sorted by some numeric key. The main idea is to make this mechanism available in user space as a separate helper, so:

new Float64Array([3, NaN, 2, NaN, 1]).sort()
// > [1, 2, 3, NaN, NaN]

and

[3, NaN, 2, NaN, 1].sort(Number.compare)
// > [1, 2, 3, NaN, NaN]

will return the same, consistent results.
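
For readers who want to try this ordering today, here is a minimal userland sketch. The name numberCompare is hypothetical, and the exact ordering (ascending, -0 before +0, NaN last) is an assumption based on the typed-array results shown above, not the proposal's final text.

```javascript
// Hypothetical userland stand-in for the proposed Number.compare.
// Assumed ordering (matching the typed-array results above):
// ascending, -0 before +0, NaN after every other number.
function numberCompare(a, b) {
  const aIsNaN = Number.isNaN(a);
  const bIsNaN = Number.isNaN(b);
  // Exactly one NaN: it goes last. Two NaNs: tie.
  if (aIsNaN || bIsNaN) return aIsNaN - bIsNaN;
  if (a < b) return -1;
  if (a > b) return 1;
  // a === b here; only -0 versus +0 still needs separating.
  if (a === 0) return Object.is(b, -0) - Object.is(a, -0);
  return 0;
}

const sorted = [3, NaN, 2, NaN, 1].sort(numberCompare);
console.log(sorted); // [1, 2, 3, NaN, NaN]

// Sorting objects by a numeric key reuses the same helper.
const rows = [
  { id: "x", score: NaN },
  { id: "y", score: 2 },
  { id: "z", score: -1 },
];
rows.sort((p, q) => numberCompare(p.score, q.score));
console.log(rows.map((r) => r.id)); // ["z", "y", "x"]
```

Unlike (a, b) => a - b, this comparator never returns NaN, so it stays a consistent ordering even when the input contains NaN.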

claudiameadows · April 6, 2021, 12:39am

Just in case it wasn't clear, we're only talking about type-specific comparison, not generic comparison (which I agree there's no good, reasonable option for).

MaxGraey · April 6, 2021, 9:44am

I won't deny that there are times when +0 and -0 behave differently, and it can be useful to distinguish the two values. I've had to do so before. But, I don't necessarily see that as a reason for wanting to sort +0 higher than -0. If we went back to a hypothetical priority queue example, and assume we use this Number.compare() algorithm under the hood, then we would get the following odd outcome:

queue = new PriorityQueue()
queue.add('a', { priority: -0 })
queue.add('b', { priority: +0 })

I guess a PriorityQueue would use integer values for priority. Also, you can still use a - b as the comparator there. I don't think it's a real blocker.

I'm wondering why you don't want to make JavaScript a more consistent language. Why can other languages such as Haskell, Rust, and Go sort numbers with NaNs and -0/+0 in a proper order, while JavaScript still lacks this functionality (except in Float32Array and Float64Array)? It could at least be available optionally, as the proposal suggests. If you don't want to use Number.compare and BigInt.compare, you could still use something custom. Meanwhile, sortBy is a pretty tightly constrained API that doesn't solve the mentioned problems at all.

theScottyJam · April 6, 2021, 7:51pm

This sortBy function is easy to put together in userland code, and is able to replace all but the number-comparison functions using the examples I posted before - unless you don't think those examples properly replicate the behavior of these type-specific primitive sorting functions? Am I missing something that these functions do?

The only reason sortBy can't replace the Number.compare() easily is precisely because it's inconsistent with the built-in comparison operators "<", ">", and "===". In my mind, I'm actually arguing for consistency here :p
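
The earlier sortBy examples are not reproduced in this part of the thread, so the following is only a rough sketch of what such a helper might look like. The (array, keyFn) signature is an assumption made for illustration.

```javascript
// Hypothetical sortBy helper; the name comes from the thread, the
// signature (array, keyFn) is an assumption made for this sketch.
function sortBy(array, keyFn) {
  // Copy first so the caller's array is left untouched.
  return [...array].sort((x, y) => {
    const kx = keyFn(x);
    const ky = keyFn(y);
    // Only the built-in relational operators are used, so the result
    // always agrees with "<" and ">" on the keys.
    return (kx > ky) - (kx < ky);
  });
}

const people = [
  { name: "Carol", age: 29 },
  { name: "alice", age: 41 },
  { name: "Bob", age: 35 },
];

const byAge = sortBy(people, (p) => p.age).map((p) => p.name);
const byName = sortBy(people, (p) => p.name.toLowerCase()).map((p) => p.name);
console.log(byAge);  // ["Carol", "Bob", "alice"]
console.log(byName); // ["alice", "Bob", "Carol"]
```

Because "<" and ">" are both false whenever a key is NaN, this comparator reports a NaN key as equal to every other key, and it treats -0 and +0 as equal. Those are the two cases where it differs from the proposed Number.compare.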

What I'm against is adding something to the language that no one needs. I still haven't been able to think of a use case for this kind of sorting algorithm - in fact, the +0/-0 thing seems like a foot-gun. For a feature to be added to a language, there needs to be a use case for it.

As an aside, the reason I'm ok with the committee's decision to have Float32Array sort +0 and -0 differently, and put NaN at the end, is this: If it followed a normal "a - b" sorting algorithm, then values such as +0 and -0 would be together in a random order, and NaN would mess everything up (as you've mentioned before). Sorting -0 before +0 just adds a bit more determinism to the sorting algorithm, and putting NaN at the end seems like a reasonable choice if they don't want sort() to throw an error. If the array's sort algorithm did the same thing by default, I would be ok with that too. And, if this Number.compare function was only ever used like this: myArray.sort(Number.compare), and was never used in any other context, then I would be ok with that too, for the same reasons. But the fact that we're exposing a numerical comparison function that claims -0 is less than +0, instead of equal to it, has now brought us out of the realm of simply adding determinism to a sorting algorithm. The Float32Array sorting algorithm was effectively saying "-0 is equal to +0, but sort all -0s before +0s anyway to make it more deterministic". But now, we're proposing to create a function that claims "-0 is less than +0". And, if this function is ever used anywhere besides as the sorting function in array.sort(), then we're allowing for issues to happen, as described with the priority queue example.
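
The concern can be made concrete with a small sketch. The comparator name compareSignedZero and the toy queue data are hypothetical, chosen only to illustrate the point; this is not the proposal's implementation.

```javascript
// Hypothetical comparator with the ordering debated in this thread:
// -0 is reported as less than +0. Not a shipped API.
function compareSignedZero(a, b) {
  if (a === 0 && b === 0) return Object.is(b, -0) - Object.is(a, -0);
  return (a > b) - (a < b);
}

// The built-in operators treat the two zeros as the same value.
console.log(-0 === +0); // true
console.log(-0 < +0);   // false
// The comparator does not.
console.log(compareSignedZero(-0, +0)); // -1

// A toy queue: entries kept in insertion order, then sorted by priority.
const entries = [
  { item: "b", priority: +0 },
  { item: "a", priority: -0 },
];

// Array.prototype.sort is stable, so ties keep insertion order.
const withSubtraction = [...entries]
  .sort((x, y) => x.priority - y.priority)
  .map((e) => e.item);
const withSignedZero = [...entries]
  .sort((x, y) => compareSignedZero(x.priority, y.priority))
  .map((e) => e.item);

console.log(withSubtraction); // ["b", "a"]
console.log(withSignedZero);  // ["a", "b"]
```

With a - b the two entries tie and keep their insertion order. With the signed-zero comparator, "a" jumps ahead of "b" even though the priorities are equal under "===".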

Don't get me wrong - I'm not flat-out against adding Number.compare() to the language despite what this all may sound like - The utility of a new feature can outweigh the inconsistencies and negative consequences it brings (every new feature always brings some downsides with it). I just want to know that this additional function is a useful enough addition to the language to warrant the downsides, and right now, I haven't seen any use cases for it.

graphemecluster · August 20, 2022, 5:15am

@ljharb @theScottyJam
Although it has been more than a year, I would like to point out that a language-independent string comparison method that compares by code point instead of code unit would be very useful. Unlike code unit comparison ((a, b) => (a > b) - (a < b)), this is not something that can be done effortlessly:

/**
 * Compare two strings by (full) Unicode code point order.
 * @param {string} x
 * @param {string} y
 * @returns {number}
 */
function compareFullUnicode(x, y) {
  const ix = x[Symbol.iterator]();
  const iy = y[Symbol.iterator]();
  for (;;) {
    const nx = ix.next();
    const ny = iy.next();
    if (nx.done && ny.done) {
      return 0;
    }
    const cx = nx.done ? -1 : nx.value.codePointAt(0);
    const cy = ny.done ? -1 : ny.value.codePointAt(0);
    const diff = cx - cy;
    if (diff) {
      return diff;
    }
  }
}

(from https://github.com/nk2028/rime-dict-builder/blob/035fb661711545f97b4f8745c171b4e33fc4611f/build.js#L15-L37)

Why is it useful?

MaxGraey · August 25, 2022, 8:19am

Hmm, it looks like your compareFullUnicode behaves exactly the same as (a > b) - (a < b). Could you give me an example where they produce different sorting results?

const arr = [
  "A",  "B",  "C", "Č", "Ć",  "D",
  "Dž", "Đ",  "E", "F", "G",  "H",
  "I",  "J",  "K", "L", "Lj", "M",
  "N",  "Nj", "O", "P", "R",  "S",
  "ÛŒ", "T",  "U", "ℍ",

  // contains surrogate pairs
  "🄰", "🄲", "🄱"
];

console.log("naive sort:");
console.log(arr.slice(0).sort((a, b) => (a > b) - (a < b)));
console.log("");

console.log("unicode-aware sort:");
console.log(arr.slice(0).sort(compareFullUnicode));
console.log("");

console.log("localeCompare sort:");
console.log(arr.slice(0).sort((a, b) => a.localeCompare(b)));
console.log("");


console.log("Intl.Collator sort:");
console.log(arr.slice(0).sort(new Intl.Collator('en').compare));
console.log("");

Output:

naive sort:
[
  'A', 'B',  'C', 'D', 'Dž', 'E',
  'F', 'G',  'H', 'I', 'J',  'K',
  'L', 'Lj', 'M', 'N', 'Nj', 'O',
  'P', 'R',  'S', 'T', 'U',  'ÛŒ',
  'Ć', 'Č',  'Đ', 'ℍ', '🄰',  '🄱',
  '🄲'
]

unicode-aware sort:
[
  'A', 'B',  'C', 'D', 'Dž', 'E',
  'F', 'G',  'H', 'I', 'J',  'K',
  'L', 'Lj', 'M', 'N', 'Nj', 'O',
  'P', 'R',  'S', 'T', 'U',  'ÛŒ',
  'Ć', 'Č',  'Đ', 'ℍ', '🄰',  '🄱',
  '🄲'
]

localeCompare sort:
[
  'A',  '🄰', 'B',  '🄱', 'C',  '🄲',
  'Ć',  'Č', 'D',  'Đ', 'Dž', 'E',
  'F',  'G', 'H',  'ℍ', 'I',  'J',
  'K',  'L', 'Lj', 'M', 'N',  'Nj',
  'O',  'P', 'R',  'S', 'T',  'U',
  'ÛŒ'
]

Intl.Collator sort:
[
  'A',  '🄰', 'B',  '🄱', 'C',  '🄲',
  'Ć',  'Č', 'D',  'Đ', 'Dž', 'E',
  'F',  'G', 'H',  'ℍ', 'I',  'J',
  'K',  'L', 'Lj', 'M', 'N',  'Nj',
  'O',  'P', 'R',  'S', 'T',  'U',
  'ÛŒ'
]

graphemecluster · August 25, 2022, 1:10pm

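That’s because your data doesn’t contain any characters in the range U+E000 to U+FFFF. Surrogate code units range from U+D800 to U+DFFF, so in code unit order an astral character sorts before anything in U+E000 to U+FFFF, while in code point order it sorts after. Try ["Ａ", "🄰", "Ｂ", "🄱", "Ｃ", "🄲"].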

And this method is not intended to sort any alphabet system most of the time. In our case, we’re making a multilingual input method that outputs Han characters. We must keep the order language-independent because our input method can be used in multiple languages like Chinese, Japanese and Korean.
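
The suggested input can be checked with a short sketch. compareCodePoints is a compact, hypothetical variant written for this example, intended to produce the same ordering as compareFullUnicode above.

```javascript
// Code unit order: what the relational operators on strings use.
const compareCodeUnits = (a, b) => (a > b) - (a < b);

// Code point order. Array.from iterates a string by code point,
// so each surrogate pair becomes a single number.
function compareCodePoints(x, y) {
  const xs = Array.from(x, (ch) => ch.codePointAt(0));
  const ys = Array.from(y, (ch) => ch.codePointAt(0));
  const n = Math.min(xs.length, ys.length);
  for (let i = 0; i < n; i++) {
    if (xs[i] !== ys[i]) return xs[i] - ys[i];
  }
  // One string is a prefix of the other: the shorter one sorts first.
  return xs.length - ys.length;
}

// Fullwidth letters (U+FF21..U+FF23) and squared letters (U+1F130..U+1F132).
const input = ["Ａ", "🄰", "Ｂ", "🄱", "Ｃ", "🄲"];

const byCodeUnit = [...input].sort(compareCodeUnits);
const byCodePoint = [...input].sort(compareCodePoints);

console.log(byCodeUnit);  // ["🄰", "🄱", "🄲", "Ａ", "Ｂ", "Ｃ"]
console.log(byCodePoint); // ["Ａ", "Ｂ", "Ｃ", "🄰", "🄱", "🄲"]
```

The squared letters are stored as surrogate pairs starting with U+D83C, which is lower than U+FF21, so code unit order puts them first even though their code points are higher.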
