Math float optimisation

Proposal: GitHub - munrocket/proposal-math-float

JavaScript arithmetic is implemented in a non-standard way, compared to other languages. Over time, there was a demand for performance and precision but the language itself does not give an opportunity for some effective algorithms. This proposal offers to add inverse square, ldexp that multiplying by 2 raised to the exp power, frexp that decomposes given floating point value arg into a normalized fraction and an integral power of two, scalar FMA instruction that performs multiplication-addition in one step with single rounding, successor/predecessor of floating point number

Proposed syntax

let invsqrt = Math.invSqrt(x);
let ldexp = Math.ldexp(x);
let frexp = Math.frexp(x);
let fma = Math.fma(x, y, z);
let successor = Math.nextUp(x);
let predecessor = Math.nextDown(x);

Motivation to add this functions

  • invSqrt() is used in computer graphics, we have Math.hypot(x, y) in JS but it's not inverted
  • ldexp() currently have worst polyfill ever because it's based on slowest float operation (pow)
  • frexp() could be useful for bit analysis inside float number and correct rounding
  • fma() very useful for optimizing multiplication in arbitrary precision floating point libraries
  • nextUp()/nextDown() basic blocks for floating-point computations with correct rounding

Pollyfills status

Function Implementation
invSqrt done
ldexp done
frexp in npm?
fma without overflow for now
nextUp done
nextDown done

Disclaimer

This proposal based on my R&D projects (1, 2) with floating point numbers and created as alternative to decimal-proposal. I feel frustrated that JS not added this basic blocks 5 years ago and we at that point when something hardcore is computed not as fast as it could be. There are two type of arbitrary precision floating point libraries: based on integers and based on floats. Since javascript don't have integers historically I found that solution based on floats much easier in implementation and equally good as solutions based on integer arithmetic, when it properly cooked. invSqrt() / frexp() was added after inspiration with new WGSL specification, nextUp() / nextDown() exist in Java with same syntax, scalar fma() is the reason why this proposal was created.

2 Likes

Math.fma looks interesting

Yea, as I know you also tried to create polyfill for it in pure JS Math.fma - fused multiply–add (FMA) or fused multiply–accumulate (FMAC) in JavaScript · GitHub
We currently doesn’t have proper polyfill that pass all tests with numerical correctness. But we have one that ok when you numbers without overflow and it’s possible to create fully correct version too.

@munrocket, It is funny, that so complexity is required for polyfill, as if you know that you have no overflow/underflow and the third argument is the product of first two something simple works fine:

  // Veltkamp-Dekker's algorithm
  // see http://web.mit.edu/tabbott/Public/quaddouble-debian/qd-2.3.4-old/docs/qd.pdf
  var fma = function (a, b, product) {
    var SPLIT = 2**27 + 1;
    var at = SPLIT * a;
    var ahi = at - (at - a);
    var alo = a - ahi;
    var bt = SPLIT * b;
    var bhi = bt - (bt - b);
    var blo = b - bhi;
    var error = ((ahi * bhi + product) + ahi * blo + alo * bhi) + alo * blo;
    return error;
  };

of course, it is still interesting how fast the native fma is

1 Like

This exact paper that I used in my “research”. Ha-ha. Well.. we have similar thoughts. Maybe someone new will comment something. The most valuable opinion will be from a guy that implemented arbitrary precision library in any programming language of cause (like you) but any feedbacks are welcome.

I may be getting the wrong end of the stick, but I pulled this out of my library:

const arrayBuffer = new ArrayBuffer(8),
      dataView = new DataView(arrayBuffer);
export function ldexp( float, shiftCount )
{
    dataView.setFloat64(0,float);
    const lo16 = dataView.getUint16(0);
    dataView.setUint16( 0, lo16 + ( shiftCount << 4 ) );
    return dataView.getFloat64( 0 );
}

Obviously, it misses edge cases and can blow up in spectacular ways. But here's a fiddle measuring performance: http://jsfiddle.net/nc290xsm/1/ On Chrome, basic maths is 10 times slower than bit fiddling. On Firefox, bit fiddling is twice as fast as maths (now I've optimised it ;-).

As I understand your snippet trying to make the same as function below but with DataView’s

export function ldexp(x, exp) {
   if (exp >= 0) {
     return x * (1 << exp);
   } else {
     return x / (1 << -exp);
   }
}

This approach will not work with number higher than 2147483647 but max safe integer is 9007199254740991. To fix this overflow in this algorithm one view should affect to another. But how to detect overflow in view? Thats the question.

From my experience DataView is slow in browsers and I don’t know actually why. Whenever Dekker splitting algorithm can be used it should be used due to performance. So in spirit of your snippet we can try to change DataView to Dekker split and carefully calculate overflow. After that this thing will be probably better than Math.pow().

The function @fuchsia posted will work with much bigger exponents (±1023) than your integer version (±31). But it doesn't do any range checking, and doesn't handle denormals, so for example:

// because exponent has 11 bits and this overwrites the 12th (sign bit)
ldexp(1, 2048) === -1; // incorrect

// numbers smaller than 2 ** -1023 are denormal
ldexp(1, -1050) !== 2 ** -1050; // incorrect
ldexp(2 ** -1051, 1) !== 2 ** -1050; // incorrect

Btw @fuchsia: pushing each result onto an array makes that a benchmark on Array.prototype.push. For me on Firefox your viaLdExp is twice as fast as viaPow, but if I simply sum the results instead of pushing them to array, it's more than four times as fast.

Don't try it on non-finite values or 0, either!

A sum sounds a fairer test, although Firefox seems very sensitive to the length of the code path - masking the sign bit made little difference to Chrome's performance but Firefox hated it.

The performance issues with Dataviews are a thing of the past; the chrome team fixed it a couple of years back. But thinking about this over the weekend, I realised that forcing numbers to be little-endian is probably a performance hit. On x86, this is fractionally faster in Chrome:

const arrayBuffer = new ArrayBuffer(8),
      nativeFloat = new Float64Array(arrayBuffer),
      nativeUint16 = new Uint16Array(arrayBuffer);
function ldexp1( float, shiftCount )
{
    nativeFloat[0] = float;
    nativeUint16[3] += shiftCount << 4;
    return nativeFloat[0];
}

(Performance is about the same in Firefox.) But you'd need to probe the byte order of the platform at startup and adapt to the endianness or fall back to the original version, if you can't decipher it. And you probably need to test all this delivers a real performance benefit...

All of which illustrates how frustrating it is that Math.ldexp() is missing. It's a function that can be efficiently implemented as either addition or a bit shift (denormals) and which is available in libc, yet implementing it from Javascript requires detailed knowledge of the hardware and the engine and produces fragile code. Ditto frexp, nextUp and nextDown except that whereas an engine could theoretically spot x * 2**n is ldexp(x,n), it's unlikely to be able to recognise the others.

So you not changing mantissa and just changing exponent? That's clever.
Here library with bit manipulations. Seems that it cover edge cases with infinities and NaNs. math-float64-ldexp - npm

It is interesting, that Math.pow(2, n) and 2**n are not optimized now in Chrome.

If you feel strongly about it, consider filing an engine bug against whatever engines aren't optimizing for it.

1 Like

It's not a bug in engine. They can optimise pow(x, 2) as x*x or pow(2, x) as ldexp(x), but why they should to add additional if statement into it? Common sense is to have ldexp directly into language, but something went wrong.

well... the optimization may happen at "compile time"

I think, frexp is useful, may be it is better to have Math.getExponent as in Java.

1 Like
  1. The cost of an extra highly-predictable if statement in this sea of conditionals is almost nothing. (Most JS engines, and languages in general, use fdlibm for floating point arithmetic.)
  2. It's not that hard to create a new helper and wire up JIT code to call that in easily-detected special cases like that.
1 Like

Can someone share some tutorial how to modify Chrome/Firefox directly in C/Rust? I want to create prototype and measure performance by myself. Because it seems that I need some charts and results to make this proposal more clear.

For Chrome and Node, V8's pretty easy to onboard development with (and doesn't require a full browser checkout either): Documentation · V8

I'd recommend starting there if you want to experiment with perf optimizations.

Note: C++ knowledge is pretty important for working with it - you can easily find tutorials and such online for it.