Friday, July 17, 2009

Avoiding Excess Floating-Point Precision

The title of this post might sound weird. You may wonder why anyone would need something like that. Well, I have found myself in such a situation recently.

The JagPDF's regression test suite revealed that a PDF file generated by the same test script differs between debug and release JagPDF versions. This happened on x86/Linux, the files were the same on Windows. A closer look at the contents of the generated files spotted the actual difference: the last digit of some floating-point numbers.

I managed to replicate that behavior with the following simple C++ program:

#include <stdio.h>
#include <math.h>

double my_round(double val, double precision)
{
    return precision * (val > 0 ? floor(val / precision + .5)
                                : ceil(val / precision - .5));
}

int main()
{
    double rounded_val = my_round(91.796875, .00001);
    printf("%lf\n", rounded_val);
    return 0;
}

Its output differs depending on whether the -O2 optimization is enabled or not.

$ g++ app.cpp && ./a.out
91.796880
$ g++ -O2 app.cpp && ./a.out
91.796870

I admit I wasn't happy about it since I didn't feel like studying this document in depth.

Fortunately, after playing with the code a bit and after some research, I discovered that this discrepancy is caused by the fact that on some architectures (including x86/Linux) the floating-point registers keep more precision than a double is supposed to have. So the result of a floating-point expression depends basically on how the compiler allocates floating-point registers.

In JagPDF, I do not need a numerically accurate result for this particular case. What I actually need is the very same result regardless of the optimization settings.

So, what can be done about this?

I dismissed the -ffloat-store gcc option which causes that floating-point variables are not stored in registers. Enabling this option involves a big performance hit since it is a global option which causes spilling data from floating-point registers to memory.

There is a handful of other relevant options: -mpc32, -mpc64, -mpc80, -fexcess-precision, and maybe some others I do not know of. However, none of them helped.

The solution I used was the volatile keyword. The value of a volatile variable is loaded from memory to a register before each operation on that variable. So the new version of my_round() looks like this:

double my_round(double val, double precision)
{
   volatile double d = val / precision;
   return precision * (val > 0 ? floor(d + .5) : ceil(d - .5));
}

Now, the output is the same in both cases:

$ g++ app.cpp && ./a.out
91.796880
$ g++ -O2 app.cpp && ./a.out
91.796880

So, why it works now?

The key thing is that the result of the val/precision expression is, thanks to volatile, rounded to the standard precision before it is passed to floor() or ceil(). Without volatile, the optimized code can pass the full precision intermediate result without storing it to memory.

Here is the generated machine code (slightly edited for easier reading):

       original                     volatile
    ----------------------------------------------
1   fldl   $val                 fldl   $precision
2   fldl   $precision           fldl   $val
3   fdivr  %st,%st(1)           fdiv   %st(1),%st
4                               fstpl  -0x10(%ebp)
5                               fldl   -0x10(%ebp)
6   fxch   %st(1)
7   fadds  0.5                  fadds  0.5

An unimportant difference is that the operands are loaded into FPU in reverse order. The crucial difference is on lines 4 and 5 where the rounding occurs: fstpl stores the intermediate result to memory and fldl loads it immediately back to FPU.

3 comments:

pedcdrl76c said...

Giving the participant more decisions says so much concerning the casino’s status and customer care. Of course, being an trade chief they won’t dare not embody nice online roulette video games with 1xbet European, American, and tournament variations throughout quantity of} prime suppliers. Now, you realize every thing about online roulette, its algorithm, advantages, and how you can play it with demo money.

nvkhpznk2a said...

Mississippi is bordered to the north 더킹카지노 by Tennessee, to the east by Alabama, to the south by the Gulf of Mexico, to the southwest by Louisiana, and to the northwest by Arkansas. Today’s API and application-borne assaults are an enormous threat to day-to-day operations and revenue. By legislation, the casino have to be positioned on the Schoolfield web site, since that's the location that was listed in the casino referendum question Danville voters passed in November 2020, Larking stated.

dtt8u8q8c7 said...

Spin Casino accepts Visa/MasterCard, Visa Electron, Bank 메리트카지노 Transfers, Interac, Skrill, Trustly, iDebit, and Neteller. Check out some popular video games at Spin Casino on-line gambling web site, including 9 Pots of Gold, Halloween, Avalon, Ivory Citadel, Thunderstruck II, Retro Galaxy, Tiki Mania, and many of|and plenty of} others. Most supply greater than their single awards when collaborating in weekly and every day tournaments. Each real cash guess additionally creates loyalty points, which may be exchanged for additional credit. There is nothing to be involved about phrases of|in relation to} the security and safety of Spin Casino. Its license is the primary and most crucial part that establishes its legitimacy.