On the precise analysis of C programs for FLT_EVAL_METHOD==2

Pascal Cuoq - 6th Jul 2013

There has been talk recently amongst my colleagues of Frama-C-wide support for compilation platforms that define FLT_EVAL_METHOD as 2. Remember that this compiler-set value, introduced in C99, means that all floating-point computations in the C program are made with long double precision, even if the type of the expressions they correspond to is float or double. This post is a reminder, to the attention of these colleagues and myself, of pitfalls to be anticipated in this endeavor.

We are talking of C programs like the one below.

#include <stdio.h> 
int r1; 
double ten = 10.0; 
int main(int c, char **v) 
{ 
  r1 = 0.1 == (1.0 / ten); 
  printf(1=%d"  r1); 
} 

With a C99 compilation platform that defines FLT_EVAL_METHOD as 0 this program prints "r1=1" but with a compilation platform that sets FLT_EVAL_METHOD to 2 it prints “r1=0”.

Although we are discussing non-strictly-IEEE 754 compilers we are assuming IEEE 754-like floating-point: we're not in 1980 any more. Also we are assuming that long double has more precision than double because the opposite situation would make any discussion specifically about FLT_EVAL_METHOD == 2 quite moot. In fact we are precisely thinking of compilation platforms where float is IEEE 754 single-precision (now called binary32) double is IEEE 754 double-precision (binary64) and long double is the 8087 80-bit double-extended format.

Let us find ourselves a compiler with the right properties :

$ gcc -v 
Using built-in specs. 
Target: x86_64-linux-gnu 
… 
gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1)  
$ gcc -mfpmath=387 -std=c99 t.c && ./a.out  
r1=0 

Good! (it seems)

The test program sets r1 to 0 because the left-hand side 0.1 of the equality test is the double-precision constant 0.1 whereas the right-hand side is the double-extended precision result of the division of 1 by 10. The two differ because 0.1 cannot be represented exactly in binary floating-point so the long double representation is closer to the mathematical value and thus different from the double representation. We can make sure this is the right explanation by changing the expression for r1 to 0.1L == (1.0 / ten) in which the division is typed as double but computed as long double then promoted to long double in order to be compared to 0.1L the long double representation of the mathematical constant 0.1. This change causes r1 to receive the value 1 with our test compiler whereas the change would make r1 receive 0 if the program was compiled with a strict IEEE 754 C compiler.

Pitfall 1: Constant expressions

Let us test the augmented program below:

#include <stdio.h> 
int r1  r2; 
double ten = 10.0; 
int main(int c  char **v) 
{ 
  r1 = 0.1 == (1.0 / ten); 
  r2 = 0.1 == (1.0 / 10.0); 
  printf("r1=%d r2=%d"  r1  r2); 
} 

In our first setback the program prints “r1=0 r2=1”. The assignment to r2 has been compiled into a straight constant-to-register move based on a constant evaluation algorithm that does not obey the same rules that execution does. If we are to write a precise static analyzer that corresponds to this GCC-4.4.3 this issue is going to seriously complicate our task. We will have to delineate a notion of “constant expressions” that the analyzer with evaluate with the same rules as GCC's rules for evaluating constant expressions and then implement GCC's semantics for run-time evaluation of floating-point expressions for non-constant expressions. And our notion of “constant expression” will have to exactly match GCC's notion of “constant expression” lest our analyzer be unsound.

Clarification: What is a “precise” static analyzer?

This is as good a time as any to point out that Frama-C's value analysis plug-in for instance is already able to analyze programs destined to be compiled with FLT_EVAL_METHOD as 2. By default the value analysis plug-in assumes IEEE 754 and FLT_EVAL_METHOD == 0:

$ frama-c -val t.c 
… 
t.c:9:[kernel] warning: Floating-point constant 0.1 is not represented exactly. 
 Will use 0x1.999999999999ap-4. 
 See documentation for option -warn-decimal-float 
… 
[value] Values at end of function main: 
          r1 ∈ {1} 
          r2 ∈ {1} 

The possibility of FLT_EVAL_METHOD being set to 2 is captured by the option -all-rounding-modes:

$ frama-c -val -all-rounding-modes t.c 
… 
t.c:9:[kernel] warning: Floating-point constant 0.1 is not represented exactly. 
 Will use 0x1.999999999999ap-4. 
 See documentation for option -warn-decimal-float 
… 
[value] Values at end of function main: 
          r1 ∈ {0; 1} 
          r2 ∈ {0; 1} 

The sets of values predicted for variables r1 and r2 at the end of main() each contain the value given by the program as compiled by GCC-4.4.3 but these sets are not precise. If the program then went on to divide r1 by r2 Frama-C's value analysis would warn about a possible division by zero whereas we know that with our compiler the division is safe. The warning would be a false positive.

We are talking here about making a static analyzer with the ability to conclude that r1 is 0 and r2 is 1 because we told it that we are targeting a compiler that makes it so.

The above example command-lines are for Frama-C's value analysis but during her PhD Thi Minh Tuyen Nguyen has shown that the same kind of approach could be applied to source-level Hoare-style verification of floating-point C programs. The relevant articles can be found in the results of the Hisseo project.

To be continued

In the next post we will find more pitfalls revisit a post by Joseph S. Myers in the GCC mailing list and conclude that implementing a precise static analyzer for this sort of compilation platform is a lot of work.

Pascal Cuoq
6th Jul 2013