Funny floating-point bugs in Frama-C Oxygen's front-end

Pascal Cuoq - 19th Nov 2012

In a previous post almost exactly one year ago before Frama-C Oxygen was released I mentioned that the then future release would incorporate a custom decimal-to-binary floating-point conversion function. The reason was that the system's strtof() and strtod() functions could not be trusted.

This custom conversion function is written in OCaml. It can be found in src/kernel/floating_point.ml in the now available Oxygen source tree. This post is about a couple of funny bugs the function has.

History

There had been arguments about the inclusion of the “correct parsing of decimal constants” feature in Frama-C's front-end and about the best way to implement it. My colleague Claude Marché was in favor of using the reputable MPFR library. I thought that such an additional dependency was unacceptable and I was against the feature's inclusion as long as I could not see a way to implement it that did not involve such a dependency.

When I presented my dependency-avoiding solution to Claude he said: “But how do you know your function works? Did you prove it?”. To which I replied that no I had not proved it but I had thoroughly tested it. I had written a quick generator of difficult-to-convert decimal numbers and I was rather happy with the confidence it gave me.

My generator allowed me to find and fix a double rounding issue when the number to convert was a denormal: in this case the number would first be rounded at 52 significant digits and then at the lower number of significant digits implied by its denormal status.

I owe Claude a beer though because there were two bugs in my function. The bugs were not found by my random testing but would indeed have been found by formal verification. If you want to identify them yourself stop reading now and start hunting because the bugs are explained below.

Stop reading here for a difficult debugging exercise

Say that the number being parsed is of the form numopt1.numopt2Enum3 where numopt1 and numopt2 expand to optional strings of digits and num3 expands to a mandatory string of digits.

The sequences of digits numopt1 and numopt2 can be long. The string numopt2 in particular should not be parsed as an integer because the leading zeroes it may have are significant.

At this point of the parsing of the input program we have already ensured that num3 was a string of digits with an optional sign character at the beginning. In these conditions it is tempting to simply call the OCaml function int_of_string. The function int_of_string may still fail and raise an exception if the string represents a number too large to be represented as a 31- or 63-bit OCaml int.

This is easy to fix: if the program contains a literal like 1.234E9999999999999999999 causing int_of_string to fail when parsing the exponent return infinity. A vicious programmer might have written 0.000…01E9999999999999999999 but this programmer's hard-drive is not large enough to contain all the digits that would prevent infinity to be the correct answer.

Similarly if int_of_string chokes because the programmer wrote 1.234E-9999999999999999999 the function can safely return 0.0 which for the same reason is always the correctly rounded floating-point representation.

Or so it would seem. The above logic is implemented in function parse_float inside Frama-C Oxygen and this is where the bugs are.

Stop reading here for an easy debugging exercise

During a code review my colleague Boris Yakobowski and I found that Oxygen had the following unwanted behaviors. Read on for the solution.

Exhibit one: spurious warning

double d = 0.0E-9999999999999999999; 

For the program above a warning is emitted whereas the constant is in fact exactly represented. The only issue here is the spurious warning:

$ frama-c e1.c 
[kernel] preprocessing with "gcc -C -E -I.  e1.c" 
e1.c:1:[kernel] warning: Floating-point constant 0.0E-9999999999999999999 
is not represented exactly. 
Will use 0x0.0000000000000p-1022. 
See documentation for option -warn-decimal-float 

Exhibit two: confusion between zero and infinity

double d = 0.0E9999999999999999999; 

This bug is more serious:

$ frama-c e2.c 
[kernel] preprocessing with "gcc -C -E -I.  e2.c" 
e2.c:1:[kernel] warning: Floating-point constant 0.0E9999999999999999999 
is not represented exactly. 
Will use inf. 
See documentation for option -warn-decimal-float 

These two related bugs are fixed in the development version of Frama-C.

Pascal Cuoq
19th Nov 2012