Arithmetic overflows in Fluorine

Pascal Cuoq - 9th Jul 2013

There is a C quiz in the middle of this post, lost in the middle of all the reminiscing.

A history of arithmetic overflows in Frama-C

From the very beginnings in 2005, until after the first Open Source release in 2008, Frama-C's value analysis was assuming that all arithmetic overflows produced two's complement results. This seemed the right thing to do at the time.

Then an option was introduced to emit alarms on undefined overflows. John Regehr suggested it after testing one of the Open Source versions. The option was turned off by default. If a value analysis user turned the option on, any undefined arithmetic overflow in the program would be detected and reported as a possible error, with the same gravity as dereferencing NULL or accessing an array out of its bounds.

Later, a helpful reminder was added to the value analysis' output: in the default behavior of not emitting alarms, an informative message was emitted instead—if such an overflow was detected—about two's complement being assumed.

There was one last change in the last release, Fluorine. Actually, two changes: the name of the option for emitting alarms on undefined overflows changed, and the default value changed. The setting is now to emit alarms by default, and can be changed to not emitting them, for instance if the target code is destined to be compiled with gcc -fno-strict-overflow -fwrapv, in which case all overflows happening during execution can be expected to produce two's complement results.

One aspect remains unchanged in the above evolution: the discussion only applies to undefined overflows.

The philosophy was always to analyze programs as they were written, and not to force any change of habit on software developers. The initial choice not to warn about overflows was because we knew there were so many of these—most of them intentional—that we would be deluging the user with what would feel like a flood of false positives.

The gradual shift towards more arithmetic overflow awareness is a consequence of the fact that in C, some arithmetic overflows are undefined behavior. Compilers display increasing sophistication when optimizing the defined behaviors to the detriment of the predictability of undefined ones. To make a long story short, the “overflows produce 2's complement results” heuristic was wrong for some programs as compiled by some optimizing compilers.

In keeping with the same philosophy, “overflows” that are defined according to the C99 standard have always been treated by the value analysis plug-in with the semantics mandated by the standard. Those overflows that the standard says must have “implementation-defined” results are treated with the semantics that the overwhelming majority of compilation platforms give them (and it remains possible to model other architectures as the need arises).

A quiz

Other static analyzers may also warn for arithmetic overflows, but the philosophy can be different. The philosophy may for instance be that any overflow, regardless of whether it is defined or implementation-defined according to the letter of the standard, might be unintentional and should be brought to the attention of the developer.

In the few examples below, the goal is to predict whether Frama-C's value analysis with its new default setting in Fluorine would emit an alarm for an overflow. For extra credit, a secondary goal is to predict whether another static analyzer that warns for all overflows would warn. We assume a 32-bit int and a 16-bit short, same as (almost) everybody has.

int a = 50000; 
int b = 50000; 
int r = a * b; 
unsigned int a = 50000; 
unsigned int b = 50000; 
unsigned int r = a * b; 
int a = 50000; 
int b = 50000; 
unsigned int r = a * b; 
short a = 30000; 
short b = 30000; 
short r = a * b; 
unsigned short a = 50000; 
unsigned short b = 50000; 
unsigned int r = a * b; 

Answers

int a = 50000; 
int b = 50000; 
int r = a * b; 

There is an overflow in this snippet (mathematically, 50000 * 50000 is 2500000000, which does not fit in an int). This overflow is undefined, so the value analysis warns about it.

unsigned int a = 50000; 
unsigned int b = 50000; 
unsigned int r = a * b; 

The multiplication is an unsigned int multiplication, and when the mathematical result of unsigned operations is out of range, the C99 standard mandates that overflows wrap around. Technically, the C99 standard says “A computation involving unsigned operands can never overflow, …” (6.2.5:9) but we are using the word “overflow” with the same meaning as everyone outside the C standardization committee including Wikipedia editors.

To sum up in the C99 standard overflows in signed arithmetic are undefined and there are no overflows in unsigned arithmetic (meaning that unsigned overflows wrap around).

int a = 50000; 
int b = 50000; 
unsigned int r = a * b; 

The multiplication is again a signed multiplication. It does not matter that the result is destined to an unsigned int variable because in C types are inferred bottom-up. So the value analysis warns about an undefined overflow in the multiplication here.

short a = 30000; 
short b = 30000; 
short r = a * b; 

There is no overflow here in the multiplication because the last line behaves as short r = (short) ((int) a * (int) b);. The justification for this behavior can be found in clause 6.3.1 of the C99 standard about conversions and promotions (the general idea is that arithmetic never operates on types smaller than int or unsigned int. Smaller types are implicitly promoted before arithmetic takes place). The product 900000000 does fit in the type int of the multiplication. But then there is a conversion when the int result is assigned to the short variable r. This conversion is implementation-defined so the value analysis does not warn about it but another static analyzer may choose to warn about this conversion.

unsigned short a = 50000; 
unsigned short b = 50000; 
unsigned int r = a * b; 

Perhaps contrary to expectations there is an undefined overflow in the multiplication a * b in this example. Right in the middle of the aforementioned 6.3.1 clause in the C99 standard on the subject of the promotion of operands with smaller types than int the following sentence can be found:

If an int can represent all values of the original type the value is converted to an int; otherwise it is converted to an unsigned int.

All values of a 16-bit unsigned short fit a 32-bit int so the third line is interpreted as unsigned int r = (unsigned int) ((int) a * (int) b);.

Incidentally things would be completely different in this last example if int and short were the same size say if int was 16-bit. In this case the third line would be equivalent to unsigned int r = (unsigned int) a * (unsigned int) b; and would only contain an unsigned innocuous overflow.

Wrapping up

In Fluorine the option to activate or deactivate the emission of these undefined arithmetic overflow alarms is called -warn-signed-overflow (the opposite version for no alarms being -no-warn-signed-overflow). I felt that providing this piece of information earlier would have rendered the quiz too easy.

Although Frama-C's value analysis adheres to the semantics of C and only warns for undefined overflows it is possible to use Frama-C to check for the other kinds of overflows by using another plug-in Rte to automatically annotate the target C program with ACSL assertions that express the conditions for overflows. Note that that post pre-dates the Fluorine release and is written in terms of the old options.

Pascal Cuoq
9th Jul 2013