Overflow alarms vs informative messages about 2's complement

Pascal Cuoq - 27th Mar 2012

A privately sent question may deserve a quick blog post.

Context

The example is as below:

int x, y, s; 
main(){ 
  x = Frama_C_interval(0, 2000000000); 
  y = 1000000000; 
  s = x + y; 
  Frama_C_dump_each(); 
} 

Sorry for all the zeroes. There are nine of them in each large constant, so that the program is flirting with the limits of 32-bit 2's complement integers. Frama-C's value analysis with default options finds the following to say:

$ frama-c -val share/builtin.c o.c 
... 
o.c:6:[value] warning: 2's complement assumed for overflow 
o.c:6:[value] assigning non deterministic value for the first time 
[value] DUMPING STATE of file o.c line 7 
        x ∈ [0..2000000000] 
        y ∈ {1000000000} 
        s ∈ [--..--] 

There is an option, -val-signed-overflow-alarms, to change the behavior of the value analysis with respect to integer overflows:

$ frama-c -val -val-signed-overflow-alarms share/builtin.c o.c 
... 
o.c:6:[kernel] warning: Signed overflow. assert x+y ≤ 2147483647LL; 
o.c:6:[value] assigning non deterministic value for the first time 
[value] DUMPING STATE of file o.c line 7 
        x ∈ [0..2000000000] 
        y ∈ {1000000000} 
        s ∈ [1000000000..2147483647] 

The warning with the first command was introduced in Nitrogen, and this post therefore is in a way part of the series describing new features in Nitrogen.

The new warning, “2's complement assumed for overflow”, is emitted in exactly the same circumstances the alarm “assert x+y ≤ 2147483647LL” would have been emitted if option -val-signed-overflow-alarms had been used.

Question and answer

The question is whether the warnings obtained in each case, “2's complement assumed for overflow” and “Signed overflow. assert x+y ≤ 2147483647LL” respectively, mean the same thing. The answer is that no, they don't. The consequences of the differences between them can be seen in the values computed for variable s after the overflow.

  1. The warning “2's complement assumed for overflow” is a gentle reminder that the value analysis is unable to establish an overflow does not happen at line 6. The value analysis is, the message kindly informs, going to assume that the overflow, if it happens at all, happens on purpose. The programmer may for instance have read that blog post in addition to this one and use the appropriate options to force GCC to give 2's complement results on signed overflows. The message considered here is informative only. The user should is expected to react with “ah yes that's right my program overflows on purpose here” and go on with some of eir analysis. Only two billion and one values can actually be obtained as a result of this 2's complement addition [-2147483648..-1294967296] ∪ [1000000000..2147483647] but since the value analysis can only represent large sets of integers as a single interval the end result is that all int values appear to be possible values for variable s.
  2. The warning “Signed overflow. assert x+y ≤ 2147483647LL” contains the word “assert”. It is an alarm. The user should do something about it and in this case the user should realize ey is trying to prove that there are no undefined signed overflows in a program that really contains undefined signed overflows. Something is seriously wrong in the target program! The value analysis wants to be as precise as possible for the rest of the program so without deciding of the status of the alarm it assumes for the rest of the propagation that it was false. If the overflow alarm is false then s is guaranteed to be in the range [1000000000..2147483647]. This is a subset of the values that are possible with 2's complement addition. Now the value analysis also reserves the right to reduce after the alarm the values for variable x to the range [0..1147483647] since these are the only values that do not cause an undefined overflow. It does not take advantage of this information at this level for now but a future version will. The reasoning here is the same as the reasoning in Q1 and Q2 in the Frequently Asked Questions section of the manual.

It may seem strange that in the second case the value analysis does not affirm more strongly that there is a definite problem with some of the values in the user-specified Frama_C_interval(0 2000000000) range for variable x but that's not what it is optimized for. It is optimized for verifying that there aren't any issues not so much for reliably telling the user what original inputs cause the issues. User expertise or alternative techniques must take up from there. Think of the range [0..2000000000] for x as the result of previous computations without it being known whether the interval is so wide because the values can really happen with user-specified inputs or because of earlier over-approximations.

Stéphane Duprat Florent Kirchner and Philippe Herrmann proofread this post.

Pascal Cuoq
27th Mar 2012