← ptrdiff_t links for the week-end But wait! There may be more! →

Minimizing the number of alarms emitted by the value analysis

Pascal Cuoq - 12th Mar 2012

This blog post describes for the first time some options that became available in Nitrogen, continuing this series. This post also offers a benchmark but the results of this benchmark cannot be reproduced with Nitrogen. They should be reproducible with the next version of Frama-C.

Alarms false alarms and redundant alarms

Frama-C's value analysis computes two results for further processing: values and alarms. In this post we assume that the list of alarms is what the user is interested in. That is eir objective is to obtain a list of conditions (alarms) that ey will only need to check never happens in order to be assured that the program is defined. Defined in the sense of the value analysis that is. Understandably the user would like to spend as little time and memory as possible to obtain this list. However because human or computation time is going to be spend further processing each alarm ey would like the list to be as short as possible. Ey is willing to spend some extra time and memory shortening the list within reason.

The user still wants the list to be exhaustive. One valid way to shorten the list is to remove false alarms from it if the extra time and memory can be used to determine that some of the alarms are false alarms. Another alarm that can validly be removed from the list is a redundant alarm that cannot happen as long as another previous alarm is verified not to happen first. One example of a redundant alarm that the value analysis cannot easily remove by itself is provided in the manual:

int A  B; 
void main(int x  int y) 
{ 
  A = 100 / (x * y); 
  B = 333 % x; 
}

This example assumes signed overflows give 2's complement results (and should not be considered as dangerous). With these settings the value analysis emits two alarms:

$ frama-c -val div.c 
div.c:4: ... assert (int)(x*y) ≢ 0; 
div.c:5: ... assert x ≢ 0;

The alarm at line 5 is redundant with the alarm at line 4. The alarm at line 5 could theoretically be removed because if x*y is non-nil then x is guaranteed to be non-nil too (and the reasoning applies even if “x*y is non-nil” is taken to mean that the product is non-nil modulo 2^32).

Unrelated exercise for the motivated reader: through command-line options and program annotations guide the analyzer so that it only emits the alarm at line 4 without making any assumption that it wasn't making in the original analysis.

Example: a string manipulation function

This example was provided by some of my colleagues. More on that later. For now it suffices to say that the set-up is perfect: the command frama-c -metrics gd_full_bad.c shows that it only needs Frama_C_interval() provided in a file builtin.c with Frama-C. The analysis runs with frama-c -val .../share/frama-c/builtin.c gd_full_bad.c.

The basic approach

The green belt user having read the tutorial knows that after checking that eir analysis project does not call any missing functions and doing an imprecise analysis if said imprecise analysis is fast enough ey has the option to run the analysis again with option -slevel for a chance to obtain more precise results. More precise results means fewer alarms so a good first benchmark is to measure the effects of various values of -slevel on analysis time and how many alarms these analysis seconds gain us.

for i in 0 10 100 1000 10000 100000  
do 
  /usr/bin/time ~/ppc/bin/toplevel.opt ... -slevel $i > log_slevel_$i 2> time_slevel_$i 
done

The table below gives for each value used for the -slevel option the analysis time and the number of alarms emitted.

slevel    alarms    time(s) 
     0     13         0.61 
    10     13         0.62 
   100     13         1.0 
  1000     12         6.0 
 10000      9       105 
100000      1       441

The single alarm that remains even with -slevel 100000 is at a line preceded by the comment /* BAD */ in the source code. With the information implicitly carried in the name of the file we can assume this one to be a true alarm intended to be found and that will not go away with any value of -slevel. Further assuming there are no other bugs to be found in this code (which is what the value analysis claims) one alarm is the optimal result and it is found in exchange for a wait of a little bit more than 7 minutes. The same alarm is present in the lists of 9 12 or 13 alarms found with the other values of -slevel together with respectively 8 11 and 12 false or redundant alarms.

Improving results with advanced options

Improving results by doing less

The advanced user knows that when you are not interested in the “values” component of the value analysis' results you can save a sometimes significant amount of time by instructing the analyzer not to save them for you. An hitherto less documented option is -no-val-show-progress that similarly saves the analyzer the bother of printing progression messages for when no-one is going to read them.

slevel    alarms    time(s) 
     0     13         0.45 
    10     13         0.47 
   100     13         0.9 
  1000     12         5.5 
 10000      9        97 
100000      1       439

As you can see the difference is almost lost in the noise which means that I should probably compute confidence intervals and/or boot some of the other users out of the server I am using for this informal comparison. But I'm not going to. The shell commands in this blog post are among other things so that I can run everything again at a quieter time.

Improving results by doing more

Little-known option -remove-redundant-alarms takes some time at the end of the analysis to detect whether some alarms are syntactically identical to prior alarms. If the variables involved in both alarms were left unmodified by the program in-between the latter alarm is identified as redundant and erased from the alarms database.

Detecting whether some alarms are redundant in this manner requires the “values” part of the value analysis' results to resolve pointers so if we use this option we cannot use option -no-results. For the good that one did us option -remove-redundant-alarms is worth a try.

Note that -remove-redundant-alarms does not retroactively remove emitted alarms from the log. You can obtain the filtered list programmatically by looking for it in the GUI or with option -print.

Running again the previous tests with option -remove-redundant-alarms the difference in computation time is lost in the noise when compared to the timings without option -no-results. For this negligible price one assertion is removed from the list of 13 obtained with values 0 10 and 100 of option -slevel.

One assertion removed as redundant may not seem like much but it means that for this particular example if you are starting from -slevel 0 and are willing to spend just enough additional resources to go from 13 to 12 alarms you are better off investing in -remove-redundant-alarms that will take 0.10s to get you there than in -slevel 1000 that will require 5s of additional computation time for the same result.

$ ~/ppc/bin/toplevel.opt ... -remove-redundant-alarms -print > lrm 
$ ~/ppc/bin/toplevel.opt ... -print > lpl 
$ diff -u lpl lrm 
--- lpl	2012-03-12 21:38:54.000000000 +0100 
+++ lrm	2012-03-12 21:38:49.000000000 +0100 
@@ -42 6 +42 9 @@ 
 gd_full_bad.c:143:[value] Assertion got status unknown. 
 gd_full_bad.c:46:[value] warning: 2's complement assumed for overflow 
 [value] done for function main 
+[from] Computing for function search 
+[from] Done for function search 
+[scope] [rm_asserts] removing 1 assertion(s) 
 [value] ====== VALUES COMPUTED ====== 
 [value] Values at end of function Frama_C_interval: 
           Frama_C_entropy_source ∈ [--..--] 
@@ -293 9 +296 6 @@ 
           } 
         } 
         len = 1; 
-        /*@ assert \valid{Here}(string+next); 
-              // synthesized 
-           */ 
         byte = (int)((unsigned char)*(string + next)); 
         if (byte < 0xC0) { len = 1; } 
         else {

As this log-level comparison shows with option -remove-redundant-alarms the same alarms are initially found (there are no differences there). With the option Frama-C works harder (lines starting with + indicate work that was done only when the option was present) and removes one alarm from the final printout (lines starting with -). The assertion \valid{Here}(string+next) was found to be redundant with a previous syntactically identical assertion.

Trying to improve results by doing things differently

Experiments with using the value analysis as a C interpreter that have been alluded to in this blog led to the introduction in Nitrogen of value analysis option -obviously-terminates. I have previously described this option as emulating infinite slevel and disabling fixpoint detection which is one way to see what it does. More accurately it disables the detection of previously-propagated states. Whereas one may associate fixpoint detection with the analysis of loops the detection of previously seen states normally occurs even outside of loops. It is this detection that is disabled by the option and therefore the option also influences the analysis of loop-free programs.

The “obviously” in the name of the option is a reminder that it will only be useful for programs that terminate very obviously. Fixpoint detection and widening are the two techniques through which the value analysis terminates when analyzing programs that themselves do not terminate or do not clearly terminate. Option -obviously-terminates disables both. The consequence is that the analyzer can loop forever. When it is used to turn the value analysis into a C interpreter it makes it loop forever on C programs that loop forever — like an interpreter does. If the option is used outside “interpreter constraints” (for instance if there is initial imprecision on some inputs or if some functions are missing) then the value analysis may even loop while analyzing programs that in reality terminate for all their inputs. Lastly one should keep in mind that it is not enough that the termination of the analyzed program is obvious for the user. It needs to be obvious for the analyzer.

An example

Consider the following loop-free program:

int x  y  z  t  u  ...; 
int main (int c  char **v) 
  x = c & 1; 
  if (x) 
    x = 1; 
  else 
    x = 2; 
  y = x + 1; 
  z = y; 
  t = z - 1; 
  u = t; 
  ...

When this program is analyzed with -slevel 2 or more the analyzer spends its time at each statement wondering whether the state coming from the initial then branch has been propagated into something identical to the state coming from the initial else branch. This kind of test can be expensive especially when programs with many execution paths are analyzed with high values of -slevel. Also on this program or a program similar to this one making the test at each statement is a complete waste of time: two states will never be found to be identical if only because the state coming from the then and else branches always have different values for variable x.

Option -obviously-terminates disables these time-wasting tests and makes the analysis of a program like this one much faster.

Another very slightly different example

Consider the following following variation of the previous program:

int x  y  z  t  u  ...; 
int main (int c  char **v) 
  x = c & 1; 
  if (x) 
    x = 1; 
  else 
    x = 1; 
  y = x + 1; 
  z = y; 
  t = z - 1; 
  u = t; 
  ...

When analyzing this program without option -obviously-terminates the analyzer immediately recognizes that the states corresponding to the two branches of the if are the same state and merges them into one. Once a single state is left no further expensive inclusion tests are necessary and the analyzer zips through the rest of the program at maximal speed.

With option -obviously-terminates on the other hand the analyzer never tests whether the two branches can be merged. It basically does twice the work propagating each state in duplicate through the rest of the program.

Concluding on option `-obviously-terminates`

When you are thinking of using option -obviously-terminates:

You should wonder whether the analysis will then terminate.
You should wonder whether the analyzed program is more like the first example or more like the second one. If it is more like the first one you may hope that the analysis with what will effectively amount to infinite slevel will be faster.

When an analysis with option -obviously-terminates finishes the results obtained are functionally equivalent to results that would have been obtained with an infinite slevel.

What about the current benchmark?

The value -slevel 100000 the highest one we used also amounts to offering infinite slevel to the analyzer: the analyzer does not consume the 100000 allowed states using up only about half. One last data point we can therefore obtain is the analysis time with option -obviously-terminates in replacement of -slevel 100000.

It turns out that the benchmark is more like the second example above: the analysis takes longer with option -obviously-terminates about half an hour instead of 7 minutes. The option makes having states propagated through the same program point comparatively cheap because inclusion does not have to be tested. On the other hand because inclusion is not tested some program points see 500000 states propagated through them or about ten times more than the maximum when option -obviously-terminates is not used. Although each of them is less expensive to handle having to deal with ten times as many states hurts the bottom line for this benchmark.

Well it was worth trying. Unsurprisingly after the half hour is elapsed the same single alarm is found as in the -slevel 100000 analysis.

Conclusion

The number of options available in the Frama-C platform can be a little bit overwhelming. It is for a good reason that the value analysis tutorial emphasizes option -slevel and this option only. On the example from this post it does not hurt very much only to know about this one: it allows you to obtain the optimal result and you do not seem to be penalized by more than a few seconds for not remembering all the other ones.

Unfortunately the example in this post is only one example and in special cases it may be necessary to have more than one string to one's bow.