A value analysis option to reuse previous function analyses

Pascal Cuoq - 6th Sep 2012

A context-sensitive analysis

Frama-C's value analysis is context-sensitive.

This means that when a function f2() is called from a caller f1() function f2() is analyzed as many times as the analyzer goes over f1(). Function f2() is analyzed each time with a different program state—the program state corresponding to the specific call. If f2() is called from within a loop the analyzer may analyze it many times.

This is for the worthy cause of precision. Any attempt to summarize f2() so that it can be analyzed once and for all can only result in less precise analysis results.

An example

Consider the following program:

int a1 = 1; 
int a2 = 2; 
int a3 = 3; 
int p1 = -1; 
int p2 = -1; 
int p3 = -1; 
void calculate_output(void) 
{ 
  if (a1 == 1 && a2 == 2 && a3 == 3) 
  { 
    a1 = 2; 
    return; 
  } 
  if (a1 == 2 && a2 == 2 && a3 == 3) 
  { 
    a1 = 1; 
    return; 
  } 
} 
main() 
{ 
  calculate_output(); 
  p1 = a1; 
  p2 = a2; 
  p3 = a3; 
  Frama_C_dump_each(); 
  calculate_output(); 
  p1 = a1; 
  p2 = a2; 
  p3 = a3; 
  Frama_C_dump_each(); 
  calculate_output(); 
  p1 = a1; 
  p2 = a2; 
  p3 = a3; 
  Frama_C_dump_each(); 
  calculate_output(); 
} 

For the reason stated above function calculate_output() is analyzed four times when this program is analyzed with the command frama-c -val:

$ frama-c -val calc.c 
... 
[value] computing for function calculate_output <- main. 
        Called from calc.c:25. 
... 
[value] computing for function calculate_output <- main. 
        Called from calc.c:30. 
... 
[value] computing for function calculate_output <- main. 
        Called from calc.c:35. 
... 
[value] computing for function calculate_output <- main. 
        Called from calc.c:40. 
... 
[value] Values at end of function main: 
          a1 ∈ {1} 
          p1 ∈ {2} 
          p2 ∈ {2} 
          p3 ∈ {3} 
... 

An unnamed option

If you look at the program closely you may notice that the third call and the first call (and respectively the fourth and second call) have a lot in common. For each of these pairs the analysis follows the same paths inside the function and change the program state in the same way. This is because although the states differ in variables p1 p2 p3 they are identical for the variables a1 a2 a3 that are the only variables read.

There is an undocumented (and indeed unfinished) value analysis option to take advantage of previous analyses of a function call without loss of precision. It only reuses calls in circumstances similar to those that apply to the third and fourth call to calculate_output() above. When this option is enabled Frama-C after analyzing a function call computes the sets of locations that were effectively read and written for that specific call and records these inputs and outputs as well as the program state before and after the call in a cache for the purpose of later re-using the analysis just done.

With that option the log output during the analysis becomes:

[value] computing for function calculate_output <- main. 
        Called from calc.c:25. 
... 
[value] computing for function calculate_output <- main. 
        Called from calc.c:30. 
... 
calc.c:35:[value] Reusing old results for call to calculate_output 
... 
calc.c:40:[value] Reusing old results for call to calculate_output 
... 
[value] Values at end of function main: 
          a1 ∈ {1} 
          p1 ∈ {2} 
          p2 ∈ {2} 
          p3 ∈ {3} 

History and application to the RERS challenge

The option alluded to in this post was not implemented for the purpose of the RERS 2012 competition. It was implemented earlier in the context of a large case study where it contributed to speed up the analysis greatly. A lot of work would be involved in making this option work well in all contexts (and actually in making sure that derived analyses that plug into the value analysis are not confused by bits of analysis being reused).

Still although it is unfinished the mysterious option is finished enough to be useful in the context of the RERS competition. It is particularly useful when determining the status of liveness properties such as (F oW). Whether infinite execution traces are detected with a counter or with a quantum superposition of all previous states the extra program state introduced by the instrumentation does not influence the behavior of function calculate_output(). In that context it is a huge gain to be able to re-use the analysis of calculate_output() which would otherwise for each state have to be duplicated for all possible predecessors.

Boris Yakobowski invented and implemented the option this post refers to without which the hardest properties in the RERS challenge would be intractable.

Pascal Cuoq
6th Sep 2012