← On arrays vs. pointers Benchmarking static analyzers →

The value analysis propagation order is inscrutable

Pascal Cuoq - 28th Apr 2012

Frama-C has a mailing-list. It's a place people visit for free to complain that they are not getting the quality, technical, detailed answers that they deserve, and to tell us what our priorities should be. I expressed my opinion about the mailing list a long time ago in this very blog. It hasn't gotten any better, and I recently unsubscribed. It is liberating. If you are an unsatisfied user, you should, too.

Every once in a while, there seems to be an interesting message though. My colleagues tell me about these while we wait for the elevator so not only am I not missing anything but in addition they no longer have time to launch discussions about the housing market situation in intra-muros Paris. This is what is called a win-win situation.

The recent interesting question in the mailing list is roughly “why does the value analysis behavior change when I add a statement inside nested loops and how can I get the behavior that gives the best results for what I would like to do?”. So that's two questions.

The answer to the first question is the title of this post. The value analysis works forwards on the control flow graph of the program and when there are arbitrary decisions to take such as in case of a choice which statements to propagate to next it propagates according to its whim. When looking at the control flow graph (and it is necessary to handle arbitrary control flow graphs because embedded code can use goto) the initial structure of the program is much less apparent. The notions of “outer loop” and “inner loop” pretty much disappear.

The propagation order is not specified and may even change between Frama-C versions. The addition of a single statement may also completely change the flow of the analysis inside the program with no apparent reason. Here adding one statement probably causes the analysis to re-analyze the outer loop before having converged on the inner loop causing a slightly inefficient use of the allowed -slevel value—but a large enough -slevel still allows to obtain the precise analysis one wishes.

The answer to the second question is that industrial users with SVN access can already test a new feature that gives users more control over the propagation at the scale of the individual statements. This is in the continuity of the move from a single global -slevel option to -slevel-function options that can be set ahem function by function. Oh and if you are an industrial user and you were still wondering what the new feature was for well David's program is another example where it is useful.

David's goal is apparently to get a precise value for variable k outside the loop. When a single unrelated statement is added in the wrong place it ceases to work:

$ time frama-c ... -slevel 100
...
          k ∈ [34..143]
...
user	0m0.673s

The -slevel option is made less efficient by the additional statement out of bad luck but it still works. You just have to use more of it (and analysis becomes a little bit slower):

$ time frama-c ... -slevel 620
...
          k ∈ {50}
...
user	0m1.911s

Or with the development version and a simple additional annotation inside the source code that could likely be placed automatically:

$ time bin/toplevel.opt ... -slevel 100
...
          k ∈ {50}
...
user	0m1.159s

The latter time is slightly longer but still close to the time it took with Nitrogen without the additional statement that perturbs propagation (0.72s). Actually the difference may well be only the price of handling the additional statement.