New alarm category

Pascal Cuoq - 26th Sep 2011

There won't be many changes in the value analysis' documentation in Nitrogen. For lack of time, the new options, some of which were alluded to in this blog, will remain in their \to be documented" state. But documenting them more fully can always be done here too once they are released.

The one big change in the value analysis manual follows. It is a section describing a new kind of alarm and inserts itself in section 3.2.2 "Log messages emitted by the value analysis — Proof obligations".

Partially overlapping lvalue assignment

Vaguely related to but different from undefined side-effects in expressions the value analysis warns about the following program:

struct S { int a; int b; int c; }; 
struct T { int p; struct S s; }; 
union U { struct S s; struct T t; } u; 
void copy(struct S *p  struct S *q) 
{ 
  *p = *q; 
} 
int main(int c  char **v){ 
  u.s.b = 1; 
  copy(&u.t.s  &u.s); 
  return u.t.s.a + u.t.s.b + u.t.s.c; 
} 

The programmer thought ey was invoking implementation-defined behavior in the above program using an union to type-pun between structs S and T. Unfortunately this program returns 1 when compiled with clang -m32; it returns 2 when compiled with clang -m32 -O2 and it returns 0 when compiled with gcc -m32.

For a program as simple as the above all these compilers are supposed to implement the same implementation-defined choices. Which compiler if we may ask such a rhetorical question is right? They all are because the program is undefined. When function copy() is called from main() the assignment *p = *q; breaks C99's 6.5.16.1:3 rule. This rule states that in an assignment from lvalue to lvalue the left and right lvalues must overlap either exactly or not at all.

The program breaking this rule means compilers neither have to emit warnings (none of the above did) nor produce code that does what the programmer intended whatever that was. Launched on the above program the value analysis says:

... Partially overlapping lvalue assignment "*p=*q;". 
Left address in bits: {{ u -> {32} }}. 
Right address in bits: {{ u -> {0} }}. 
assert(separated or same) 

Acknowledgments and moral of the story

I first heard about the 6.5.16.1:3 rule from Chucky Ellison and he had himself heard about it from Derek M. Jones. Later the Csmith random program generator generated a program that computed differently when compiled with Clang and when interpreted by the value analysis (and indeed that computed differently with different Clang optimization levels) all because of this rule. This is technically a bug in (that development version of) Csmith which is supposed to generate defined C programs. But it successfully turned my hate of false positives against my laziness and forced me to implement detection for programs that break this rule.

Pascal Cuoq
26th Sep 2011