Portable modulo signed arithmetic operations

Pascal Cuoq - 26th Feb 2013

In a recent post, John Regehr reports on undefined behaviors in Coq! In Coq! Coq! Exclamation point!

Coq is a proof assistant. It looks over your shoulder and tells you what remains to be done while you are elaborating a mathematical proof. Coq has been used to check that the 4-color theorem really holds. It would be supremely ironic if an undefined behavior in a bit of C code in the OCaml implementation used to compile Coq caused it to accept as valid some incorrect proofs for instance in the formally verified (in Coq) compiler CompCert.

Your mission should you decide to accept it

Where are these undefined behaviors? They are in several places but some of them are in the implementation of the bytecode arithmetic operations. In OCaml basic operations such as addition and multiplication are specified as returning the mathematical result modulo 2^31 2^32 2^63 or 2^64 depending on the compilation platform and the integer type.

The problem here is not limited to OCaml. The Java Virtual Machine too specifies that arithmetic operations on its 32-bit integer type return results modulo 2^32. Someone wishing to implement a JVM in portable C would face the same issues.

In fact this is just such a big issue that we should collectively try to work out a solution now.

A safer signed addition instruction

Let us assume a 32-bit platform. In this case values of the unboxed OCaml type int range between -2^30 and 2^30-1 (31 bits of information). They are stored at run-time in the 31 most significant bits of a machine word with the least significant bit set to one to distinguish it from a pointer. This representation is taken into account when applying arithmetic operations to machine words representing unboxed integers.

Here is an example in file interp.c in the OCaml implementation. We are in the middle of the large switch that decodes bytecode instructions:

    Instruct(ADDINT): 
      accu = (value)((intnat) accu + (intnat) *sp++ - 1); Next; 

The - 1 in this expression compensates for the fact that the least significant bit is set in both operands and we want it to be set in the result (and no carry to be propagated). But as written there is an undefined overflow in the C implementation of the bytecode interpreter when the unboxed integers in accu and *sp total more than 2^30-1. In order to avoid a signed overflow here the instruction ADDINT could be written:

      accu = (value)((uintnat) accu + (uintnat) *sp++ - 1U); Next; 

This solution takes advantage of the fact that unlike signed addition the behavior of unsigned addition when the result overflows the destination type is defined by the C standard. The solution above does make assumptions about the conversion from a signed to an unsigned type and vice versa but these are implementation-defined (and we are only targeting implementations that define these conversions as 2's-complement).

No overhead is involved in this approach: a half-decent C compiler should generate the same instruction sequence for the new version as it already generates for the old version.

Multiplication

You may be vaguely worried about multiplication. For instance you may remember that while in a typical instruction set there is only one addition instruction that works for both signed and unsigned operands there are dedicated signed multiplication and unsigned multiplication instructions. These are IMUL and MUL in the x86 family of processors.

Fear not! The two multiplications are only because on x86 and many other processors 32-bit multiplication returns a 64-bit result. If you want all 64 bits of the results to be what you expect you need to choose the instruction that interprets the operands' highest bit correctly. But if on the other hand you only care for the lowest 32 bits of the result (that is you only want a result modulo 2^32) then it does no matter that you use unsigned multiplication. Think of it as (x + k1 * 2^32) * (y + k2 * 2^32) for unknown k1 and k2.

And this is our case. Thus in interp.c the implementation for bytecode instruction MULINT could be changed from:

Instruct(MULINT): 
  accu = Val_long(Long_val(accu) * Long_val(*sp++)); Next; 

to:

Instruct(MULINT): 
  accu = Val_long((intnat)((uintnat)Long_val(accu) * (uintnat)Long_val(*sp++))); Next; 

Conclusion

As pointed out by John Regehr in the comments the two fixes above are appropriate because the OCaml language definition specifies wrap-around behavior for its (signed) arithmetic operations. One should not assume that all the warnings emitted by IOC have a similar cause. Some may be deeper bugs requiring the logic to be fixed instead of conversions to unsigned types that only ensure wrap-around behavior.

Issues of the nature discussed here should be in the bytecode version of the OCaml compiler only. By contrast the native compiler should translate OCaml-level multiplication directly to assembly multiplication bypassing the portable^Wwoefully underspecified assembly that is the C language.

Even programs generated by the native compiler interface with bits of OCaml runtime written in C but this runtime is small.

So actually there is not much cause for worry. If you are using a C compiler or a static analysis platform that involve OCaml code on a modern computer they are probably already compiled with the native OCaml compiler.

Pascal Cuoq
26th Feb 2013