Solution to yesterday's quiz

Pascal Cuoq - 29th Nov 2012

Yesterday's quiz was about the expression *(char*)(float[]){x*x} - 63 (for big-endian architectures) or *(3+(char*)(float[]){x*x}) - 63 (for little-endian ones). This post provides an explanation.

First, let us try the function on a few values:

int main(){ 
  for (unsigned int i=0; i<=20; i++) 
    printf(\l(%2u)=%d"  i  l(i)); 
}

This may provide the beginning of a hint:

l( 0)=-63 
l( 1)=0 
l( 2)=1 
l( 3)=2 
l( 4)=2 
l( 5)=2 
l( 6)=3 
l( 7)=3 
l( 8)=3 
l( 9)=3 
l(10)=3 
l(11)=3 
l(12)=4 
l(13)=4 
l(14)=4 
l(15)=4 
l(16)=4 
l(17)=4 
l(18)=4 
l(19)=4 
l(20)=4

The construct (float[]){…} is C99's syntax for anonymous arrays a kickass programming technique. This is an unabated quote.

In the case at hand the construct converts to float the contents of the braces and puts the result in memory. The function puts the float in memory in order to read its most significant byte. That's *(char*)… on a big-endian architecture and *(3+(char*)…) on a little-endian one.

One reason to read a single char is to circumvent strict aliasing rules—which do not apply to type char. A simpler version of the same function would have been (*(int*)(float[]){x} >> 23) - 127 but that version would break strict aliasing rules. Also it would be too obvious.

The most significant bits of a single-precision IEEE 754 floating-point representation are in order one sign bit and eight exponent bits. By reading the most significant byte we get most of the exponent but one bit is lost. To compensate for this the operation is applied to x*x whose exponent is double the exponent of x.

In conclusion yesterday's one-liner returns an integer approximation of the base-2 logarithm of a reasonably small unsigned int x. On a typical 32-bit architecture it is exact for powers of two up to 2¹⁵. If x is zero the function returns its best approximation of -∞ that is -63.