Truthiness in C

This week someone mentioned that C supports casting from int to bool. That naturally triggered my curiosity – what does the generated code look like?

First, I think it’s important to point out that many casts in C are “free”: the compiler changes its internal understanding of an expression, but in the actual assembly nothing really changes, because registers don’t have types. (NB: this is architecture dependent.)

For example, consider:

char a(int x) {
    return (char)x;
}

int b(char x) {
    return (int)x;
}

char *c(int *x) {
    return (char *)x;
}

The generated code looks like this (one instruction per function, in the same order, with the ret omitted):

        mov     eax, edi
        movsx   eax, dil
        mov     rax, rdi

Clearly we are just moving values between the argument register and the return register. So the cast is “free” (ignoring the sign extension in b).
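To make the “free” claim concrete at the C level, here is a minimal sketch. The names widen, as_bytes and check_free_casts are hypothetical helpers of mine, not part of the code linked from this post. It only checks that the casts preserve the value or address they are given; it says nothing about which instructions a particular compiler emits.

```c
#include <assert.h>
#include <limits.h>

/* Same shape as b() above: widen a char to an int. */
int widen(char x)
{
    return (int)x;
}

/* Same shape as c() above: view an int pointer as a char pointer. */
char *as_bytes(int *x)
{
    return (char *)x;
}

/* Hypothetical self-check, not part of the original post. */
void check_free_casts(void)
{
    int value = 42;
    char smallest = CHAR_MIN;

    /* The pointer cast changes the type, not the address. */
    assert((void *)as_bytes(&value) == (void *)&value);

    /* Widening preserves the value. Where char is signed (as on
       x86-64 System V) CHAR_MIN is negative, so the upper bits of
       the result must be filled with ones: that is the movsx. */
    assert(widen(smallest) == CHAR_MIN);
}
```

Calling check_free_casts() from main should pass silently on any conforming compiler, whether char is signed or unsigned.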

But what about casting from int to bool?

#include <stdbool.h>
bool d(int x) {
    return (bool)x;
}

Well, the compiler gives us:

        test    edi, edi
        setne   al

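We see that it generates code to first test edi and then to setne al. test ANDs edi with itself purely to set the status flags, so the zero flag ends up set exactly when x is 0. setne then writes 1 to al if the zero flag is clear and 0 otherwise. al is the lowest 8-bit subregister of rax, and rax is used to return integer values from a function in the System V ABI.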
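In C terms, the test/setne pair computes x != 0. The sketch below checks that equivalence for a handful of values. It assumes C99 or later with <stdbool.h>, and the names cast_to_bool, compare_with_zero and check_bool_cast are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same cast as d() above. */
bool cast_to_bool(int x)
{
    return (bool)x;
}

/* What test/setne computes: 1 if x is nonzero, 0 otherwise. */
bool compare_with_zero(int x)
{
    return x != 0;
}

/* Hypothetical self-check, not part of the original post. */
void check_bool_cast(void)
{
    const int samples[] = { 0, 1, -1, 2, 256, -4096 };
    size_t count = sizeof samples / sizeof samples[0];

    for (size_t i = 0; i < count; i++) {
        bool casted = cast_to_bool(samples[i]);

        /* The cast and the comparison always agree. */
        assert(casted == compare_with_zero(samples[i]));

        /* The result is normalized to exactly 0 or 1. */
        assert(casted == 0 || casted == 1);
    }
}
```

The same normalization is what the !!x idiom spells out by hand in code that predates bool.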

Note that the top 56 bits in rax are not zeroed; they contain junk. This is fine because the compiler will only make callers look at the low byte of the register (al) for boolean operations, never the rest of it. This is why changing the compiler’s “understanding” (i.e. the cast) is necessary.
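Since callers only inspect the low end of the register, the callee cannot get away with a “free” narrowing: it has to fold all 32 bits of the int into that low byte. A small sketch of the difference, again assuming C99 <stdbool.h> and using hypothetical names (truncate_to_byte, to_bool, check_truncation_vs_bool):

```c
#include <assert.h>
#include <stdbool.h>

/* Keeps only the low 8 bits, the way a plain narrowing would. */
unsigned char truncate_to_byte(int x)
{
    return (unsigned char)x;
}

/* Collapses every nonzero value to 1. */
bool to_bool(int x)
{
    return (bool)x;
}

/* Hypothetical self-check, not part of the original post. */
void check_truncation_vs_bool(void)
{
    /* 256 is 0x100: nonzero, but its low byte is zero. */
    assert(truncate_to_byte(256) == 0);
    assert(to_bool(256) == true);

    /* Zero stays false either way. */
    assert(truncate_to_byte(0) == 0);
    assert(to_bool(0) == false);
}
```

If the cast to bool were just a truncation, 256 would come back as false, which is exactly the case the test instruction guards against.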

So the cast costs two instructions where a “free” cast needs only a mov. Not too bad for how useful it is.

See the full code here.


Thanks to thxg on HN for pointing out that I was incorrectly compiling with -m32 but talking about x86-64.

And also to AshamedCaptain for pointing out the x86-centric assumptions.