
When True Is Not True Anymore

We all know that accessing uninitialized variables in C and C++ usually leads to some kind of undefined behavior, which one generally wants to avoid. What I didn’t know until recently is that uninitialized bool values can be especially malicious beasts. To see what they can do to you, take a look at the following program and try to predict its output:

#include <string>
#include <iostream>

using namespace std;

namespace {

    inline string stringify(const bool value)
    {
        return (value ? "true" : "false");
    }

    struct Struct
    {
        long l;
        bool u;
    };
}

int main()
{
    Struct s;

    if(true != s.u)
        cout << stringify(true) << " != " << stringify(s.u) << endl;
}

Now type g++ stringify.cc -o stringify and run the generated executable. Here is what you might see on some platforms:

$ ./stringify 
true != true

Yes, you got that right: true != true! I get this behavior with g++-4.3 and g++-4.4 on Gentoo (x86 and x86_64) as well as with g++-4.1 and g++-4.4 on Ubuntu 9.10 (x86_64). Before attempting to explain what happened here, I want to summarize a few additional facts:

  • Several attempts to make this program shorter without ending up with something boring failed.
  • Turning on optimization causes g++ to optimize away all if statements (especially the implicit one in stringify) under the assumption that s.u is false, which again leads to much saner output.
  • I could not reproduce this with icc-11.1.

So, what happened? Is g++ broken? Actually the answer is no. Accessing uninitialized memory leads to undefined behavior, and undefined means undefined. In fact the C++0x Final Committee Draft contains a footnote that explicitly mentions the oddity we have just seen:

47) Using a bool value in ways described by this International Standard as “undefined,” such as by examining the value of an
uninitialized automatic object, might cause it to behave as if it is neither true nor false.

This is not that surprising if one considers that at the assembler level a bool is represented not by a single bit but by at least a byte. An uninitialized byte can hold any of 256 different values, not just two. One could of course consistently map 0 to false and everything else to true, but this is not what g++ does. To see what I mean, take a look at the following assembler snippet that g++ generated for the if statement in line 24:

movzbl  -40(%rbp), %eax  # move s.u to eax.
xorl    $1, %eax         # xor eax with 1.
testb   %al, %al         # check if the low byte of eax is 0.
je      .L8              # jump to .L8 if so.

If the jump is taken, the body of the if statement in line 24 is skipped, otherwise it is executed. Now the xorl in line 2 switches the lowest bit in eax, leaving all other bits unchanged. Therefore s.u is considered to be equal to true if and only if it has the byte value 0x01.

Now let’s take a look at the assembler that represents the ternary operator in stringify:

cmpb $0, -36(%rbp)    # compare the argument with 0.
je   .L2              # jump to .L2 if the argument is 0.
movl $.LC0, %eax      # store "true" in eax.
jmp  .L3              # jump out.
.L2:
movl $.LC1, %eax      # store "false" in eax.
.L3:

Here g++ maps 0 to false and every other value to true. This means that if the actual byte value of s.u is for example 0xFF (which for some reason is what cgdb keeps telling me locally), the if in line 24 will be taken as if s.u was false, but stringify will behave as if s.u was true.

Categories: Programming
  1. July 19, 2010 at 05:02

    I think the moral is that you better explicitly initialize your bools to false. Or at least memset the whole thing to zeros.

    But hey. I was duped. I assumed the bool would be 0 to start with. I guess C++ does not guarantee that though.

    • July 19, 2010 at 12:09

      Well, the C++0x Final Committee Draft says:


      To default-initialize an object of type T means:
      — if T is a (possibly cv-qualified) class type (Clause 9), the default
      constructor for T is called (and the initialization is ill-formed if T
      has no accessible default constructor);
      — if T is an array type, each element is default-initialized;
      — otherwise, no initialization is performed.

      […]

      If no initializer is specified for an object, the object is default-initialized; if no initialization is performed, an object with automatic or dynamic storage duration has indeterminate value. [ Note: objects with static or
      thread storage duration are zero-initialized, see 3.6.2. — end note ]

      I think the best way to go is not to think about that too much, but to always initialize fundamental types explicitly before they are used, as suggested by Effective C++ Item 4.

  2. dcsdc
    July 19, 2010 at 09:21

    You’re not supposed to test booleans that way; you’re supposed to just use bool or !bool, not compare to true or false.

    • July 19, 2010 at 11:42

      Imagine that the ‘true’ in ‘true != s.u’ is another bool variable, which was in fact the case in draft versions of the example given above. I dropped that to make the code shorter.

    • Tom
      July 19, 2010 at 17:18

      I bet if he did the same test with if(!var) he’d get similarly undefined behavior as well.

      That’s just a common idiom, not a mandatory requirement, testing booleans with == is a perfectly valid way of doing it.

      • July 19, 2010 at 17:48

        You are right indeed; writing ‘!s.u’ (which I would normally do) instead of ‘true != s.u’ changes nothing. The reason for writing ‘true != s.u’ is in part historical, as pointed out above, and in part that it fits better with the console output that follows.

  3. July 19, 2010 at 09:46

    That’s kinda like PHP Typecasting:

    // string literal
    $string = "false";

    // evaluating string against false should give true
    // but PHP’s typecasting it to a bool so it makes a
    // bool out of string
    if($string == false)
        echo '$string is bool(false)';

    Some languages are seriously kinda fucked up.

    • July 19, 2010 at 15:32

      Well, you are talking about implicit type conversions, which is another, albeit sometimes confusing, topic altogether. Having said that, I don’t believe that C++ is ‘fucked up’ because it doesn’t implicitly zero-initialize all variables. If you compile the problematic code from above with ‘g++ -O2 -Wall -Werror whatever.cc -o whatever’ (the ‘-O2’ is important, as otherwise g++ doesn’t analyze the code thoroughly enough to find out that we are accessing uninitialized memory), you get a compile-time error, as it should be. Unfortunately these settings don’t completely guard you against accessing uninitialized memory, but they help most of the time.

    • DarkCryst
      July 19, 2010 at 22:03

      There is no way in hell that should ever evaluate to false. It’s a set string – a string with any content will always evaluate to true.

      Just because you don’t understand how boolean comparisons and typecasting work don’t blame the language.

      “evaluating string against false should give true” – The only string that will EVER evaluate to false is an empty one.

      • July 20, 2010 at 13:02

        I understand how it works, but it’s a common misconception in PHP. If you want to test for the same type, you need to use === instead of ==, because PHP will cast anything to bool that’s named like a boolean value.

        Example: What if your user answers a question with false? It breaks your script! Why? Because == mysteriously tries to typecast it and you should’ve used === instead. Anti-awesomeness guaranteed.

  4. Joao
    December 11, 2010 at 07:00

    The problem isn’t with gcc.

    true is the same as 1.
    Boolean values can be anything that fits in one byte. So the default value of s.u might be, e.g., 50, which is different from true but evaluates to true in cout.

    If s.u is an even number different than 0, it will be considered false in the comparison.

    try recompiling your code with:
    cout << true << " != " << s.u << endl; // no stringify

    my output now was:
    1 != 136

    which is right and proves my point.

    • December 12, 2010 at 17:18

      You might want to read my blog post again… “your point” is actually quite similar to “my point”.
