Representation of _Bool

From Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups comp.lang.c
Subject Representation of _Bool
Date 2021-05-23 19:14 -0700
Organization None to speak of
Message-ID <87tums515a.fsf@nosuchdomain.example.com>


As promised, I've studied what the C standard says about the
requirements for the representation of _Bool.  I've referred to the
C11 standard and to drafts of C17 and C2x (N2596).  C11 and C17 do
not differ in this area as far as I can tell, but there are some
new things in the C2x proposal.

An object declared as type _Bool is large enough to store the values
0 and 1.

_Bool is an unsigned integer type.

The rank of _Bool shall be less than the rank of all other standard
integer types.  This implies that the range of values of _Bool is
a subrange of the range of values of unsigned char.  A _Bool object
cannot store a value less than 0 or greater than UCHAR_MAX.

When any scalar value is converted to _Bool, the result is 0 if the
value compares equal to 0; otherwise, the result is 1.  This makes
it difficult, but not impossible, to store a value other than 0
or 1 in a _Bool object.  It can be done (or at least attempted)
via type-punning using a union with _Bool and unsigned char members.

C11 footnote: "While the number of bits in a _Bool object is at least
CHAR_BIT, the width (number of sign and value bits) of a _Bool may be
just 1 bit."  This acknowledges that _Bool *may* have more than one
value bit, and therefore may represent values other than 0 and 1.
N2596 drops the parenthesized clause (probably because _Bool has
no sign bit).

N2596 adds a macro BOOL_WIDTH to <limits.h>, "width for an object
of type _Bool".  It is *at least* 1, implying again that it can
be greater than 1.  (I don't see any implementation that defines
BOOL_WIDTH.)

(N2596 also changes the definitions of false and true in <stdbool.h>
so they're of type _Bool rather than int.  This doesn't affect
representation.)

Conclusions:

sizeof (_Bool) >= 1.  It may be greater than 1, but that would
be weird.  If sizeof (_Bool) > 1, then it must have padding bits.

_Bool has no sign bit.

_Bool has *at least* one value bit.  It may have more, but no more
than CHAR_BIT of them.

The standard allows some variations in how _Bool is represented.
C programmers would be well advised to avoid writing code for which
this matters.

A conforming implementation may do any of the following (I'll assume
for brevity that CHAR_BIT==8):

* _Bool has 8 value bits.  Any value from 0 to 255 inclusive
  is valid.  Storing a value other than 0 or 1 can be done via
  type punning using a union of a _Bool and an unsigned char.

* _Bool has 1 value bit and 7 padding bits, with 254 trap
  representations.  Using type punning to store a value other than
  0 or 1 in a _Bool object, and then accessing that object's value,
  results in undefined behavior.

* _Bool has 1 value bit, 7 padding bits, and no trap representations.
  Since padding bits by definition do not contribute to the value,
  only the value bit's value is relevant.  Using type punning to store
  a value other than 0 or 1 in a _Bool object gives it a value of 0
  if the value is even, 1 if the value is odd.

Other variations are possible (and arguably silly).  For example, _Bool
might have 4 value bits and 4 padding bits, or it might be bigger than
1 byte.  I expect that kind of thing only on the DeathStation 9000.

Here's a small program that attempts to explore how an implementation
represents objects of type _Bool:

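#include <stdio.h>
#include <limits.h>

/* Lets the same storage be viewed as a _Bool or as a raw byte. */
union U {
    _Bool b;
    unsigned char rep;
};

int main(void) {
    union U obj;
    _Bool b;
    /* Store each raw byte 0..3, then read it back through the _Bool member. */
    for (obj.rep = 0; obj.rep <= 3; obj.rep ++) {
        printf("obj.b = %d, which is %s, obj.rep = %d",
               obj.b, obj.b ? "true " : "false", obj.rep);
        /* Simple assignment converts the right operand to _Bool. */
        b = obj.b;
        printf(" ... b = %d, which is %s\n", b, b ? "true " : "false");
    }
}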

Using gcc 11.1.0, on Ubuntu 20.02 x86_64, I get this output:
obj.b = 0, which is false, obj.rep = 0 ... b = 0, which is false
obj.b = 1, which is true , obj.rep = 1 ... b = 1, which is true
obj.b = 2, which is true , obj.rep = 2 ... b = 2, which is true
obj.b = 3, which is true , obj.rep = 3 ... b = 3, which is true

This mostly looks like _Bool has 8 value bits, but if that were the
case, then I *think* that the value of b would always be 0 or 1.
The rules of simple assignment (b = obj.b) specify that the value
of the right operand is converted to the type of the assignment
expression.  Converting *any* scalar value to _Bool yields 0 or 1,
even if the value is already of type _Bool.  So I conclude that
for gcc, 2 and 3 (and probably anything other than 0 or 1) are
trap representations for _Bool, and that _Bool has 1 value bit,
7 padding bits, and 254 trap representations.

It's possible that the intent is for _Bool to have 8 value bits and the
gcc authors' interpretation of the requirements for simple assignment
differs from mine.  (I won't presume to say who's right.)

Using clang 12.0.0 on the same system, I get:
obj.b = 0, which is false, obj.rep = 0 ... b = 0, which is false
obj.b = 1, which is true , obj.rep = 1 ... b = 1, which is true
obj.b = 0, which is false, obj.rep = 2 ... b = 0, which is false
obj.b = 1, which is true , obj.rep = 3 ... b = 1, which is true

All bits other than the low-order one are ignored.  This is
consistent with _Bool having 1 value bit, 7 padding bits, and no
trap representations.  It's also consistent with 2 and 3 being
trap representations, since that would cause undefined behavior.
It's not consistent with _Bool having more than 1 value bit.

When implementers add support for BOOL_WIDTH, they'll have to decide
explicitly how many value bits _Bool has.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */