
Re: Representation of _Bool

From Keith Thompson <Keith.S.Thompson+u@gmail.com>
Newsgroups comp.lang.c
Subject Re: Representation of _Bool
Date 2025-01-17 13:34 -0800
Organization None to speak of
Message-ID <87ed116s5e.fsf@nosuchdomain.example.com> (permalink)
References <87tums515a.fsf@nosuchdomain.example.com> <42fcea7270de500367eceea7ad5530fd@www.novabbs.com>



learningcpp1@gmail.com (m137) writes:
> Hi Keith,
>
> Thank you for posting this.

The message being referred to is one I posted Sun 2021-05-23, with
Message-ID <87tums515a.fsf@nosuchdomain.example.com>.  It's visible on
Google Groups at
<https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

As others have suggested, please include attribution information when
posting a followup.  You don't need to quote the entire message,
but provide at least some context, particularly when the parent
message is old.

This is an update to that message.

>                             I noticed that the newer drafts of C23
> (N2912 onwards, I think) have replaced the term "trap representation"
> with "non-value representation":
> - **Trap representation** was last defined in [N2731
> 3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
> as "an object representation that need not represent a value of the
> object type."
> - **Non-value representation** is most recently defined in [N3435
> 3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
> as "an object representation that does not represent a value of the
> object type."
>
> The definition of non-value representation rules out object
> representations that represent a value of the object type from being
> non-value representations. So it seems to be stricter than the
> definition of trap representation, which does not seem to rule out such
> object representations from being trap representations. Is this
> interpretation correct?

I don't believe so.  As far as I can tell, a "non-value
representation" (C23 and later) is exactly the same thing as a "trap
representation" (C17 and earlier).  The older term was probably
considered unclear, since it could imply that a trap is required.
In fact, reading an object with a trap/non-value representation
has undefined behavior, which can include yielding the value you
might have expected.

> If so, what happens to the 254 trap representations that GCC and Clang
> reserve for `_Bool`?

I see no evidence in gcc's documentation that gcc treats
representations other than 0 or 1 as trap/non-value representations.
I see only two references to "trap representation", one for signed
integer types (saying that there are no trap representations)
and one regarding type-punning via unions.  There are no relevant
references to "padding bits".

I'm less familiar with clang's documentation, but I see no reference
to "trap representation" or "non-value representation".

We can get some information about this by running a test program.
See below.

>                      Assuming a width of 1, each of those 254 object
> representations represents a value in `_Bool`'s domain (the half whose
> value bit is 1 represents the value `true`, while the other half whose
> value bit is 0 represents the value `false`), so they cannot be thought
> of as non-value representations (since a non-value representation must
> be an object representation that **does not** represent a value of the
> object type).

Reading an object with a non-value representation has undefined
behavior.  If the observed value happens to be a valid value of the
object's type, that's still consistent with undefined behavior.
*Everything* is consistent with undefined behavior.

> I've been stuck on this for quite some time, so would be grateful for
> any guidance you could provide.

Editions of the C standard earlier than C23 were not entirely
clear about the representation of _Bool.  (C90 does not have _Bool
or bool.  C99 through C17 have _Bool as a keyword, with bool as
a macro defined in <stdbool.h>.  C23 has bool as a keyword, with
_Bool as an alternate spelling.)

In C99 and later, _Bool/bool is required to be an unsigned integer
type large enough to hold the values 0 and 1.  Its size must be at
least CHAR_BIT bits (which is at least 8).  The *rank* of _Bool is
less than the rank of all other standard integer types.

The rank implies that the range of values is a subset of the
range of values of any other unsigned integer type.  The rank does
*not* imply anything about relative sizes.  unsigned char has a
higher rank than bool, but bool could have additional padding bits
making sizeof(bool)>1.  (Probably no implementation does this.)
unsigned char has no padding bits.

C11 implies that _Bool can have more than one value bit, which
means it could represent values greater than 1 (though its range
cannot extend beyond 0..UCHAR_MAX).

C23 (I'm using the N3096 draft) tightens the requirements, saying
that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
padding bits -- again implying that sizeof(bool) might be greater
than 1, but forbidding values greater than 1.

Typically in C17 and earlier, and always in C23, _Bool/bool will
have exactly 1 value bit; in the usual case where sizeof(bool) is 1,
the remaining CHAR_BIT-1 bits are padding bits.  Padding bits
do not contribute to the value of an object (so 0 and 1 are the
only possible values), but non-zero padding bits *may or may not*
create trap/non-value representations.  (A gratuitously exotic
implementation might use a representation other than 00000001 for
true, but 00000000 is guaranteed to be a representation for 0/false.)

As far as I can tell, the standard is silent on whether a bool object
with non-zero padding bits is a trap/non-value representation or not.

I wrote a test program to explore how bool is treated.  It uses
memcpy to set the representation of a bool object and then prints
the value of that object.  Source is at the bottom of this message.

If bool has no non-value representations, then the values of the
CHAR_BIT-1 padding bits must be ignored when reading a bool object,
and the value of such an object is determined only by its single
value bit, 0 or 1.  If it does have non-value representations,
then reading such an object has undefined behavior.

With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
when used in a condition and all other representations are treated
as true.  Converting the value of a bool object to another integer
type yields the value of its full 8-bit representation.  If a bool
object holds a representation other than 00000000 or 00000001,
it compares equal to both `true` and `false`.

This implies that bool has 1 value bit and 7 padding bits (as
required by C23) and that it has 2 value representations and 254
non-value representations.  The observed behavior for the non-value
representations is the result of undefined behavior.  (gcc -std=c23
sets __STDC_VERSION__ to 202000L, not 202311L.  The documentation
acknowledges that support for C23 is experimental and incomplete.)

With clang 19.1.4, with "-std=c23", the behavior is consistent
with bool having no non-value representations.  The 7 padding bits
do not contribute to the value of a bool object.  Any bool object
with 0 as the low-order bit is treated as false in a condition and
yields 0 when converted to another integer type.  Any bool object
with 1 as the low-order bit is treated as true, and yields 1 when
converted to another integer type.  I presume the intent is for bool
to have 256 value representations and no non-value representations
(with the padding bits ignored as required), but it's also consistent
with bool having non-value representations and the observed behavior
being undefined.  It's not possible to determine with a test program
whether the output is the result of undefined behavior or not.

As far as I can tell, the question of whether bool has non-value
representations is unspecified but not implementation-defined,
meaning that an implementation is not required to document its
choice.

#include <stdio.h>
#include <string.h>
#include <limits.h>
#if __STDC_VERSION__ < 202311L
#include <stdbool.h>
#endif
int main(void) {
    printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
#if __STDC_VERSION__ < 202311L
    puts("Older than C23, using <stdbool.h>");
#else
    puts("C23 or later, using bool directly");
#endif
    printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
           sizeof (unsigned char), sizeof (bool));

    const bool no = false;
    const bool yes = true;
    unsigned char uc;
    memcpy(&uc, &no, 1);
    printf("false is represented as %d\n", (int)uc);
    memcpy(&uc, &yes, 1);
    printf("true  is represented as %d\n", (int)uc);

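    /* Try every possible byte value as the representation of a bool.
       This assumes sizeof (bool) == 1.  Reading b may have undefined
       behavior if the byte is a non-value representation. */
    for (int i = 0; i <= UCHAR_MAX; i ++) {
        const unsigned char uc = i;
        bool b;
        memcpy(&b, &uc, 1);
        const unsigned char value = b;
        printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
               (unsigned)uc,
               (unsigned)value,
               b ? "truthy" : "falsy ",
               b == false ? "==" : "!=",
               b == true  ? "==" : "!=");
    }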
}

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



