Re: Representation of _Bool

From Tim Rentsch <tr.17687@z991.linuxsc.com>
Newsgroups comp.lang.c
Subject Re: Representation of _Bool
Date 2025-01-18 12:17 -0800
Organization A noiseless patient Spider
Message-ID <86o7035135.fsf@linuxsc.com>
References <87tums515a.fsf@nosuchdomain.example.com> <42fcea7270de500367eceea7ad5530fd@www.novabbs.com> <87ed116s5e.fsf@nosuchdomain.example.com>


Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> learningcpp1@gmail.com (m137) writes:
>
>> Hi Keith,
>>
>> Thank you for posting this.
>
> The message being referred to is one I posted Sun 2021-05-23, with
> Message-ID <87tums515a.fsf@nosuchdomain.example.com>.  It's visible on
> Google Groups at
> <https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.
>
> As others have suggested, please include attribution information when
> posting a followup.  You don't need to quote the entire message,
> but provide at least some context, particularly when the parent
> message is old.
>
> This is an update to that message.
>
>>                             I noticed that the newer drafts of C23
>> (N2912 onwards, I think) have replaced the term "trap representation"
>> with "non-value representation":
>> - **Trap representation** was last defined in [N2731 3.19.4(1)]
>> (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
>> as "an object representation that need not represent a value of the
>> object type."
>> - **Non-value representation** is most recently defined in
>> [N3435 3.26(1)]
>> (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
>> as "an object representation that does not represent a value of the
>> object type."
>>
>> The definition of non-value representation rules out object
>> representations that represent a value of the object type from
>> being non-value representations.  So it seems to be stricter than
>> the definition of trap representation, which does not seem to rule
>> out such object representations from being trap representations.
>> Is this interpretation correct?
>
> I don't believe so.  As far as I can tell, a "non-value
> representation" (C23 and later) is exactly the same thing as a
> "trap representation" (C17 and earlier).  The older term was
> probably considered unclear, since it could imply that a trap is
> required.  In fact, reading an object with a trap/non-value
> representation has undefined behavior, which can include yielding
> the value you might have expected.
>
>> If so, what happens to the 254 trap representations that GCC and
>> Clang reserve for `_Bool`?
>
> I see no evidence in gcc's documentation that gcc treats
> representations other than 0 or 1 as trap/non-value representations.
> I see only two references to "trap representation", one for signed
> integer types (saying that there are no trap representations) and
> one regarding type-punning via unions.  There are no relevant
> references to "padding bits".
>
> I'm less familiar with clang's documentation, but I see no reference
> to "trap representation" or "non-value representation".
>
> We can get some information about this by running a test program.
> See below.
>
>>                      Assuming a width of 1, each of those 254
>> object representations represents a value in `_Bool`'s domain (the
>> half whose value bit is 1 represents the value `true`, while the
>> other half whose value bit is 0 represents the value `false`), so
>> they cannot be thought of as non-value representations (since a
>> non-value representation must be an object representation that
>> **does not** represent a value of the object type).
>
> Reading an object with a non-value representation has undefined
> behavior.  If the observed value happens to be a valid value of
> the object's type, that's still consistent with undefined
> behavior.  *Everything* is consistent with undefined behavior.
>
>> I've been stuck on this for quite some time, so would be grateful
>> for any guidance you could provide.
>
> Editions of the C standard earlier than C23 were not entirely
> clear about the representation of _Bool.  (C90 does not have _Bool
> or bool.  C99 through C17 have _Bool as a keyword, with bool as
> a macro defined in <stdbool.h>.  C23 has bool as a keyword, with
> _Bool as an alternate spelling.)
>
> In C99 and later, _Bool/bool is required to be an unsigned integer
> type large enough to hold the values 0 and 1.  Its size must be at
> least CHAR_BIT bits (which is at least 8).  The *rank* of _Bool is
> less than the rank of all other standard integer types.
>
> The rank implies that the range of values is a subset of the
> range of values of any other unsigned integer type.  The rank does
> *not* imply anything about relative sizes.  unsigned char has a
> higher rank than bool, but bool could have additional padding bits
> making sizeof(bool)>1.  (Probably no implementation does this.)
> unsigned char has no padding bits.
>
> C11 implies that _Bool can have more than one value bit, which
> means it could represent values greater than 1 (but no more than
> 0..UCHAR_MAX).
>
> C23 (I'm using the N3096 draft) tightens the requirements, saying
> that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
> padding bits -- again implying that sizeof(bool) might be greater
> than 1, but forbidding values greater than 1.
>
> Typically in C17 and earlier, and always in C23, _Bool/bool will
> have exactly 1 value bit and CHAR_BIT-1 padding bits.  Padding bits
> do not contribute to the value of an object (so 0 and 1 are the
> only possible values), but non-zero padding bits *may or may not*
> create trap/non-value representations.  (A gratuitously exotic
> implementation might use a representation other than 00000001 for
> true, but 00000000 is guaranteed to be a representation for 0/false.)
>
> As far as I can tell, the standard is silent on whether a bool object
> with non-zero padding bits is a trap/non-value representation or not.

The standard imposes no conditions here other than the rules for
how integer types are represented.  As long as those rules are met,
an implementation is free to make any set of object representations
trap representations (and I assume that hasn't changed for C23,
apart from the new requirement that the width of _Bool be one).
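
One practical consequence can be sketched as follows, assuming the
byte in question comes from outside the program (a file or a network
buffer, say).  Since nothing guarantees that an arbitrary bit pattern
is a value representation of _Bool, converting the byte is safer than
copying it into a _Bool object with memcpy.  The names bool_from_byte
and check_bool_from_byte are hypothetical and do not come from the
thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: turn a byte of unknown origin into a bool by
   conversion, so that no unvetted bit pattern is ever stored in a
   bool object. */
bool bool_from_byte(unsigned char byte)
{
    return byte != 0;   /* the result of != is exactly 0 or 1 */
}

/* Usage sketch: zero maps to false, every nonzero byte maps to true. */
void check_bool_from_byte(void)
{
    for (int i = 0; i <= UCHAR_MAX; i++) {
        bool b = bool_from_byte((unsigned char)i);
        assert(b == (i != 0));
    }
}
```

Because the bool is produced by a comparison rather than by writing
raw bytes, the question of which representations are trap
representations never comes up on this path.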

> I wrote a test program to explore how bool is treated.  It uses
> memcpy to set the representation of a bool object and then prints
> the value of that object.  Source is at the bottom of this message.
>
> If bool has no non-value representations, then the values of the
> CHAR_BIT-1 padding bits must be ignored when reading a bool object,
> and the value of such an object is determined only by its single
> value bit, 0 or 1.  If it does have non-value representations,
> then reading such an object has undefined behavior.
>
> With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
> when used in a condition and all other representations are treated
> as true.  Converting the value of a bool object to another integer
> type yields the value of its full 8-bit representation.  If a bool
> object holds a representation other than 00000000 or 00000001,
> it compares equal to both `true` and `false`.
>
> This implies that bool has 1 value bit and 7 padding bits (as
> required by C23) and that it has 2 value representations and 254
> trap representations.  The observed behavior for the non-value
> representations is the result of undefined behavior.  (gcc -std=c23
> sets __STDC_VERSION__ to 202000L, not 202311L.  The documentation
> acknowledges that support for C23 is experimental and incomplete.)
>
> With clang 19.1.4, with "-std=c23", the behavior is consistent
> with bool having no non-value representations.  The 7 padding bits
> do not contribute to the value of a bool object.  Any bool object
> with 0 as the low-order bit is treated as false in a condition and
> yields 0 when converted to another integer type.  Any bool object
> with 1 as the low-order bit is treated as true, and yields 1 when
> converted to another integer type.  I presume the intent is for bool
> to have 256 value representations and no non-value representations
> (with the padding bits ignored as required), but it's also consistent
> with bool having non-value representations and the observed behavior
> being undefined.  It's not possible to determine with a test program
> whether the output is the result of undefined behavior or not.
>
> As far as I can tell, the question of whether bool has non-value
> representations is unspecified but not implementation-defined,
> meaning that an implementation is not required to document its
> choice.

6.2.6.1 paragraph 2 says objects other than bit-fields are composed
of contiguous sequences of one or more bytes, the number, order,
and encoding of which are either explicitly specified or
implementation-defined.  Which object representations represent
values and which are non-value/trap representations should be
part of the encoding, and hence implementation-defined.


> #include <stdio.h>
> #include <string.h>
> #include <limits.h>
> #if __STDC_VERSION__ < 202311L
> #include <stdbool.h>
> #endif
> int main() {
>     printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
> #if __STDC_VERSION__ < 202311L
>     puts("Older than C23, using <stdbool.h>");
> #else
>     puts("C23 or later, using bool directly");
> #endif
>     printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
>            sizeof (unsigned char), sizeof (bool));
>
>     const bool no = false;
>     const bool yes = true;
>     unsigned char uc;
>     memcpy(&uc, &no, 1);
>     printf("false is represented as %d\n", (int)uc);
>     memcpy(&uc, &yes, 1);
>     printf("true  is represented as %d\n", (int)uc);
>
>     for (int i = 0; i <= UCHAR_MAX; i ++) {
>         const unsigned char uc = i;
>         bool b;
>         memcpy(&b, &uc, 1);
>         const unsigned char value = b;
>         printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
>                (unsigned)uc,
>                value,
>                b ? "truthy" : "falsy ",
>                b == false ? "==" : "!=",
>                b == true  ? "==" : "!=");
>     }
> }

I was surprised to discover that running this program (as C11,
under gcc 8.4.0) with the last 'false' changed to 'no' and the
last 'true' changed to 'yes' gave a different result:  except
for value==0 and value==1, neither of the b comparisons printed
"==".
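
For concreteness, the variant comparison can be isolated in a small
helper.  This is only a sketch:  the name compare_with_objects is
hypothetical, and it assumes sizeof(bool) is 1 and that true is
stored as the byte 1, as on the implementations discussed above.
For any byte other than 0 or 1 the behavior may be undefined, so
results for those bytes can differ between compilers and options.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: store one byte into a bool, then compare it
   with the bool objects no and yes instead of the constants false
   and true. */
void compare_with_objects(unsigned char rep, int *eq_no, int *eq_yes)
{
    const bool no = false;
    const bool yes = true;
    bool b;

    assert(sizeof b == 1);   /* assumption: one-byte bool */
    memcpy(&b, &rep, 1);     /* rep > 1 may be a non-value representation */
    *eq_no  = (b == no);
    *eq_yes = (b == yes);
}
```

Calling it with rep set to 0 or 1 involves only value representations
under the stated assumptions;  any other input is exactly the case
the test program is probing.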
