Newsgroup: comp.lang.c, thread #391294

Bart's Language

Started by: bart <bc@freeuk.com>
First post: 2025-03-17 23:51 +0000
Last post: 2025-03-21 00:33 +0000
Articles: 20 on this page of 62 (13 participants)


Contents

  Bart's Language bart <bc@freeuk.com> - 2025-03-17 23:51 +0000
    Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-18 12:17 +0000
      Re: Bart's Language bart <bc@freeuk.com> - 2025-03-18 13:54 +0000
        Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-18 15:10 +0000
          Re: Bart's Language bart <bc@freeuk.com> - 2025-03-18 15:45 +0000
            Re: Bart's Language David Brown <david.brown@hesbynett.no> - 2025-03-18 17:31 +0100
              int a = a (Was: Bart's Language) gazelle@shell.xmission.com (Kenny McCormack) - 2025-03-18 18:04 +0000
                Re: int a = a (Was: Bart's Language) Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2025-03-18 19:36 +0100
                  Re: int a = a (Was: Bart's Language) Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-18 19:11 +0000
                  Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-19 15:56 +0100
                    Re: int a = a (Was: Bart's Language) scott@slp53.sl.home (Scott Lurndal) - 2025-03-19 16:38 +0000
                      Re: int a = a (Was: Bart's Language) "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-03-19 14:29 -0700
                        Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-20 09:39 +0100
                          Re: int a = a (Was: Bart's Language) bart <bc@freeuk.com> - 2025-03-20 11:59 +0000
                            Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-20 15:46 +0100
                              Re: int a = a (Was: Bart's Language) wij <wyniijj5@gmail.com> - 2025-03-20 23:13 +0800
                      Re: int a = a (Was: Bart's Language) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-20 02:02 -0700
                        Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-20 15:57 +0100
                    Re: int a = a (Was: Bart's Language) Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-19 17:07 +0000
                    Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-19 13:34 -0700
                      Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-20 02:54 -0700
                        Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-20 03:20 -0700
                          Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-20 16:22 +0100
                            Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-20 12:46 -0700
                              Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-21 10:44 +0100
                                Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-21 12:23 -0700
                                  Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-21 21:46 +0100
                                  Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-22 13:59 -0700
                                    Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-22 15:37 -0700
                                      Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-28 09:39 -0700
                          Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-29 13:12 -0700
                            Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-04-29 13:34 -0700
                      Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-20 15:42 +0100
                Re: int a = a (Was: Bart's Language) scott@slp53.sl.home (Scott Lurndal) - 2025-03-18 19:37 +0000
                Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-18 20:51 +0100
                  Re: int a = a (Was: Bart's Language) Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2025-03-18 23:27 +0100
                    Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-19 11:40 +0100
                Re: int a = a (Was: Bart's Language) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-18 23:52 -0700
                  Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-19 01:55 -0700
                    Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-27 13:41 -0700
                  Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-19 11:43 +0100
                  Re: int a = a (Was: Bart's Language) Rosario19 <Ros@invalid.invalid> - 2025-03-19 13:23 +0100
                    Re: int a = a (Was: Bart's Language) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-20 01:32 -0700
            Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-20 22:55 +0000
              Re: Bart's Language Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-20 16:22 -0700
                Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-22 14:37 +0000
                  Re: Bart's Language James Kuyper <jameskuyper@alumni.caltech.edu> - 2025-03-22 11:41 -0400
                    Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-22 16:52 +0000
                      Re: Bart's Language James Kuyper <jameskuyper@alumni.caltech.edu> - 2025-03-22 20:12 -0400
                    By definition... (Was: Bart's Language) gazelle@shell.xmission.com (Kenny McCormack) - 2025-03-23 17:20 +0000
                Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-27 11:53 -0700
                  Re: Bart's Language Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-04-27 14:29 -0700
                    Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-01-06 14:04 -0800
                      Re: Bart's Language Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2026-01-06 17:12 -0800
                        Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-03-06 09:04 -0800
          Re: Bart's Language bart <bc@freeuk.com> - 2025-03-18 22:19 +0000
            Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-20 22:38 +0000
              Re: Bart's Language Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-20 23:45 +0000
                Re: Bart's Language bart <bc@freeuk.com> - 2025-03-21 00:56 +0000
                  Re: Bart's Language Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-21 17:47 +0000
                    Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-22 07:12 -0700
              Re: Bart's Language bart <bc@freeuk.com> - 2025-03-21 00:33 +0000


#391390 — Re: int a = a

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2025-03-20 02:54 -0700
Subject: Re: int a = a
Message-ID: <86zfhgni2a.fsf@linuxsc.com>
In reply to: #391374

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

[how to indicate a variable not being used is okay]
[some quoted text rearranged]

> Unless I'm missing something, `(void)x` also has undefined behavior
> if x is uninitialized,

Right.  Using (void)&x is better.

> though it's very likely to do nothing in practice.

Unless x is volatile-qualified, in which case there must be an access
to x in the generated code.

> The behavior [of int a = a;] is undefined.  In C11 and later
> (N1570 6.3.2.1p2):
>
>     Except when [...] an lvalue that does not have array type is
>     converted to the value stored in the designated object (and is
>     no longer an lvalue); this is called lvalue conversion.
>     [...]
>     If the lvalue designates an object of automatic storage
>     duration that could have been declared with the register
>     storage class (never had its address taken), and that object
>     is uninitialized (not declared with an initializer and no
>     assignment to it has been performed prior to use), the
>     behavior is undefined.

> Long digression follows.
>
> The "could have been declared with the register storage class"
> seems quite odd.  And in fact it is quite odd.

I don't have the same reaction.  The point of this phrase is that
undefined behavior occurs only for variables that don't have
their address taken.  The phrase used describes that nicely.
Any questions related to "registerness" can be ignored, because
'register' in C really has nothing to do with hardware registers,
despite the name.
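
A minimal sketch of that point, not taken from the thread (the function
name sum_to is hypothetical).  The main observable effect of 'register'
is that the object's address cannot be taken; whether the object ever
lives in a hardware register is left to the compiler.

```c
#include <assert.h>

/* Sums 1..n.  Hypothetical helper used only for illustration. */
int sum_to(int n)
{
    assert(n >= 0);

    /* 'register' is only a hint; the compiler is free to ignore it. */
    register int total = 0;

    /* int *p = &total; */  /* constraint violation if uncommented: the
                               address of a register object cannot be taken */

    for (int i = 1; i <= n; i++)
        total += i;

    return total;
}
```

Because the address of total is never taken, it is exactly the kind of
object that "could have been declared with the register storage class"
in the sense of the quoted paragraph, with or without the keyword.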

> It's tempting to assume that `int n = n;` did not have undefined
> behavior prior to C11, or that accessing an automatic object whose
> address has not been taken does not have undefined behavior even
> in C11 or later, but it's not that simple.
>
> In C90, the non-normative Annex G (renamed to Annex J in later
> editions) says:
>
>     The behavior in the following circumstances is undefined:
>     [...]
>     - The value of an uninitialized object that has automatic storage
>       duration is used before a value is assigned (6.5.7).
>
> 6.5.7 discusses initialization, and says that "If an object that
> has automatic storage duration is not initialized explicitly, its
> value is indeterminate", and C90's definition of "undefined behavior"
> explicitly refers to use of indeterminately valued objects, though
> it's not 100% clear that using an indeterminate value *always*
> has undefined behavior.
>
> So in C90, `int n = n;` explicitly had undefined behavior, even if
> all possible bit representations for an object of type int correspond
> to valid values (C90 didn't mention "trap representations").
>
> C99 added a definition for "indeterminate value":  "either an
> unspecified value or a trap representation", and drops the mention
> of indeterminate values in the definition of "undefined behavior".
> It dropped the reference to uninitialized objects in Annex G/J.
> I believe that in C99, `int n = n;` is well defined *if* int
> has no trap representations, or if the representation stored in
> the memory occupied by n happens not to be a trap representation.
> If int has trap representations, and that memory happens to contain
> such a representation, the behavior is undefined.
>
> I found a discussion in comp.std.c from 2023, subject "Does reading
> an uninitialized object have undefined behavior?".
>
> The discontinued IA-64/Itanium processor had something called
> "NaT", "Not a Thing".  NaT representations exist only in CPU
> registers, not in memory.  (Imagine an extra bit for each register
> indicating whether the register contains a "thing".)  A NaT allows
> for representations that act like C trap representations (called
> non-value representations in C23) even for types with no trap
> representations (for example where all 2**N possible representations
> correspond to valid values) -- but again, only in CPU registers.
>
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>
> So the "could have been declared with the register storage class"
> wording was added in C11 specifically to cater to the IA64.  This
> change would have been superfluous in C90, where the behavior was
> undefined anyway, but is a semantically significant change between
> C99 and C11.  (If some future CPU has something like NaT that can
> be stored in memory, the wording might need to be updated yet again.)
>
> My takeaway is that if it requires this much research to determine
> whether accessing the value of an uninitialized object has undefined
> behavior (in which circumstances and which edition of the standard),
> I'll just avoid doing so altogether.  I'll initialize objects
> when they're defined whenever practical.  If it's not practical
> for some reason, I won't initialize it with some dummy value;  I'll
> leave it uninitialized so the compiler has a chance to warn me if
> I accidentally use it before assigning a value to it.

I think you are overthinking the question.  In cases where it's
important to give an initial value to a variable, and can be done
so at the point of its declaration, use an initializer;  otherwise
don't.  We don't have to read several different C standards, or
even only one, to reach that conclusion.  If someone wants to know
exactly which border cases are safe and which cases are not, then
reading the relevant version(s) of the C standard is needed, but
in most situations it isn't.  It's important for the C standard to
be precise about what it prescribes, but as far as initialization
goes it's easy to write code that doesn't need that level of
detail.  Compiler writers need to know such things;  in the
particular case of when and where to initialize, most developers
don't.
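
The rule of thumb above can be sketched as follows.  This is only an
illustration; the function name capped_sum and its parameters are
hypothetical and do not come from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums an array and caps the result at 'limit'. */
int capped_sum(const int *values, size_t count, int limit)
{
    assert(values != NULL || count == 0);

    int total = 0;          /* a meaningful initial value exists: use it */
    for (size_t i = 0; i < count; i++)
        total += values[i];

    int result;             /* no meaningful value yet: no dummy initializer */
    if (total > limit)
        result = limit;
    else
        result = total;     /* every path assigns before the read below */

    return result;
}
```

If the else branch were dropped by mistake, compilers such as gcc and
clang can often diagnose the read of result when warnings are enabled
(for example with -Wall); a dummy initializer would hide that mistake.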


#391391 — Re: int a = a

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2025-03-20 03:20 -0700
Subject: Re: int a = a
Message-ID: <87cyect356.fsf@nosuchdomain.example.com>
In reply to: #391390

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> [how to indicate a variable not being used is okay]
> [some quoted text rearranged]
>
>> Unless I'm missing something, `(void)x` also has undefined behavior
>> if x is uninitialized,
>
> Right.  Using (void)&x is better.

I'm not convinced -- and it's far less idiomatic.  I don't think
I've ever seen (void)&x in code, and if I did I'd wonder what the
author's intent was.

(void)x is a common idiom for hinting to the compiler that it
doesn't need to complain about x being unused.  (void)&x doesn't
tell the compiler that the *value* of x is used.  I'm not sure how
much difference that makes.

Even with (void)x and/or (void)&x, a compiler *could* still warn
about x being unused, or about the programmer's use of an ugly font.
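
For reference, a minimal sketch of the idioms under discussion.  All
names here (on_event, status_reg, touch_status, scratch_demo) are
hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical callback: the 'context' parameter is deliberately ignored. */
int on_event(int code, void *context)
{
    (void)context;      /* common idiom; context has a value, so reading it is fine */
    return code * 2;
}

/* Hypothetical stand-in for a device register. */
static volatile int status_reg = 7;

int touch_status(void)
{
    (void)status_reg;   /* volatile: the read is an access and must be performed */
    return 0;
}

int scratch_demo(void)
{
    int scratch;        /* intentionally never given a value */
    (void)&scratch;     /* only the address is formed; the value is never read */
    return 0;
}
```

In scratch_demo the expression (void)scratch would read an
uninitialized object, which is the case being debated; (void)&scratch
avoids that because no lvalue conversion takes place.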

>> though it's very likely to do nothing in practice.
>
> Unless x is volatile-qualified, in which case there must be an access
> to x in the generated code.
>
>> The behavior [of int a = a;] is undefined.  In C11 and later
>> (N1570 6.3.2.1p2):
>>
>>     Except when [...] an lvalue that does not have array type is
>>     converted to the value stored in the designated object (and is
>>     no longer an lvalue); this is called lvalue conversion.
>>     [...]
>>     If the lvalue designates an object of automatic storage
>>     duration that could have been declared with the register
>>     storage class (never had its address taken), and that object
>>     is uninitialized (not declared with an initializer and no
>>     assignment to it has been performed prior to use), the
>>     behavior is undefined.
>
>> Long digression follows.
>>
>> The "could have been declared with the register storage class"
>> seems quite odd.  And in fact it is quite odd.
>
> I don't have the same reaction.  The point of this phrase is that
> undefined behavior occurs only for variables that don't have
> their address taken.  The phrase used describes that nicely.
> Any questions related to "registerness" can be ignored, because
> 'register' in C really has nothing to do with hardware registers,
> despite the name.

DR 338 is explicitly motivated by an IA-64 feature that applies only to
CPU registers.  An object whose address is taken can't be stored (only)
in a register, so it can't have a NaT representation.

The phrase used is "could have been declared with register storage class
(never had its address taken)".  Surely "never had its address taken"
would have been clear enough if CPU registers weren't a big part of the
motivation.

[SNIP]

>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>> 
>> So the "could have been declared with the register storage class"
>> wording was added in C11 specifically to cater to the IA64.  This
>> change would have been superfluous in C90, where the behavior was
>> undefined anyway, but is a semantically significant change between
>> C99 and C11.  (If some future CPU has something like NaT that can
>> be stored in memory, the wording might need to be updated yet again.)
>>
>> My takeaway is that if it requires this much research to determine
>> whether accessing the value of an uninitialized object has undefined
>> behavior (in which circumstances and which edition of the standard),
>> I'll just avoid doing so altogether.  I'll initialize objects
>> when they're defined whenever practical.  If it's not practical
>> for some reason, I won't initialize it with some dummy value; I'll
>> leave it uninitialized so the compiler has a chance to warn me if
>> I accidentally use it before assigning a value to it.
>
> I think you are overthinking the question.  In cases where it's
> important to give an initial value to a variable, and can be done
> so at the point of its declaration, use an initializer;  otherwise
> don't.

My overthinking led me to essentially the same conclusion, so I don't
see the problem.  And I also found it to be an interesting exploration
of how certain aspects of the C standard have evolved over time.

>         We don't have to read several different C standards, or
> even only one, to reach that conclusion.

No, but we do have to read one or more C standards to counter an
argument that `int a = a;` is well defined.

>                                           If someone wants to know
> exactly which border cases are safe and which cases are not, then
> reading the relevant version(s) of the C standard is needed, but
> in most situations it isn't.  It's important for the C standard to
> be precise about what it prescribes, but as far as initialization
> goes it's easy to write code that doesn't need that level of
> detail.  Compiler writers need to know such things;  in the
> particular case of when and where to initialize, most developers
> don't.

Most developers don't read this newsgroup.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */


#391414 — Re: int a = a

From: David Brown <david.brown@hesbynett.no>
Date: 2025-03-20 16:22 +0100
Subject: Re: int a = a
Message-ID: <vrhbsf$3e7sn$4@dont-email.me>
In reply to: #391391

On 20/03/2025 11:20, Keith Thompson wrote:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

>>>
>>> The "could have been declared with the register storage class"
>>> seems quite odd.  And in fact it is quite odd.
>>
>> I don't have the same reaction.  The point of this phrase is that
>> undefined behavior occurs only for variables that don't have
>> their address taken.  The phrase used describes that nicely.
>> Any questions related to "registerness" can be ignored, because
>> 'register' in C really has nothing to do with hardware registers,
>> despite the name.
> 
> DR 338 is explicitly motivated by an IA-64 feature that applies only to
> CPU registers.  An object whose address is taken can't be stored (only)
> in a register, so it can't have a NaT representation.
> 
> The phrase used is "could have been declared with register storage class
> (never had its address taken)".  Surely "never had its address taken"
> would have been clear enough if CPU registers weren't a big part of the
> motivation.
> 

I too think the phrasing is a bit odd.

Just because a variable's address is taken, does not mean it cannot be 
put in a cpu register by the compiler.  If the variable is not accessed 
in a way that actually requires putting it in memory, then the compiler 
can put it in a cpu register (or otherwise optimise it).  So simply 
taking the address of a variable on IA-64 does not mean it cannot be in 
a register, and thus does not necessarily mean it cannot be NaT.  Taking 
the address of a variable means the variable cannot be declared 
"register", but it does not mean it cannot be /in/ a register.

It seems very strange to me that this is UB:

	int foo1(void) {
		int x;

		return x;
	}

while this is not:

	int foo2(void) {
		int x;

		int * p = &x;

		return x;
	}

(Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler in 
its list.)

It strikes me that it would have been far simpler for the standard 
simply to say that using the value of an uninitialised and unassigned 
variable is undefined behaviour.


#391438 — Re: int a = a

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2025-03-20 12:46 -0700
Subject: Re: int a = a
Message-ID: <87msdfscxj.fsf@nosuchdomain.example.com>
In reply to: #391414

David Brown <david.brown@hesbynett.no> writes:
> On 20/03/2025 11:20, Keith Thompson wrote:
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>> The "could have been declared with the register storage class"
>>>> seems quite odd.  And in fact it is quite odd.
>>>
>>> I don't have the same reaction.  The point of this phrase is that
>>> undefined behavior occurs only for variables that don't have
>>> their address taken.  The phrase used describes that nicely.
>>> Any questions related to "registerness" can be ignored, because
>>> 'register' in C really has nothing to do with hardware registers,
>>> despite the name.
>> DR 338 is explicitly motivated by an IA-64 feature that applies only to
>> CPU registers.  An object whose address is taken can't be stored (only)
>> in a register, so it can't have a NaT representation.
>>
>> The phrase used is "could have been declared with register storage class
>> (never had its address taken)".  Surely "never had its address taken"
>> would have been clear enough if CPU registers weren't a big part of the
>> motivation.
>
> I too think the phrasing is a bit odd.
>
> Just because a variable's address is taken, does not mean it cannot be
> put in a cpu register by the compiler.  If the variable is not
> accessed in a way that actually requires putting it in memory, then
> the compiler can put it in a cpu register (or otherwise optimise it).
> So simply taking the address of a variable on IA-64 does not mean it
> cannot be in a register, and thus does not necessarily mean it cannot
> be NaT.  Taking the address of a variable means the variable cannot be
> declared "register", but it does not mean it cannot be /in/ a
> register.

Sure, any variable that's stored in memory can be mirrored by holding
its value in a register.

    int n = 42; // Assume n is assigned a memory address
    printf("n+1=%d n+2=%d\n", n+1, n+2);

A compiler could plausibly store the value of n in a register before
computing n+1, and then reuse the register value to compute n+2.

My understanding is that IA-64 NaT (Not a Thing) representations
exist only for registers, and the NaT bit should be cleared when
a value is stored in the register.

The odd wording in the standard allows an IA-64 C compiler to
take advantage of NaT representations for their intended purpose.
It might impose some minor constraints on what machine code can be
generated, but *most* of the cases where a NaT could be accessed
are undefined behavior in C.

> It seems very strange to me that this is UB:
>
> 	int foo1(void) {
> 		int x;
>
> 		return x;
> 	}
>
> while this is not:
>
> 	int foo2(void) {
> 		int x;
>
> 		int * p = &x;
>
> 		return x;
> 	}
>
> (Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler
> in its list.)
>
> It strikes me that it would have been far simpler for the standard
> simply to say that using the value of an uninitialised and unassigned
> variable is undefined behaviour.

In C90, it was.  C99 changed that, making the behavior defined if the
representation is not a trap representation.

For C99, a conforming IA-64 C compiler would have had to go out of its
way to avoid accessing NaT representations.  For example, if you wrote

    {
        int n;
        n;
    }

the most straightforward IA-64 code would store n in a register and
not initialize it, resulting in a trap when the register is read.
A compiler might have to generate code to store an arbitrary value
in the register to avoid the trap.

I'm undecided on whether reading the value of an uninitialized
automatic object *should* be undefined behavior, but given that
it isn't, the C11 committee made the smallest possible change to
cater to IA-64 semantics.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */


#391471 — Re: int a = a

From: David Brown <david.brown@hesbynett.no>
Date: 2025-03-21 10:44 +0100
Subject: Re: int a = a
Message-ID: <vrjcd5$18m5n$1@dont-email.me>
In reply to: #391438

On 20/03/2025 20:46, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 20/03/2025 11:20, Keith Thompson wrote:
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>>> The "could have been declared with the register storage class"
>>>>> seems quite odd.  And in fact it is quite odd.
>>>>
>>>> I don't have the same reaction.  The point of this phrase is that
>>>> undefined behavior occurs only for variables that don't have
>>>> their address taken.  The phrase used describes that nicely.
>>>> Any questions related to "registerness" can be ignored, because
>>>> 'register' in C really has nothing to do with hardware registers,
>>>> despite the name.
>>> DR 338 is explicitly motivated by an IA-64 feature that applies only to
>>> CPU registers.  An object whose address is taken can't be stored (only)
>>> in a register, so it can't have a NaT representation.
>>>
>>> The phrase used is "could have been declared with register storage class
>>> (never had its address taken)".  Surely "never had its address taken"
>>> would have been clear enough if CPU registers weren't a big part of the
>>> motivation.
>>
>> I too think the phrasing is a bit odd.
>>
>> Just because a variable's address is taken, does not mean it cannot be
>> put in a cpu register by the compiler.  If the variable is not
>> accessed in a way that actually requires putting it in memory, then
>> the compiler can put it in a cpu register (or otherwise optimise it).
>> So simply taking the address of a variable on IA-64 does not mean it
>> cannot be in a register, and thus does not necessarily mean it cannot
>> be NaT.  Taking the address of a variable means the variable cannot be
>> declared "register", but it does not mean it cannot be /in/ a
>> register.
> 
> Sure, any variable that's stored in memory can be mirrored by holding
> its value in a register.
> 
>      int n = 42; // Assume n is assigned a memory address
>      printf("n+1=%d n+2=%d\n", n+1, n+2);
> 
> A compiler could plausibly store the value of n in a register before
> computing n+1, and then reuse the register value to compute n+2.

Yes, of course.  But there is also no requirement that variables be in 
memory at all, or that their location stay consistent.  "Assume n is 
assigned a memory address" is a completely unwarranted assumption for 
almost all local variables.  It is only if the address is taken, and 
used in some way that is beyond the optimiser, that the variable 
actually has to go in a fixed place in memory.  Otherwise optimisers can 
and do keep data in registers, or move them in and out of registers and 
different stack slots according to convenience for efficient code.


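#include <stdint.h>
#include <string.h>

uint32_t float_to_uint(float f) {
     uint32_t u;
     memcpy(&u, &f, sizeof u);  // assumes sizeof(float) == sizeof(uint32_t)
     return u;
}

gcc (x86-64, with optimisation enabled) compiles that to: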

float_to_uint:
         movd    eax, xmm0
         ret

So even though the addresses of the variable "u" and the parameter "f" 
are taken, and converted to char pointers, and passed to a function with 
external linkage, nothing is actually put in memory at all.

Thus the standard's wording, which treats the legality of the "register" 
storage-class specifier as if it corresponded to cpu register usage, is 
at best wildly out of date.

(And there are some architectures where the cpu registers are directly 
mapped to memory, and can be accessed as memory locations or registers.)
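
A slightly more defensive sketch of the same memcpy technique, with the
size assumption checked at compile time.  The names float_bits and
bits_to_float are hypothetical, and the bit patterns mentioned in the
note below assume that float uses the IEEE 754 binary32 format.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this platform. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "float must be 32 bits wide for this sketch");

/* Returns the object representation of f as a 32-bit unsigned integer. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* well defined, unlike *(uint32_t *)&f */
    return u;
}

/* Inverse of float_bits(). */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

Under the IEEE 754 assumption, float_bits(1.0f) yields 0x3F800000.  Both
functions take the address of a local object, yet an optimising compiler
is free to keep everything in registers.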

> 
> My understanding is that IA-64 NaT (Not a Thing) representations
> exist only for registers, and the NaT bit should be cleared when
> a value is stored in the register.
> 
> The odd wording in the standard allows an IA-64 C compiler to
> take advantage of NaT representations for their intended purpose.
> It might impose some minor constraints on what machine code can be
> generated, but *most* of the cases where a NaT could be accessed
> are undefined behavior in C.
> 

I see that, but I believe it would be much simpler and clearer if 
attempting to read an uninitialised and unassigned local variable were 
undefined behaviour in every case.

Alternatively, it could have said that the value is unspecified in every 
case.  Then on the IA-64, the compiler would have to ensure that 
registers do not have their NaT bit set even if they are not initialised 
- this would not be a difficult task.  Enabling use of the NaT bit for 
detection of bugs could then be a compiler option if implementations 
wanted to provide that feature.

>> It seems very strange to me that this is UB:
>>
>> 	int foo1(void) {
>> 		int x;
>>
>> 		return x;
>> 	}
>>
>> while this is not:
>>
>> 	int foo2(void) {
>> 		int x;
>>
>> 		int * p = &x;
>>
>> 		return x;
>> 	}
>>
>> (Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler
>> in its list.)
>>
>> It strikes me that it would have been far simpler for the standard
>> simply to say that using the value of an uninitialised and unassigned
>> variable is undefined behaviour.
> 
> In C90, it was.  C99 changed that, making the behavior defined if the
> representation is not a trap representation.
> 
> For C99, a conforming IA-64 C compiler would have had to go out of its
> way to avoid accessing NaT representations.  For example, if you wrote
> 
>      {
>          int n;
>          n;
>      }
> 
> the most straightforward IA-64 code would store n in a register and
> not initialize it, resulting in a trap when the register is read.
> A compiler might have to generate code to store an arbitrary value
> in the register to avoid the trap.
> 
> I'm undecided on whether reading the value of an uninitialized
> automatic object *should* be undefined behavior, but given that
> it isn't, the C11 committee made the smallest possible change to
> cater to IA-64 semantics.
> 

IMHO, having it as UB is the best option, with unspecified behaviour as 
a second best option.  The jumble that C11 has is not necessary for the 
IA-64, and clearly worse than the other two choices for architectures 
that don't have a NaT equivalent.


#391480 — Re: int a = a

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2025-03-21 12:23 -0700
Subject: Re: int a = a
Message-ID: <87pliaqjb9.fsf@nosuchdomain.example.com>
In reply to: #391471

David Brown <david.brown@hesbynett.no> writes:
> On 20/03/2025 20:46, Keith Thompson wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>>> On 20/03/2025 11:20, Keith Thompson wrote:
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>>>> The "could have been declared with the register storage class"
>>>>>> seems quite odd.  And in fact it is quite odd.
>>>>>
>>>>> I don't have the same reaction.  The point of this phrase is that
>>>>> undefined behavior occurs only for variables that don't have
>>>>> their address taken.  The phrase used describes that nicely.
>>>>> Any questions related to "registerness" can be ignored, because
>>>>> 'register' in C really has nothing to do with hardware registers,
>>>>> despite the name.
>>>> DR 338 is explicitly motivated by an IA-64 feature that applies only to
>>>> CPU registers.  An object whose address is taken can't be stored (only)
>>>> in a register, so it can't have a NaT representation.
>>>>
>>>> The phrase used is "could have been declared with register storage class
>>>> (never had its address taken)".  Surely "never had its address taken"
>>>> would have been clear enough if CPU registers weren't a big part of the
>>>> motivation.
>>>
>>> I too think the phrasing is a bit odd.
>>>
>>> Just because a variable's address is taken, does not mean it cannot be
>>> put in a cpu register by the compiler.  If the variable is not
>>> accessed in a way that actually requires putting it in memory, then
>>> the compiler can put it in a cpu register (or otherwise optimise it).
>>> So simply taking the address of a variable on IA-64 does not mean it
>>> cannot be in a register, and thus does not necessarily mean it cannot
>>> be NaT.  Taking the address of a variable means the variable cannot be
>>> declared "register", but it does not mean it cannot be /in/ a
>>> register.
>> Sure, any variable that's stored in memory can be mirrored by
>> holding its value in a register.
>>      int n = 42; // Assume n is assigned a memory address
>>      printf("n+1=%d n+2=%d\n", n+1, n+2);
>> A compiler could plausibly store the value of n in a register before
>> computing n+1, and then reuse the register value to compute n+2.
>
> Yes, of course.  But there is also no necessity for variables to be in
> memory at all, or that there is any consistency there.  "Assume n is
> assigned a memory address" is a completely unwarranted assumption for
> almost all local variables.

I think you misunderstood what I meant by "assume".  Certainly n could
be assigned a memory address.  You can read it as "*IF* n is assigned a
memory address, then ...".  I was asserting that it has a memory address
for purposes of the discussion, not presuming that it must actually have
one.

>                              It is only if the address is taken, and
> used in some way that is beyond the optimiser, that the variable
> actually has to go in a fixed place in memory.  Otherwise optimisers
> can and do keep data in registers, or move them in and out of
> registers and different stack slots according to convenience for
> efficient code.
>
>
> uint32_t float_to_uint(float f) {
>     uint32_t u;
>     memcpy(&u, &f, 4);
>     return u;
> }
>
> gcc compiles that to :
>
> float_to_uint:
>         movd    eax, xmm0
>         ret
>
> So even though the addresses of the variable "u" and the parameter "f"
> are taken, and converted to char pointers, and passed to a function
> with external linkage, nothing is actually put in memory at all.
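
For anyone who wants to try the quoted function, here is a self-contained
sketch of it.  The includes, the use of sizeof in place of the literal 4,
and the static assertion are additions for illustration, and the bit
patterns mentioned below assume float is IEEE 754 binary32.  What code a
given compiler emits for it will depend on the compiler and its options.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size (true for IEEE 754
   binary32 on the usual targets); refuse to compile otherwise. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float must be 32 bits wide for this sketch");

/* Copy the object representation of f into an integer.  The addresses of
   both u and f are taken, yet an optimiser is free to keep everything in
   registers because the effect of memcpy is fully known to it. */
uint32_t float_to_uint(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);  /* u is fully written before it is read */
    return u;
}
```

On an IEEE 754 implementation, float_to_uint(1.0f) yields 0x3F800000.
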
>
> Thus the standard's wording as though the legality of using the
> "register" storage-class specifier corresponds to cpu register usage
> is, at best, wildly out of date.
>
> (And there are some architectures where the cpu registers are directly
> mapped to memory, and can be accessed as memory locations or
> registers.)
>
>> My understanding is that IA-64 NaT (Not a Thing) representations
>> exist only for registers, and the NaT bit should be cleared when
>> a value is stored in the register.
>> The odd wording in the standard allows an IA-64 C compiler to
>> take advantage of NaT representations for their intended purpose.
>> It might impose some minor constraints on what machine code can be
>> generated, but *most* of the cases where a NaT could be accessed
>> are undefined behavior in C.
>
> I see that, but I believe it would be much simpler and clearer if
> attempting to read an uninitialised and unassigned local variable were
> undefined behaviour in every case.

I probably agree (I haven't given it all that much thought), but the
committee made a specific decision between C90 and C99 to say that 
reading an uninitialized automatic object is *not* undefined behavior.
I don't know why they did that (though, all else being equal, reducing
the number of instances of undefined behavior is a good thing), but
reversing that decision for this one issue is not something they decided
to do.

> Alternatively, it could have said that the value is unspecified in
> every case.  Then on the IA-64, the compiler would have to ensure that
> registers do not have their NaT bit set even if they are not
> initialised - this would not be a difficult task.  Enabling use of the
> NaT bit for detection of bugs could then be a compiler option if
> implementations wanted to provide that feature.

The whole point of the NaT bit is to detect accesses to uninitialized
values.  Requiring the compiler to arbitrarily clear that bit
doesn't strike me as a good idea.

I dislike the way that wording was added to the standard specifically
to cater to one specific CPU (which happens to have been discontinued
later).  I would have been happier with a more general solution.
I think that making accessing the value of an uninitialized automatic
object UB would have been much cleaner, and it would have allowed for
sensible use of NaT by IA-64 compilers.  But without knowing *why*
the committee removed that UB between C90 and C99, I'm hesitant to
say it was a mistake.

Meanwhile, I will in effect assume that accessing uninitialized objects
is UB, i.e., I'll carefully avoid doing so.

[...]

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#391481 — Re: int a = a

From: David Brown <david.brown@hesbynett.no>
Date: 2025-03-21 21:46 +0100
Subject: Re: int a = a
Message-ID: <vrkj78$29u2c$1@dont-email.me>
In reply to: #391480
On 21/03/2025 20:23, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:

>> I see that, but I believe it would be much simpler and clearer if
>> attempting to read an uninitialised and unassigned local variable were
>> undefined behaviour in every case.
> 
> I probably agree (I haven't given it all that much thought), but the
> committee made a specific decision between C90 and C99 to say that
> reading an uninitialized automatic object is *not* undefined behavior.
> I don't know why they did that (though, all else being equal, reducing
> the number of instances of undefined behavior is a good thing), but
> reversing that decision for this one issue is not something they decided
> to do.

Certainly the C committee have to think harder, and consider more 
possibilities than most mere C programmers are likely to do - and they 
don't like to make something "undefined" if it was defined (to at least 
some extent) previously.

I can agree that it is good to reduce the number of UB's, all else being 
equal - but all else is very seldom equal.  To me, it is preferable to 
say clearly and explicitly "this is undefined behaviour" than to leave 
the C programmer to combine several parts of the standard to figure out 
that the construct might do something defined but unspecified, or might 
do something bad (a trap), or might be defined or undefined depending on 
other mostly unrelated code.

I am a fan of clear undefined behaviour when there is no good definition 
of what the behaviour should be - I'd rather have UB than badly defined 
behaviour.  But I strongly prefer it to be explicit and clear.

> 
>> Alternatively, it could have said that the value is unspecified in
>> every case.  Then on the IA-64, the compiler would have to ensure that
>> registers do not have their NaT bit set even if they are not
>> initialised - this would not be a difficult task.  Enabling use of the
>> NaT bit for detection of bugs could then be a compiler option if
>> implementations wanted to provide that feature.
> 
> The whole point of the NaT bit is to detect accesses to uninitialized
> values.  Requiring the compiler to arbitrarily clear that bit
> doesn't strike me as a good idea.
> 

I think it would be fine to do that - as long as tools also provide 
modes that don't clear the bit so that you can use the feature for 
debugging or run-time checks.  If the behaviour here was to make the 
variable an unspecified value, then in fully compliant modes the 
compiler would have to clear the NaT bit, so the compiler mode making 
use of the NaT bit would be marginally non-compliant.  But I see no 
problem with that - developers can happily use slightly non-compliant 
mode in order to get more features (language extensions, faster 
execution, better debugging - whatever suits the user and the compiler 
implementer).

Still, leaving it undefined behaviour would be even better, because then 
compilers could have flags for using the NaT bit, or clearing the 
variable to 0, or giving a compile-time error - whatever they did it 
would still be compliant.

> I dislike the way that wording was added to the standard specifically
> to cater to one specific CPU (which happens to have been discontinued
> later).  I would have been happier with a more general solution.
> I think that making accessing the value of an uninitialized automatic
> object UB would have been much cleaner, and it would have allowed for
> sensible use of NaT by IA-64 compilers.  But without knowing *why*
> the committee removed that UB between C90 and C99, I'm hesitant to
> say it was a mistake.
> 
> Meanwhile, I will in effect assume that accessing uninitialized objects
> is UB, i.e., I'll carefully avoid doing so.
> 

That, I think, is the best way to handle this.



#391524 — Re: int a = a

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2025-03-22 13:59 -0700
Subject: Re: int a = a
Message-ID: <86a59clr3a.fsf@linuxsc.com>
In reply to: #391480
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> David Brown <david.brown@hesbynett.no> writes:
>
>> [...]I believe it would be much simpler and clearer if attempting
>> to read an uninitialised and unassigned local variable were
>> undefined behaviour in every case.
>
> I probably agree (I haven't given it all that much thought), but
> the committee made a specific decision between C90 and C99 to say
> that reading an uninitialized automatic object is *not* undefined
> behavior.  I don't know why they did that (though, all else
> being equal, reducing the number of instances of undefined
> behavior is a good thing), but reversing that decision for this
> one issue is not something they decided to do.

Your description of what was done is wrong.  It is still the case in
C99 that trying to access an uninitialized object is undefined
behavior, at least potentially, except for accesses using a type
that either is a character type or has no trap representations (and
all types other than unsigned char may have trap representations,
depending on the implementation).  A statement like

    int a = a;

may still be given a warning as potential undefined behavior, even
in C99.

>> Alternatively, it could have said that the value is unspecified
>> in every case.  Then on the IA-64, the compiler would have to
>> ensure that registers do not have their NaT bit set even if they
>> are not initialised - this would not be a difficult task.
>> Enabling use of the NaT bit for detection of bugs could then be a
>> compiler option if implementations wanted to provide that
>> feature.
>
> The whole point of the NaT bit is to detect accesses to
> uninitialized values.  Requiring the compiler to arbitrarily clear
> that bit doesn't strike me as a good idea.
>
> I dislike the way that wording was added to the standard
> specifically to cater to one specific CPU (which happens to have
> been discontinued later).  I would have been happier with a more
> general solution.  I think that making accessing the value of an
> uninitialized automatic object UB would have been much cleaner,
> and it would have allowed for sensible use of NaT by IA-64
> compilers.

I think you may be missing a key point.  In both C99 and C11,
accessing an uninitialized object using a character type is defined
(albeit unspecified) behavior.  But in C11, because of the changes
in 6.3.2.1, even character types are subject to undefined behavior
when uninitialized objects are accessed (provided of course that
they don't fall under an exception because their address was taken).
The C11 rule does more than allowing undefined behavior just for
non-character types;  it extends the possibility of undefined
behavior to character types as well.

> But without knowing *why* the committee removed that UB between
> C90 and C99, I'm hesitant to say it was a mistake.

The mistake is thinking that UB for uninitialized access was
removed in C99.  It wasn't.  Narrowed, yes;  removed, no.  And
later C11 simply widened it back a bit, recovering some of the
territory that had been taken away in C99.  The Itanium may
have been what prompted the change, but the change that was made
is one well worth making.



#391529 — Re: int a = a

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2025-03-22 15:37 -0700
Subject: Re: int a = a
Message-ID: <87ldsw8zer.fsf@nosuchdomain.example.com>
In reply to: #391524
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> David Brown <david.brown@hesbynett.no> writes:
>>> [...]I believe it would be much simpler and clearer if attempting
>>> to read an uninitialised and unassigned local variable were
>>> undefined behaviour in every case.
>>
>> I probably agree (I haven't given it all that much thought), but
>> the committee made a specific decision between C90 and C99 to say
>> that reading an uninitialized automatic object is *not* undefined
>> behavior.  I don't know why they did that (though, all else
>> being equal, reducing the number of instances of undefined
>> behavior is a good thing), but reversing that decision for this
>> one issue is not something they decided to do.
>
> Your description of what was done is wrong.  It is still the case in
> C99 that trying to access an uninitialized object is undefined
> behavior, at least potentially, except for accesses using a type
> that either is a character type or has no trap representations (and
> all types other than unsigned char may have trap representations,
> depending on the implementation).  A statement like
>
>     int a = a;
>
> may still be given a warning as potential undefined behavior, even
> in C99.

I had already mentioned that distinction earlier in the thread.

[...]

> The mistake is thinking that UB for uninitialized access was
> removed in C99.  It wasn't.  Narrowed, yes;  removed, no.

Acknowledged.

[...]

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#392983 — Re: int a = a

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2025-04-28 09:39 -0700
Subject: Re: int a = a
Message-ID: <86frhs8ckt.fsf@linuxsc.com>
In reply to: #391529
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>
>>> David Brown <david.brown@hesbynett.no> writes:
>>>
>>>> [...]I believe it would be much simpler and clearer if attempting
>>>> to read an uninitialised and unassigned local variable were
>>>> undefined behaviour in every case.
>>>
>>> I probably agree (I haven't given it all that much thought), but
>>> the committee made a specific decision between C90 and C99 to say
>>> that reading an uninitialized automatic object is *not* undefined
>>> behavior.  I don't know why they did that (though, all else
>>> being equal, reducing the number of instances of undefined
>>> behavior is a good thing), but reversing that decision for this
>>> one issue is not something they decided to do.
>>
>> Your description of what was done is wrong.  It is still the case in
>> C99 that trying to access an uninitialized object is undefined
>> behavior, at least potentially, except for accesses using a type
>> that either is a character type or has no trap representations (and
>> all types other than unsigned char may have trap representations,
>> depending on the implementation).  A statement like
>>
>>     int a = a;
>>
>> may still be given a warning as potential undefined behavior, even
>> in C99.
>
> I had already mentioned that distinction earlier in the thread.

Oh, I must have missed that.  I don't remember seeing it in
the message I was replying to.

>> The mistake is thinking that UB for uninitialized access was
>> removed in C99.  It wasn't.  Narrowed, yes;  removed, no.
>
> Acknowledged.

Good deal.



#393053 — Re: int a = a

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2025-04-29 13:12 -0700
Subject: Re: int a = a
Message-ID: <86bjse917i.fsf@linuxsc.com>
In reply to: #391391
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> [how to indicate a variable not being used is okay]
>> [some quoted text rearranged]
>>
>>> Unless I'm missing something, `(void)x` also has undefined behavior
>>> if x is uninitialized,
>>
>> Right.  Using (void)&x is better.
>
> I'm not convinced -- and it's far less idiomatic.

Both phrases are idiomatic.  What you mean is one phrase is more
common than the other.  More common doesn't mean better.  Recall
Dijkstra's dictum, not to conclude that something is more convenient
just because it's more conventional.

> I don't think
> I've ever seen (void)&x in code, and if I did I'd wonder what the
> author's intent was.

The same is true for any construction seen for the first time,
and like other such cases either you would figure it out or
look/ask around to find out.  And then you'd know.

Furthermore, having gotten the benefit of this discussion, you
wouldn't have to do that, because you've seen it already.

> (void)x is a common idiom for hinting to the compiler that it
> doesn't need to complain about x being unused.  (void)&x doesn't
> tell the compiler that the *value* of x is used.  I'm not sure how
> much difference that makes.

Both have the effect of getting rid of the warning even if placed
after a 'return' statement so as not to be executed.
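
A minimal sketch of the two forms side by side, for comparison.  The
function and variable names are made up for illustration, and whether a
particular compiler stays quiet about either variable is up to that
compiler.

```c
/* Hypothetical function showing both ways of marking a variable as
   intentionally unused. */
int twice(int input)
{
    int unused_a;      /* uninitialized and never assigned */
    (void)&unused_a;   /* forms the address only: no lvalue conversion,
                          so the indeterminate value is never read */

    int unused_b = 0;  /* has a value, so reading it is harmless */
    (void)unused_b;    /* lvalue conversion of an initialized object */

    return input * 2;
}
```

The first form never touches the value, which is why it stays safe even
when the variable has not been given one.
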

> Even with (void)x and/or (void)&x, a compiler *could* still warn
> about x being unused, or about the programmer's use of an ugly font.

Yes, it could.  At such time that it happens I expect I would
react and adapt accordingly, the same as with all questionable
compiler behaviors.

>>> though it's very likely to do nothing in practice.
>>
>> Unless x is volatile qualified, in which case there must be an access
>> to x in the generated code.
>>
>>> The behavior [of int a = a;] is undefined.  In C11 and later
>>> (N1570 6.3.2.1p2):
>>>
>>>     Except when [...] an lvalue that does not have array type is
>>>     converted to the value stored in the designated object (and is
>>>     no longer an lvalue); this is called lvalue conversion.
>>>     [...]
>>>     If the lvalue designates an object of automatic storage
>>>     duration that could have been declared with the register
>>>     storage class (never had its address taken), and that object
>>>     is uninitialized (not declared with an initializer and no
>>>     assignment to it has been performed prior to use), the
>>>     behavior is undefined.
>>>
>>> Long digression follows.
>>>
>>> The "could have been declared with the register storage class"
>>> seems quite odd.  And in fact it is quite odd.
>>
>> I don't have the same reaction.  The point of this phrase is that
>> undefined behavior occurs only for variables that don't have
>> their address taken.  The phrase used describes that nicely.
>> Any questions related to "registerness" can be ignored, because
>> 'register' in C really has nothing to do with hardware registers,
>> despite the name.
>
> DR 338 is explicitly motivated by an IA-64 feature that applies only to
> CPU registers.  An object whose address is taken can't be stored (only)
> in a register, so it can't have a NaT representation.
>
> The phrase used is "could have been declared with register storage class
> (never had its address taken)".  Surely "never had its address taken"
> would have been clear enough if CPU registers weren't a big part of the
> motivation.

I'm surprised you would say this.  The phrase "never had its address
taken" doesn't satisfy the careful language threshold observed in
the ISO C standard.  Do you really not understand this?

>>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>>>
>>> So the "could have been declared with the register storage class"
>>> wording was added in C11 specifically to cater to the IA64.  This
>>> change would have been superfluous in C90, where the behavior was
>>> undefined anyway, but is a semantically significant change between
>>> C99 and C11.  (If some future CPU has something like NaT that can
>>> be stored in memory, the wording might need to be updated yet again.)
>>>
>>> My takeaway is that if it requires this much research to determine
>>> whether accessing the value of an uninitialized object has undefined
>>> behavior (in which circumstances and which edition of the standard),
>>> I'll just avoid doing so altogether.  I'll initialize objects
>>> when they're defined whenever practical.  If it's not practical
>>> for some reason, I won't initialize it with some dummy value;  I'll
>>> leave it uninitialized so the compiler has a chance to warn me if
>>> I accidentally use it before assigning a value to it.
>>
>> I think you are overthinking the question.  In cases where it's
>> important to give an initial value to a variable, and can be done
>> so at the point of its declaration, use an initializer;  otherwise
>> don't.
>
> My overthinking led me to essentially the same conclusion, so I don't
> see the problem.  And I also found it to be an interesting exploration
> of how certain aspects of the C standard have evolved over time.

Doing more thinking than is needed is a waste of effort.  I can only
hope that you have better things to do with your time.  Furthermore
spending any time dwelling on the Itanium being the motivation for
the change is just a distraction.  It was interesting to learn, but
having learned it there is no need to consider it further.

>>         We don't have to read several different C standards, or
>> even only one, to reach that conclusion.
>
> No, but we do have to read one or more C standards to counter an
> argument that `int a = a;` is well defined.

Only if one feels it necessary to convince someone who holds such
an uneducated view.  I don't mind pointing someone in the right
direction, but it's not my job to convince them.

>>                                           If someone wants to know
>> exactly which border cases are safe and which cases are not, then
>> reading the relevant version(s) of the C standard is needed, but
>> in most situations it isn't.  It's important for the C standard to
>> be precise about what it prescribes, but as far as initialization
>> goes it's easy to write code that doesn't need that level of
>> detail.  Compiler writers need to know such things;  in the
>> particular case of when and where to initialize, most developers
>> don't.
>
> Most developers don't read this newsgroup.

Probably true, but there are plenty of places where one can find out
these things besides comp.lang.c.



#393054 — Re: int a = a

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2025-04-29 13:34 -0700
Subject: Re: int a = a
Message-ID: <87plguzozg.fsf@nosuchdomain.example.com>
In reply to: #393053
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>> [how to indicate a variable not being used is okay]
>>> [some quoted text rearranged]
>>>
>>>> Unless I'm missing something, `(void)x` also has undefined behavior
>>>> if x is uninitialized,
>>>
>>> Right.  Using (void)&x is better.
>>
>> I'm not convinced -- and it's far less idiomatic.
>
> Both phrases are idiomatic.  What you mean is one phrase is more
> common than the other.  More common doesn't mean better.  Recall
> Dijkstra's dictum, not to conclude that something is more convenient
> just because it's more conventional.

[...]

Just so you're aware, I've read your post and I have nothing more
to say about it.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#391405 — Re: int a = a

From: David Brown <david.brown@hesbynett.no>
Date: 2025-03-20 15:42 +0100
Subject: Re: int a = a
Message-ID: <vrh9fu$3e7sn$1@dont-email.me>
In reply to: #391374
On 19/03/2025 21:34, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
> [...]
>> As far as I understand it (and I hope to be corrected if I am wrong),
> 
> Your hope is about to be fulfilled.
> 
>> "int a = a;" is not undefined behaviour as long as the implementation
>> does not have trap values for "int".  It simply leaves "a" as an
>> unspecified value - just like "int a;" does.  Thus it is not in any
>> way "worse" than "int a;" as far as C semantics are concerned.  Any
>> difference is a matter of implementation - and the usual
>> implementation effect is to disable "not initialised" warnings.
> 
> The behavior is undefined.  In C11 and later (N1570 6.3.2.1p2):
> 
>      Except when [...] an lvalue that does not have array type is
>      converted to the value stored in the designated object (and is no
>      longer an lvalue); this is called lvalue conversion.
>      [...]
>      If the lvalue designates an object of automatic storage duration that
>      could have been declared with the register storage class (never had
>      its address taken), and that object is uninitialized (not declared
>      with an initializer and no assignment to it has been performed prior
>      to use), the behavior is undefined.
> 

OK.  I had missed that for some reason.  Elsewhere (6.7.9p10, under 
"initialization") the standard says the value is "indeterminate", which 
is defined as an "unspecified or trap" value.
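
A small sketch of how the quoted paragraph applies in practice.  The
function name is invented for the example; the point is that the object
is read only on a path where an assignment has been performed first.

```c
/* Hypothetical example for N1570 6.3.2.1p2. */
int value_or_default(int have_value)
{
    int a;               /* automatic, address never taken, no initializer */

    if (have_value)
        a = 7;           /* an assignment is performed on this path only */

    /* Writing plain "return a;" here would read a while it is still
       uninitialized whenever have_value is 0, which the quoted text
       makes undefined.  The conditional below reads a only after the
       assignment above has happened. */
    return have_value ? a : -1;
}
```
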

>> It is in much the same category as "(void) x;", which is an idiom for
>> skipping an "unused variable" or "unused parameter" warning.
> 
> Unless I'm missing something, `(void)x` also has undefined behavior
> if x is uninitialized, though it's very likely to do nothing in
> practice.

The situation where "(void) x;" is most useful is, I would say, unused 
parameters.  So there is no undefined behaviour there.  And for other 
variables it is most likely in situations where you have assigned to the 
variable but then don't use it (perhaps you plan to use it later). 
Maybe you have "status = do_something();", and then don't actually make 
use of "status" - casting it to void tells both the compiler and the 
reader that you know "do_something()" is returning a status indicator, 
but that you are then ignoring it.  If you are simply declaring a 
variable without initialising it and you don't want to use it and don't 
want to be warned about it, it's probably just as easy (and definitely 
avoids UB) to remove the declaration.
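
As a sketch of those two uses (the callback signature and the body of
do_something() are made up for illustration; only the name
"do_something" comes from the paragraph above):

```c
/* Hypothetical stand-in for a function that reports a status code. */
static int do_something(void)
{
    return 0;
}

/* Hypothetical callback whose signature is fixed by some API, so the
   context parameter has to be accepted even though it is not needed. */
int on_event(int event, void *context)
{
    (void)context;   /* a parameter always holds a value: no UB here */

    int status = do_something();
    (void)status;    /* assigned above, then deliberately ignored */

    return event + 1;
}
```

In both cases the object cast to void already has a value, so the
uninitialised-read question never arises.
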

> 
> Long digression follows.
> 
> The "could have been declared with the register storage class" seems
> quite odd.  And in fact it is quite odd.
> 
> It's tempting to assume that `int n = n;` did not have undefined
> behavior prior to C11, or that accessing an automatic object whose
> address has not been taken does not have undefined behavior even
> in C11 or later, but it's not that simple.
> 
> In C90, the non-normative Annex G (renamed to Annex J in later
> editions) says:
> 
>      The behavior in the following circumstances is undefined:
>      [...]
>      - The value of an uninitialized object that has automatic storage
>        duration is used before a value is assigned (6.5.7).
> 
> 6.5.7 discusses initialization, and says that "If an object that
> has automatic storage duration is not initialized explicitly, its
> value is indeterminate", and C90's definition of "undefined behavior"
> explicitly refers to use of indeterminately valued objects, though
> it's not 100% clear that using an indeterminate value *always*
> has undefined behavior.
> 
> So in C90, `int n = n;` explicitly had undefined behavior, even if
> all possible bit representations for an object of type int correspond
> to valid values (C90 didn't mention "trap representations").
> 
> C99 added a definition for "indeterminate value": "either an
> unspecified value or a trap representation", and drops the mention
> of indeterminate values in the definition of "undefined behavior".
> It dropped the reference to uninitialized objects in Annex G/J.
> I believe that in C99, `int n = n;` is well defined *if* int
> has no trap representations, or if the representation stored in
> the memory occupied by n happens not to be a trap representation.
> If int has trap representations, and that memory happens to contain
> such a representation, the behavior is undefined.
> 
> I found a discussion in comp.std.c from 2023, subject "Does reading
> an uninitialized object have undefined behavior?".
> 
> The discontinued IA-64/Itanium processor had something called
> "NaT", "Not a Thing".  NaT representations exist only in CPU
> registers, not in memory.  (Imagine an extra bit for each register
> indicating whether the register contains a "thing".)  A NaT allows
> for representations that act like C trap representations (called
> non-value representations in C23) even for types with no trap
> representations (for example where all 2**N possible representations
> correspond to valid values) -- but again, only in CPU registers.
> 
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
> 
> So the "could have been declared with the register storage class"
> wording was added in C11 specifically to cater to the IA64.  This
> change would have been superfluous in C90, where the behavior was
> undefined anyway, but is a semantically significant change between
> C99 and C11.  (If some future CPU has something like NaT that can
> be stored in memory, the wording might need to be updated yet again.)
> 
> My takeaway is that if it requires this much research to determine
> whether accessing the value of an uninitialized object has undefined
> behavior (in which circumstances and which edition of the standard),
> I'll just avoid doing so altogether.  I'll initialize objects
> when they're defined whenever practical.  If it's not practical
> for some reason, I won't initialize it with some dummy value; I'll
> leave it uninitialized so the compiler has a chance to warn me if
> I accidentally use it before assigning a value to it.
> 

Thanks for that explanation.

My opinions here match your "takeaway" entirely.  Just because I have 
seen "int a = a;", and know how gcc (and perhaps other compilers) handle 
it, does not mean I think it is a good thing to write!
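
As a sketch of the "leave it uninitialised rather than use a dummy
value" approach (the function and its mapping are made up for
illustration): every branch assigns, so if a branch is later added that
forgets to, the compiler has a chance to notice.

```c
/* Hypothetical mapping from a one-letter code to a priority. */
int priority_of(char code)
{
    int priority;    /* no dummy initializer on purpose */

    switch (code) {
    case 'h': priority = 3; break;
    case 'm': priority = 2; break;
    default:  priority = 1; break;
    }

    return priority; /* every path above has assigned a value */
}
```
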




#391314 — Re: int a = a (Was: Bart's Language)

From: scott@slp53.sl.home (Scott Lurndal)
Date: 2025-03-18 19:37 +0000
Subject: Re: int a = a (Was: Bart's Language)
Message-ID: <5YjCP.354514$8sk5.142044@fx02.iad>
In reply to: #391310
gazelle@shell.xmission.com (Kenny McCormack) writes:
>In article <vrc75b$2r4lt$1@dont-email.me>,
>David Brown  <david.brown@hesbynett.no> wrote:
>...
>>> gcc won't warn until you say '-Wextra', and then only for:
>>>
>>>    int a = a + 1;
>>
>>People would not normally write "int a = a;".  It is used as a common 
>>idiom meaning "I know it is not clear to the compiler that the variable 
>>is always initialised before use, but /I/ know it is - so disable the 
>>use-without-initialisation warnings for this variable".  So it makes 
>>perfect sense for the compiler not to warn about it!
>
>Wouldn't it just be easier and clearer to write: int a = 0;
>and be done with it?

Would cost an additional instruction at least...

I've never seen the construct 'int a = a;' ever used, myself.

I'll pay the extra instruction for a deterministic value.



#391315 — Re: int a = a (Was: Bart's Language)

From: David Brown <david.brown@hesbynett.no>
Date: 2025-03-18 20:51 +0100
Subject: Re: int a = a (Was: Bart's Language)
Message-ID: <vrcisg$35ffo$1@dont-email.me>
In reply to: #391310

On 18/03/2025 19:04, Kenny McCormack wrote:
> In article <vrc75b$2r4lt$1@dont-email.me>,
> David Brown  <david.brown@hesbynett.no> wrote:
> ...
>>> gcc won't warn until you say '-Wextra', and then only for:
>>>
>>>     int a = a + 1;
>>
>> People would not normally write "int a = a;".  It is used as a common
>> idiom meaning "I know it is not clear to the compiler that the variable
>> is always initialised before use, but /I/ know it is - so disable the
>> use-without-initialisation warnings for this variable".  So it makes
>> perfect sense for the compiler not to warn about it!
> 
> Wouldn't it just be easier and clearer to write: int a = 0;
> and be done with it?

Write that if that's what you want.

I don't think I have ever actually written "int a = a;" in my own code - 
but I know the idiom.  In almost all cases, I don't declare a variable 
until I have something to put in it, so I have a real initialiser.  And 
if I don't, I prefer to leave it uninitialised - then the compiler can 
tell me if I haven't initialised it when I use it.  "int a = a;" would 
only be useful in fairly niche cases.
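
A minimal sketch of the habit described above, not code from this thread: the function name `clamp_percent` is invented for illustration. The variable has no sensible value at the point of declaration, so it gets no dummy initialiser, and every path assigns it before the read.

```c
/* Hypothetical helper, invented for this sketch. */
int clamp_percent(int raw)
{
    int result;             /* deliberately no dummy initialiser */

    if (raw < 0) {
        result = 0;
    } else if (raw > 100) {
        result = 100;
    } else {
        result = raw;
    }

    /* Every branch above assigns result, so this read is well defined. */
    return result;
}
```

If one of the branches were dropped by mistake, a compiler with uninitialised-use warnings enabled would have a chance to flag the read of `result`. Writing `int result = 0;` would give up that chance, because the variable would then always look initialised.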

> 
>> "int a = a + 1;", on the other hand, clearly attempts to read the value
>> of "a" before it is initialised, and a warning is issued if
>> "-Wuninitialized" is enabled.  This warning is part of "-Wall".
> 
> How is: int a = a + 1;
> conceptually different from: int a = a;
> 
> Both are expressions involving 'a'.
> Isn't 'a' being used un-initialised in both cases?
> 

The case "int a = a;" is recognised as an idiom by a number of compilers 
(such as gcc).  A brief check suggests that gcc will generate code as it 
would for "int a = 0;", but it is certainly possible for a compiler to 
avoid any kind of initialisation here and let the register or stack slot 
used for "a" stay as it was.  That would be a pretty minor efficiency 
improvement, but optimised code is mostly the sum of lots of tiny 
improvements.
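
As a hedged illustration of the kind of niche case where the idiom comes up (the function `find_first_negative` is made up for this sketch): the programmer can see that `value` is only read when `found` is set, but a compiler may not be able to prove it. Whether any warning actually appears depends on the compiler, its version and the optimisation level.

```c
#include <stddef.h>

/* Hypothetical function, invented for this sketch.
 * Returns 1 and stores the first negative element in *out,
 * or returns 0 and leaves *out untouched. */
int find_first_negative(const int *v, size_t n, int *out)
{
    int found = 0;
    int value;              /* assigned only on the "found" path */

    /* The idiom under discussion would write the declaration as
     *     int value = value;
     * It is left as a comment here because, as noted elsewhere in
     * this thread, reading an uninitialised object may have
     * undefined behaviour. */

    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0) {
            value = v[i];
            found = 1;
            break;
        }
    }

    if (found) {
        *out = value;       /* only reached after value was assigned */
    }
    return found;
}
```

GCC's documentation describes a separate option, `-Winit-self`, which warns about variables initialised with themselves and only takes effect together with `-Wuninitialized`.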

> (You have to know the value of 'a' in order to evaluate the expression: a)
> 




#391318 — Re: int a = a (Was: Bart's Language)

From: Janis Papanagnou <janis_papanagnou+ng@hotmail.com>
Date: 2025-03-18 23:27 +0100
Subject: Re: int a = a (Was: Bart's Language)
Message-ID: <vrcs09$3ejvg$1@dont-email.me>
In reply to: #391315

On 18.03.2025 20:51, David Brown wrote:
> [...]  A brief check suggests that gcc will generate code as it
> would for "int a = 0;", but it is certainly possible for a compiler to
> avoid any kind of initialisation here and let the register or stack slot
> used for "a" stay as it was.  That would be a pretty minor efficiency
> improvement,

> but optimised code is mostly the sum of lots of tiny improvements.

Interesting view. I've learned that such Peephole Optimizations were
not what contribute to optimizations most. It's rather transformations
of structure of various forms that is what "mostly" matters. - That's
what I've learned many decades ago, of course. - So I'm curious where
you've got that view from. (Some reference, maybe? Or was that just a
personal opinion?)

Janis, wondering



#391342 — Re: int a = a (Was: Bart's Language)

From: David Brown <david.brown@hesbynett.no>
Date: 2025-03-19 11:40 +0100
Subject: Re: int a = a (Was: Bart's Language)
Message-ID: <vre6v0$lb74$1@dont-email.me>
In reply to: #391318

On 18/03/2025 23:27, Janis Papanagnou wrote:
> On 18.03.2025 20:51, David Brown wrote:
>> [...]  A brief check suggests that gcc will generate code as it
>> would for "int a = 0;", but it is certainly possible for a compiler to
>> avoid any kind of initialisation here and let the register or stack slot
>> used for "a" stay as it was.  That would be a pretty minor efficiency
>> improvement,
> 
>> but optimised code is mostly the sum of lots of tiny improvements.
> 
> Interesting view. I've learned that such Peephole Optimizations were
> not what contribute to optimizations most. It's rather transformations
> of structure of various forms that is what "mostly" matters. - That's
> what I've learned many decades ago, of course. - So I'm curious where
> you've got that view from. (Some reference, maybe? Or was that just a
> personal opinion?)
> 

Some optimisations have big effects, certainly - good register 
allocation and lifetime analysis, and optimisations that move code 
around (loop transformations, inlining, etc.) are the big factors. 
However, in modern compilers there are lots of minor optimisations that 
only apply in a few cases and only help a few percent in those cases. 
Each does little on its own, but in sum the results can be significant.

But you are right that it is not really fair to say that optimisation is 
"mostly" the sum of tiny improvements - it's a small number of big and 
important transforms, and /then/ the sum of a large number of small ones.

One complicating factor about these small optimisations is that the 
observable effect on code speed is highly dependent on the rest of the 
code and the type of processor involved.  A peephole optimisation that 
removes an extra register-to-register move will save a cycle on a 
microcontroller, but on an x86 system such a move might be merged in the 
register renaming hardware of the cpu's prefetch queues and thus 
disappear entirely.  Reducing instruction cycles matters a lot on 
microcontrollers, while on a big processor they might not make any 
difference if the cpu is waiting for memory.
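
For readers unfamiliar with the term, here is about the simplest possible instance of a peephole pass, sketched under stated assumptions: the toy instruction set (`OP_MOV`, `OP_ADD`), the `struct insn` layout and the function `drop_self_moves` are all invented here and do not correspond to any real compiler. It removes only moves from a register to itself, which is far narrower than what a production compiler does.

```c
#include <stddef.h>

/* Toy instruction set, invented for this sketch. */
enum op { OP_MOV, OP_ADD };

struct insn {
    enum op op;
    int dst;    /* destination register number */
    int src;    /* source register number */
};

/* Remove "mov rX, rX" instructions in place, preserving the order
 * of everything else.  Returns the number of instructions kept. */
size_t drop_self_moves(struct insn *code, size_t n)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++) {
        if (code[i].op == OP_MOV && code[i].dst == code[i].src) {
            continue;       /* a move onto itself changes nothing */
        }
        code[kept++] = code[i];
    }
    return kept;
}
```

Each removed instruction is one fewer to fetch and execute. Whether that is measurable depends on the target, which is the point made above.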



#391334 — Re: int a = a (Was: Bart's Language)

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2025-03-18 23:52 -0700
Subject: Re: int a = a (Was: Bart's Language)
Message-ID: <86zfhhpl5n.fsf@linuxsc.com>
In reply to: #391310

gazelle@shell.xmission.com (Kenny McCormack) writes:

> In article <vrc75b$2r4lt$1@dont-email.me>,
> David Brown  <david.brown@hesbynett.no> wrote:
> ...
>
>>> gcc won't warn until you say '-Wextra', and then only for:
>>>
>>>    int a = a + 1;
>>
>> People would not normally write "int a = a;".  It is used as a
>> common idiom meaning "I know it is not clear to the compiler that
>> the variable is always initialised before use, but /I/ know it is -
>> so disable the use-without-initialisation warnings for this
>> variable".  So it makes perfect sense for the compiler not to warn
>> about it!

An addle-brained view.  Anyone who thinks that should be forcibly
removed from any activity involving software development.

> Wouldn't it just be easier and clearer to write:  int a = 0;
> and be done with it?

There are two problems:  one, the semantics are different;  and two,
the impression given of the author's intent is different.  It's kind
of like saying "isn't it just easier and clearer to write 'red'
rather than 'yellow'?"  Writing 'int a = 0;' might be better or it
might be worse, depending on one's point of view, but it shouldn't
be considered either more clear or less clear, because it isn't
saying the same thing.



#391338 — Re: int a = a

From: Keith Thompson <Keith.S.Thompson+u@gmail.com>
Date: 2025-03-19 01:55 -0700
Subject: Re: int a = a
Message-ID: <871putv1qr.fsf@nosuchdomain.example.com>
In reply to: #391334

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
> An addle-brained view.  Anyone who thinks that should be forcibly
> removed from any activity involving software development.

Be less rude.

[...]

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */



#392927 — Re: int a = a

From: Tim Rentsch <tr.17687@z991.linuxsc.com>
Date: 2025-04-27 13:41 -0700
Subject: Re: int a = a
Message-ID: <86jz758hh4.fsf@linuxsc.com>
In reply to: #391338

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> [...]
>
>> An addle-brained view.  Anyone who thinks that should be forcibly
>> removed from any activity involving software development.
>
> Be less rude.

My comment was a statement about content.  If I had said
"David Brown is an addle-brained fool" that would be a
statement about a person.  What I did say was not.  If you
want to disagree with my statements about content, you
are welcome to express an opposing view.  As long as a
statement is about content, rather than about a person,
there is nothing wrong with expressing it in any degree
of strong language.

For the record, there are plenty of behaviors that you
engage in that I find offensive, insulting, or rude,
including statements made about people.  People who
live in glass houses shouldn't throw stones.

I note with amusement that you over-snipped the context
to which I was replying.


