Groups > comp.lang.c > #391294 > unrolled thread
| Started by | bart <bc@freeuk.com> |
|---|---|
| First post | 2025-03-17 23:51 +0000 |
| Last post | 2025-03-21 00:33 +0000 |
| Articles | 20 on this page of 62 — 13 participants |
Bart's Language bart <bc@freeuk.com> - 2025-03-17 23:51 +0000
Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-18 12:17 +0000
Re: Bart's Language bart <bc@freeuk.com> - 2025-03-18 13:54 +0000
Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-18 15:10 +0000
Re: Bart's Language bart <bc@freeuk.com> - 2025-03-18 15:45 +0000
Re: Bart's Language David Brown <david.brown@hesbynett.no> - 2025-03-18 17:31 +0100
int a = a (Was: Bart's Language) gazelle@shell.xmission.com (Kenny McCormack) - 2025-03-18 18:04 +0000
Re: int a = a (Was: Bart's Language) Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2025-03-18 19:36 +0100
Re: int a = a (Was: Bart's Language) Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-18 19:11 +0000
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-19 15:56 +0100
Re: int a = a (Was: Bart's Language) scott@slp53.sl.home (Scott Lurndal) - 2025-03-19 16:38 +0000
Re: int a = a (Was: Bart's Language) "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2025-03-19 14:29 -0700
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-20 09:39 +0100
Re: int a = a (Was: Bart's Language) bart <bc@freeuk.com> - 2025-03-20 11:59 +0000
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-20 15:46 +0100
Re: int a = a (Was: Bart's Language) wij <wyniijj5@gmail.com> - 2025-03-20 23:13 +0800
Re: int a = a (Was: Bart's Language) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-20 02:02 -0700
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-20 15:57 +0100
Re: int a = a (Was: Bart's Language) Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-19 17:07 +0000
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-19 13:34 -0700
Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-20 02:54 -0700
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-20 03:20 -0700
Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-20 16:22 +0100
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-20 12:46 -0700
Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-21 10:44 +0100
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-21 12:23 -0700
Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-21 21:46 +0100
Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-22 13:59 -0700
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-22 15:37 -0700
Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-28 09:39 -0700
Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-29 13:12 -0700
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-04-29 13:34 -0700
Re: int a = a David Brown <david.brown@hesbynett.no> - 2025-03-20 15:42 +0100
Re: int a = a (Was: Bart's Language) scott@slp53.sl.home (Scott Lurndal) - 2025-03-18 19:37 +0000
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-18 20:51 +0100
Re: int a = a (Was: Bart's Language) Janis Papanagnou <janis_papanagnou+ng@hotmail.com> - 2025-03-18 23:27 +0100
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-19 11:40 +0100
Re: int a = a (Was: Bart's Language) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-18 23:52 -0700
Re: int a = a Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-19 01:55 -0700
Re: int a = a Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-27 13:41 -0700
Re: int a = a (Was: Bart's Language) David Brown <david.brown@hesbynett.no> - 2025-03-19 11:43 +0100
Re: int a = a (Was: Bart's Language) Rosario19 <Ros@invalid.invalid> - 2025-03-19 13:23 +0100
Re: int a = a (Was: Bart's Language) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-20 01:32 -0700
Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-20 22:55 +0000
Re: Bart's Language Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-03-20 16:22 -0700
Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-22 14:37 +0000
Re: Bart's Language James Kuyper <jameskuyper@alumni.caltech.edu> - 2025-03-22 11:41 -0400
Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-22 16:52 +0000
Re: Bart's Language James Kuyper <jameskuyper@alumni.caltech.edu> - 2025-03-22 20:12 -0400
By definition... (Was: Bart's Language) gazelle@shell.xmission.com (Kenny McCormack) - 2025-03-23 17:20 +0000
Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-04-27 11:53 -0700
Re: Bart's Language Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2025-04-27 14:29 -0700
Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-01-06 14:04 -0800
Re: Bart's Language Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2026-01-06 17:12 -0800
Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2026-03-06 09:04 -0800
Re: Bart's Language bart <bc@freeuk.com> - 2025-03-18 22:19 +0000
Re: Bart's Language antispam@fricas.org (Waldek Hebisch) - 2025-03-20 22:38 +0000
Re: Bart's Language Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-20 23:45 +0000
Re: Bart's Language bart <bc@freeuk.com> - 2025-03-21 00:56 +0000
Re: Bart's Language Kaz Kylheku <643-408-1753@kylheku.com> - 2025-03-21 17:47 +0000
Re: Bart's Language Tim Rentsch <tr.17687@z991.linuxsc.com> - 2025-03-22 07:12 -0700
Re: Bart's Language bart <bc@freeuk.com> - 2025-03-21 00:33 +0000
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2025-03-20 02:54 -0700 |
| Subject | Re: int a = a |
| Message-ID | <86zfhgni2a.fsf@linuxsc.com> |
| In reply to | #391374 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
[how to indicate a variable not being used is okay]
[some quoted text rearranged]
> Unless I'm missing something, `(void)x` also has undefined behavior
> if x is uninitialized,
Right. Using (void)&x is better.
> though it's very likely to do nothing in practice.
Unless x is volatile qualified, in which case there must be an access
to x in the generated code.
> The behavior [of int a = a;] is undefined. In C11 and later
> (N1570 6.3.2.1p2):
>
> Except when [...] an lvalue that does not have array type is
> converted to the value stored in the designated object (and is
> no longer an lvalue); this is called lvalue conversion.
> [...]
> If the lvalue designates an object of automatic storage
> duration that could have been declared with the register
> storage class (never had its address taken), and that object
> is uninitialized (not declared with an initializer and no
> assignment to it has been performed prior to use), the
> behavior is undefined.
> Long digression follows.
>
> The "could have been declared with the register storage class"
> seems quite odd. And in fact it is quite odd.
I don't have the same reaction. The point of this phrase is that
undefined behavior occurs only for variables that don't have
their address taken. The phrase used describes that nicely.
Any questions related to "registerness" can be ignored, because
'register' in C really has nothing to do with hardware registers,
despite the name.
> It's tempting to assume that `int n = n;` did not have undefined
> behavior prior to C11, or that accessing an automatic object whose
> address has not been taken does not have undefined behavior even
> in C11 or later, but it's not that simple.
>
> In C90, the non-normative Annex G (renamed to Annex J in later
> editions) says:
>
> The behavior in the following circumstances is undefined:
> [...]
> - The value of an uninitialized object that has automatic storage
> duration is used before a value is assigned (6.5.7).
>
> 6.5.7 discusses initialization, and says that "If an object that
> has automatic storage duration is not initialized explicitly, its
> value is indeterminate", and C90's definition of "undefined behavior"
> explicitly refers to use of indeterminately valued objects, though
> it's not 100% clear that using an indeterminate value *always*
> has undefined behavior.
>
> So in C90, `int n = n;` explicitly had undefined behavior, even if
> all possible bit representations for an object of type int correspond
> to valid values (C90 didn't mention "trap representations").
>
> C99 added a definition for "indeterminate value": "either an
> unspecified value or a trap representation", and drops the mention
> of indeterminate values in the definition of "undefined behavior".
> It dropped the reference to uninitialized objects in Annex G/J.
> I believe that in C99, `int n = n;` is well defined *if* int
> has no trap representations, or if the representation stored in
> the memory occupied by n happens not to be a trap representation.
> If int has trap representations, and that memory happens to contain
> such a representation, the behavior is undefined.
>
> I found a discussion in comp.std.c from 2023, subject "Does reading
> an uninitialized object have undefined behavior?".
>
> The discontinued IA-64/Itanium processor had something called
> "NaT", "Not a Thing". NaT representations exist only in CPU
> registers, not in memory. (Imagine an extra bit for each register
> indicating whether the register contains a "thing".) A NaT allows
> for representations that act like C trap representations (called
> non-value representations in C23) even for types with no trap
> representations (for example where all 2**N possible representations
> correspond to valid values) -- but again, only in CPU registers.
>
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>
> So the "could have been declared with the register storage class"
> wording was added in C11 specifically to cater to the IA64. This
> change would have been superfluous in C90, where the behavior was
> undefined anyway, but is a semantically significant change between
> C99 and C11. (If some future CPU has something like NaT that can
> be stored in memory, the wording might need to be updated yet again.)
>
> My takeaway is that if it requires this much research to determine
> whether accessing the value of an uninitialized object has undefined
> behavior (in which circumstances and which edition of the standard),
> I'll just avoid doing so altogether. I'll initialize objects
> when they're defined whenever practical. If it's not practical
> for some reason, I won't initialize it with some dummy value; I'll
> leave it uninitialized so the compiler has a chance to warn me if
> I accidentally use it before assigning a value to it.
I think you are overthinking the question. In cases where it's
important to give an initial value to a variable, and can be done
so at the point of its declaration, use an initializer; otherwise
don't. We don't have to read several different C standards, or
even only one, to reach that conclusion. If someone wants to know
exactly which border cases are safe and which cases are not, then
reading the relevant version(s) of the C standard is needed, but
in most situations it isn't. It's important for the C standard to
be precise about what it prescribes, but as far as initialization
goes it's easy to write code that doesn't need that level of
detail. Compiler writers need to know such things; in the
particular case of when and where to initialize, most developers
don't.
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2025-03-20 03:20 -0700 |
| Subject | Re: int a = a |
| Message-ID | <87cyect356.fsf@nosuchdomain.example.com> |
| In reply to | #391390 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> [how to indicate a variable not being used is okay]
> [some quoted text rearranged]
>
>> Unless I'm missing something, `(void)x` also has undefined behavior
>> if x is uninitialized,
>
> Right. Using (void)&x is better.
I'm not convinced -- and it's far less idiomatic. I don't think
I've ever seen (void)&x in code, and if I did I'd wonder what the
author's intent was.
(void)x is a common idiom for hinting to the compiler that it
doesn't need to complain about x being unused. (void)&x doesn't
tell the compiler that the *value* of x is used. I'm not sure how
much difference that makes.
Even with (void)x and/or (void)&x, a compiler *could* still warn
about x being unused, or about the programmer's use of an ugly font.
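To make the two idioms concrete, here is a minimal sketch. The function idiom_demo and its variable names are hypothetical and do not come from the thread. It shows (void)check applied to an initialized object and (void)&scratch applied to an uninitialized one. Whether either form silences an "unused variable" warning depends on the compiler, so treat that part as an assumption.

```c
#include <assert.h>

/* Hypothetical function, not from the thread. */
int idiom_demo(int input)
{
    int check = input * 2;  /* only consulted by the assert below */
    int scratch;            /* deliberately left uninitialized */

    (void)check;            /* reads an initialized object and discards the value */
    (void)&scratch;         /* forms the address only; the value is never read */

    assert(check == input + input);  /* compiled out when NDEBUG is defined */
    return input;
}
```

With NDEBUG defined the assert disappears, and the (void)check cast is the only remaining use of check, which is the usual reason for writing the cast.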
>> though it's very likely to do nothing in practice.
>
> Unless x is volatile qualified, in which case there must be an access
> to x in the generated code.
>
>> The behavior [of int a = a;] is undefined. In C11 and later
>> (N1570 6.3.2.1p2):
>>
>> Except when [...] an lvalue that does not have array type is
>> converted to the value stored in the designated object (and is
>> no longer an lvalue); this is called lvalue conversion.
>> [...]
>> If the lvalue designates an object of automatic storage
>> duration that could have been declared with the register
>> storage class (never had its address taken), and that object
>> is uninitialized (not declared with an initializer and no
>> assignment to it has been performed prior to use), the
>> behavior is undefined.
>
>> Long digression follows.
>>
>> The "could have been declared with the register storage class"
>> seems quite odd. And in fact it is quite odd.
>
> I don't have the same reaction. The point of this phrase is that
> undefined behavior occurs only for variables that don't have
> their address taken. The phrase used describes that nicely.
> Any questions related to "registerness" can be ignored, because
> 'register' in C really has nothing to do with hardware registers,
> despite the name.
DR 338 is explicitly motivated by an IA-64 feature that applies only to
CPU registers. An object whose address is taken can't be stored (only)
in a register, so it can't have a NaT representation.
The phrase used is "could have been declared with register storage class
(never had its address taken)". Surely "never had its address taken"
would have been clear enough if CPU registers weren't a big part of the
motivation.
[SNIP]
>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>>
>> So the "could have been declared with the register storage class"
>> wording was added in C11 specifically to cater to the IA64. This
>> change would have been superfluous in C90, where the behavior was
>> undefined anyway, but is a semantically significant change between
>> C99 and C11. (If some future CPU has something like NaT that can
>> be stored in memory, the wording might need to be updated yet again.)
>>
>> My takeaway is that if it requires this much research to determine
>> whether accessing the value of an uninitialized object has undefined
>> behavior (in which circumstances and which edition of the standard),
>> I'll just avoid doing so altogether. I'll initialize objects
>> when they're defined whenever practical. If it's not practical
>> for some reason, I won't initialize it with some dummy value; I'll
>> leave it uninitialized so the compiler has a chance to warn me if
>> I accidentally use it before assigning a value to it.
>
> I think you are overthinking the question. In cases where it's
> important to give an initial value to a variable, and can be done
> so at the point of its declaration, use an initializer; otherwise
> don't.
My overthinking led me to essentially the same conclusion, so I don't
see the problem. And I also found it to be an interesting exploration
of how certain aspects of the C standard have evolved over time.
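The shared conclusion can be sketched as follows. The function names and the level values are hypothetical. One function uses an initializer because the value is known at the declaration. The other leaves the object without a dummy initializer and assigns it on every path, so a compiler that tracks such things has a chance to diagnose a path that forgets to assign.

```c
#include <assert.h>
#include <string.h>

/* Value known at the point of declaration: use an initializer. */
int default_level(void)
{
    int level = 2;
    return level;
}

/* Value depends on later logic: no dummy initializer, but every
   path assigns 'level' before it is read. */
int parse_level(const char *name)
{
    int level;

    assert(name != NULL);
    if (strcmp(name, "low") == 0)
        level = 1;
    else if (strcmp(name, "high") == 0)
        level = 3;
    else
        level = default_level();
    return level;
}
```

Neither function ever reads an indeterminate value, so the question of which edition of the standard applies does not arise for this code.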
> We don't have to read several different C standards, or
> even only one, to reach that conclusion.
No, but we do have to read one or more C standards to counter an
argument that `int a = a;` is well defined.
> If someone wants to know
> exactly which border cases are safe and which cases are not, then
> reading the relevant version(s) of the C standard is needed, but
> in most situations it isn't. It's important for the C standard to
> be precise about what it prescribes, but as far as initialization
> goes it's easy to write code that doesn't need that level of
> detail. Compiler writers need to know such things; in the
> particular case of when and where to initialize, most developers
> don't.
Most developers don't read this newsgroup.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2025-03-20 16:22 +0100 |
| Subject | Re: int a = a |
| Message-ID | <vrhbsf$3e7sn$4@dont-email.me> |
| In reply to | #391391 |
On 20/03/2025 11:20, Keith Thompson wrote:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>
>>> The "could have been declared with the register storage class"
>>> seems quite odd. And in fact it is quite odd.
>>
>> I don't have the same reaction. The point of this phrase is that
>> undefined behavior occurs only for variables that don't have
>> their address taken. The phrase used describes that nicely.
>> Any questions related to "registerness" can be ignored, because
>> 'register' in C really has nothing to do with hardware registers,
>> despite the name.
>
> DR 338 is explicitly motivated by an IA-64 feature that applies only to
> CPU registers. An object whose address is taken can't be stored (only)
> in a register, so it can't have a NaT representation.
>
> The phrase used is "could have been declared with register storage class
> (never had its address taken)". Surely "never had its address taken"
> would have been clear enough if CPU registers weren't a big part of the
> motivation.
>
I too think the phrasing is a bit odd.
Just because a variable's address is taken, does not mean it cannot be
put in a cpu register by the compiler. If the variable is not accessed
in a way that actually requires putting it in memory, then the compiler
can put it in a cpu register (or otherwise optimise it). So simply
taking the address of a variable on IA-64 does not mean it cannot be in
a register, and thus does not necessarily mean it cannot be NaT. Taking
the address of a variable means the variable cannot be declared
"register", but it does not mean it cannot be /in/ a register.
It seems very strange to me that this is UB:
    int foo1(void) {
        int x;
        return x;
    }
while this is not:
    int foo2(void) {
        int x;
        int * p = &x;
        return x;
    }
(Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler in
its list.)
It strikes me that it would have been far simpler for the standard
simply to say that using the value of an uninitialised and unassigned
variable is undefined behaviour.
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2025-03-20 12:46 -0700 |
| Subject | Re: int a = a |
| Message-ID | <87msdfscxj.fsf@nosuchdomain.example.com> |
| In reply to | #391414 |
David Brown <david.brown@hesbynett.no> writes:
> On 20/03/2025 11:20, Keith Thompson wrote:
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>> The "could have been declared with the register storage class"
>>>> seems quite odd. And in fact it is quite odd.
>>>
>>> I don't have the same reaction. The point of this phrase is that
>>> undefined behavior occurs only for variables that don't have
>>> their address taken. The phrase used describes that nicely.
>>> Any questions related to "registerness" can be ignored, because
>>> 'register' in C really has nothing to do with hardware registers,
>>> despite the name.
>> DR 338 is explicitly motivated by an IA-64 feature that applies only to
>> CPU registers. An object whose address is taken can't be stored (only)
>> in a register, so it can't have a NaT representation.
>>
>> The phrase used is "could have been declared with register storage class
>> (never had its address taken)". Surely "never had its address taken"
>> would have been clear enough if CPU registers weren't a big part of the
>> motivation.
>
> I too think the phrasing is a bit odd.
>
> Just because a variable's address is taken, does not mean it cannot be
> put in a cpu register by the compiler. If the variable is not
> accessed in a way that actually requires putting it in memory, then
> the compiler can put it in a cpu register (or otherwise optimise it).
> So simply taking the address of a variable on IA-64 does not mean it
> cannot be in a register, and thus does not necessarily mean it cannot
> be NaT. Taking the address of a variable means the variable cannot be
> declared "register", but it does not mean it cannot be /in/ a
> register.
Sure, any variable that's stored in memory can be mirrored by holding
its value in a register.
    int n = 42; // Assume n is assigned a memory address
    printf("n+1=%d n+2=%d\n", n+1, n+2);
A compiler could plausibly store the value of n in a register before
computing n+1, and then reuse the register value to compute n+2.
My understanding is that IA-64 NaT (Not a Thing) representations
exist only for registers, and the NaT bit should be cleared when
a value is stored in the register.
The odd wording in the standard allows an IA-64 C compiler to
take advantage of NaT representations for their intended purpose.
It might impose some minor constraints on what machine code can be
generated, but *most* of the cases where a NaT could be accessed
are undefined behavior in C.
> It seems very strange to me that this is UB:
>
> int foo1(void) {
> int x;
>
> return x;
> }
>
> while this is not :
>
> int foo2(void) {
> int x;
>
> int * p = &x;
>
> return x;
> }
>
> (Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler
> in its list.)
>
> It strikes me that it would have been far simpler for the standard
> simply to say that using the value of an uninitialised and unassigned
> variable is undefined behaviour.
In C90, it was. C99 changed that, making the behavior defined if the
representation is not a trap representation.
For C99, a conforming IA-64 C compiler would have had to go out of its
way to avoid accessing NaT representations. For example, if you wrote
    {
        int n;
        n;
    }
the most straightforward IA-64 code would store n in a register and
not initialize it, resulting in a trap when the register is read.
A compiler might have to generate code to store an arbitrary value
in the register to avoid the trap.
I'm undecided on whether reading the value of an uninitialized
automatic object *should* be undefined behavior, but given that
it isn't, the C11 committee made the smallest possible change to
cater to IA-64 semantics.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2025-03-21 10:44 +0100 |
| Subject | Re: int a = a |
| Message-ID | <vrjcd5$18m5n$1@dont-email.me> |
| In reply to | #391438 |
On 20/03/2025 20:46, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 20/03/2025 11:20, Keith Thompson wrote:
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>>> The "could have been declared with the register storage class"
>>>>> seems quite odd. And in fact it is quite odd.
>>>>
>>>> I don't have the same reaction. The point of this phrase is that
>>>> undefined behavior occurs only for variables that don't have
>>>> their address taken. The phrase used describes that nicely.
>>>> Any questions related to "registerness" can be ignored, because
>>>> 'register' in C really has nothing to do with hardware registers,
>>>> despite the name.
>>> DR 338 is explicitly motivated by an IA-64 feature that applies only to
>>> CPU registers. An object whose address is taken can't be stored (only)
>>> in a register, so it can't have a NaT representation.
>>>
>>> The phrase used is "could have been declared with register storage class
>>> (never had its address taken)". Surely "never had its address taken"
>>> would have been clear enough if CPU registers weren't a big part of the
>>> motivation.
>>
>> I too think the phrasing is a bit odd.
>>
>> Just because a variable's address is taken, does not mean it cannot be
>> put in a cpu register by the compiler. If the variable is not
>> accessed in a way that actually requires putting it in memory, then
>> the compiler can put it in a cpu register (or otherwise optimise it).
>> So simply taking the address of a variable on IA-64 does not mean it
>> cannot be in a register, and thus does not necessarily mean it cannot
>> be NaT. Taking the address of a variable means the variable cannot be
>> declared "register", but it does not mean it cannot be /in/ a
>> register.
>
> Sure, any variable that's stored in memory can be mirrored by holding
> its value in a register.
>
> int n = 42; // Assume n is assigned a memory address
> printf("n+1=%d n+2=%d\n", n+1, n+2);
>
> A compiler could plausibly store the value of n in a register before
> computing n+1, and then reuse the register value to compute n+2.
Yes, of course. But there is also no necessity for variables to be in
memory at all, or that there is any consistency there. "Assume n is
assigned a memory address" is a completely unwarranted assumption for
almost all local variables. It is only if the address is taken, and
used in some way that is beyond the optimiser, that the variable
actually has to go in a fixed place in memory. Otherwise optimisers can
and do keep data in registers, or move them in and out of registers and
different stack slots according to convenience for efficient code.
    uint32_t float_to_uint(float f) {
        uint32_t u;
        memcpy(&u, &f, 4);
        return u;
    }
gcc compiles that to:
    float_to_uint:
        movd eax, xmm0
        ret
So even though the addresses of the variable "u" and the parameter "f"
are taken, and converted to char pointers, and passed to a function with
external linkage, nothing is actually put in memory at all.
Thus the standard's wording as though the legality of using the
"register" storage-class specifier corresponds to cpu register usage is,
at best, wildly out of date.
(And there are some architectures where the cpu registers are directly
mapped to memory, and can be accessed as memory locations or registers.)
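For readers who want to try the float_to_uint example above, here is a self-contained sketch of it. The includes, the static_assert, and the use of sizeof in place of the literal 4 are additions, not part of the post. It assumes a C11 or later compiler and a float that uses the 32-bit IEEE 754 format. Note that u has no initializer, but its address is taken and memcpy writes every byte before it is read.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Assumption: float and uint32_t have the same size on this target. */
static_assert(sizeof(float) == sizeof(uint32_t),
              "float must be 32 bits wide for this sketch");

/* Copy the object representation of f into a uint32_t. */
uint32_t float_to_uint(float f)
{
    uint32_t u;               /* no initializer; fully written by memcpy */
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Under the IEEE 754 assumption, float_to_uint(1.0f) yields 0x3F800000. The instruction sequence an optimizer emits for this function is specific to the compiler, target, and flags.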
>
> My understanding is that IA-64 NaT (Not a Thing) representations
> exist only for registers, and the NaT bit should be cleared when
> a value is stored in the register.
>
> The odd wording in the standard allows an IA-64 C compiler to
> take advantage of NaT representations for their intended purpose.
> It might impose some minor constraints on what machine code can be
> generated, but *most* of the cases where a NaT could be accessed
> are undefined behavior in C.
>
I see that, but I believe it would be much simpler and clearer if
attempting to read an uninitialised and unassigned local variable were
undefined behaviour in every case.
Alternatively, it could have said that the value is unspecified in every
case. Then on the IA-64, the compiler would have to ensure that
registers do not have their NaT bit set even if they are not initialised
- this would not be a difficult task. Enabling use of the NaT bit for
detection of bugs could then be a compiler option if implementations
wanted to provide that feature.
>> It seems very strange to me that this is UB:
>>
>> int foo1(void) {
>> int x;
>>
>> return x;
>> }
>>
>> while this is not :
>>
>> int foo2(void) {
>> int x;
>>
>> int * p = &x;
>>
>> return x;
>> }
>>
>> (Unfortunately, godbolt.org doesn't seem to have a gcc IA-64 compiler
>> in its list.)
>>
>> It strikes me that it would have been far simpler for the standard
>> simply to say that using the value of an uninitialised and unassigned
>> variable is undefined behaviour.
>
> In C90, it was. C99 changed that, making the behavior defined if the
> representation is not a trap representation.
>
> For C99, a conforming IA-64 C compiler would have had to go out of its
> way to avoid accessing NaT representations. For example, if you wrote
>
> {
> int n;
> n;
> }
>
> the most straightforward IA-64 code would store n in a register and
> not initialize it, resulting in a trap when the register is read.
> A compiler might have to generate code to store an arbitrary value
> in the register to avoid the trap.
>
> I'm undecided on whether reading the value of an uninitialized
> automatic object *should* be undefined behavior, but given that
> it isn't, the C11 committee made the smallest possible change to
> cater to IA-64 semantics.
>
IMHO, having it as UB is the best option, with unspecified behaviour as
a second best option. The jumble that C11 has is not necessary for the
IA-64, and clearly worse than the other two choices for architectures
that don't have a NaT equivalent.
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2025-03-21 12:23 -0700 |
| Subject | Re: int a = a |
| Message-ID | <87pliaqjb9.fsf@nosuchdomain.example.com> |
| In reply to | #391471 |
David Brown <david.brown@hesbynett.no> writes:
> On 20/03/2025 20:46, Keith Thompson wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>>> On 20/03/2025 11:20, Keith Thompson wrote:
>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>>>> The "could have been declared with the register storage class"
>>>>>> seems quite odd. And in fact it is quite odd.
>>>>>
>>>>> I don't have the same reaction. The point of this phrase is that
>>>>> undefined behavior occurs only for variables that don't have
>>>>> their address taken. The phrase used describes that nicely.
>>>>> Any questions related to "registerness" can be ignored, because
>>>>> 'register' in C really has nothing to do with hardware registers,
>>>>> despite the name.
>>>> DR 338 is explicitly motivated by an IA-64 feature that applies only to
>>>> CPU registers. An object whose address is taken can't be stored (only)
>>>> in a register, so it can't have a NaT representation.
>>>>
>>>> The phrase used is "could have been declared with register storage class
>>>> (never had its address taken)". Surely "never had its address taken"
>>>> would have been clear enough if CPU registers weren't a big part of the
>>>> motivation.
>>>
>>> I too think the phrasing is a bit odd.
>>>
>>> Just because a variable's address is taken, does not mean it cannot be
>>> put in a cpu register by the compiler. If the variable is not
>>> accessed in a way that actually requires putting it in memory, then
>>> the compiler can put it in a cpu register (or otherwise optimise it).
>>> So simply taking the address of a variable on IA-64 does not mean it
>>> cannot be in a register, and thus does not necessarily mean it cannot
>>> be NaT. Taking the address of a variable means the variable cannot be
>>> declared "register", but it does not mean it cannot be /in/ a
>>> register.
>> Sure, any variable that's stored in memory can be mirrored by holding
>> its value in a register.
>>
>>     int n = 42; // Assume n is assigned a memory address
>>     printf("n+1=%d n+2=%d\n", n+1, n+2);
>>
>> A compiler could plausibly store the value of n in a register before
>> computing n+1, and then reuse the register value to compute n+2.
>
> Yes, of course. But there is also no necessity for variables to be in
> memory at all, or that there is any consistency there. "Assume n is
> assigned a memory address" is a completely unwarranted assumption for
> almost all local variables.
I think you misunderstood what I meant by "assume". Certainly n could
be assigned a memory address. You can read it as "*IF* n is assigned a
memory address, then ...". I was asserting that it has a memory address
for purposes of the discussion, not presuming that it must actually have
one.
> It is only if the address is taken, and
> used in some way that is beyond the optimiser, that the variable
> actually has to go in a fixed place in memory. Otherwise optimisers
> can and do keep data in registers, or move them in and out of
> registers and different stack slots according to convenience for
> efficient code.
>
>
>     uint32_t float_to_uint(float f) {
>         uint32_t u;
>         memcpy(&u, &f, 4);
>         return u;
>     }
>
> gcc compiles that to:
>
>     float_to_uint:
>         movd eax, xmm0
>         ret
>
> So even though the addresses of the variable "u" and the parameter "f"
> are taken, and converted to char pointers, and passed to a function
> with external linkage, nothing is actually put in memory at all.
>
> Thus the standard's wording as though the legality of using the
> "register" storage-class specifier corresponds to cpu register usage
> is, at best, wildly out of date.
>
> (And there are some architectures where the cpu registers are directly
> mapped to memory, and can be accessed as memory locations or
> registers.)
>
>> My understanding is that IA-64 NaT (Not a Thing) representations
>> exist only for registers, and the NaT bit should be cleared when
>> a value is stored in the register.
>> The odd wording in the standard allows an IA-64 C compiler to
>> take advantage of NaT representations for their intended purpose.
>> It might impose some minor constraints on what machine code can be
>> generated, but *most* of the cases where a NaT could be accessed
>> are undefined behavior in C.
>
> I see that, but I believe it would be much simpler and clearer if
> attempting to read an uninitialised and unassigned local variable were
> undefined behaviour in every case.
I probably agree (I haven't given it all that much thought), but the
committee made a specific decision between C90 and C99 to say that
reading an uninitialized automatic object is *not* undefined behavior.
I don't know why they did that (though, all else being equal, reducing
the number of instances of undefined behavior is a good thing), but
reversing that decision for this one issue is not something they decided
to do.
> Alternatively, it could have said that the value is unspecified in
> every case. Then on the IA-64, the compiler would have to ensure that
> registers do not have their NaT bit set even if they are not
> initialised - this would not be a difficult task. Enabling use of the
> NaT bit for detection of bugs could then be a compiler option if
> implementations wanted to provide that feature.
The whole point of the NaT bit is to detect accesses to uninitialized
values. Requiring the compiler to arbitrarily clear that bit
doesn't strike me as a good idea.
I dislike the way that wording was added to the standard specifically
to cater to one specific CPU (which happens to have been discontinued
later). I would have been happier with a more general solution.
I think that making accessing the value of an uninitialized automatic
object UB would have been much cleaner, and it would have allowed for
sensible use of NaT by IA-64 compilers. But without knowing *why*
the committee removed that UB between C90 and C99, I'm hesitant to
say it was a mistake.
Meanwhile, I will in effect assume that accessing uninitialized objects
is UB, i.e., I'll carefully avoid doing so.
[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
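For anyone who wants to try the float_to_uint example quoted above, here is a minimal, self-contained sketch. It assumes that float and uint32_t have the same size and that float uses the IEEE 754 single-precision format; those are assumptions about the platform, not guarantees of the C standard. It uses sizeof rather than the literal 4. Whether the compiler really keeps everything in registers, as in the gcc output quoted above, depends on the compiler, the target and the optimisation level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the object representation of a float into a uint32_t.
 * Assumption: sizeof(float) == sizeof(uint32_t) on this platform. */
uint32_t float_to_uint(float f)
{
    uint32_t u;

    assert(sizeof u == sizeof f);
    /* Taking &u and &f does not force either object into memory;
     * an optimising compiler is free to turn this into a register move. */
    memcpy(&u, &f, sizeof u);
    return u;
}
```

On a platform where float is IEEE 754 single precision, float_to_uint(1.0f) gives 0x3F800000.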
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2025-03-21 21:46 +0100 |
| Subject | Re: int a = a |
| Message-ID | <vrkj78$29u2c$1@dont-email.me> |
| In reply to | #391480 |
On 21/03/2025 20:23, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> I see that, but I believe it would be much simpler and clearer if
>> attempting to read an uninitialised and unassigned local variable were
>> undefined behaviour in every case.
>
> I probably agree (I haven't given it all that much thought), but the
> committee made a specific decision between C90 and C99 to say that
> reading an uninitialized automatic object is *not* undefined behavior.
> I don't know why they did that (though, all else being equal, reducing
> the number of instances of undefined behavior is a good thing), but
> reversing that decision for this one issue is not something they decided
> to do.

Certainly the C committee have to think harder, and consider more
possibilities than most mere C programmers are likely to do - and they
don't like to make something "undefined" if it was defined (to at least
some extent) previously.

I can agree that it is good to reduce the number of UB's, all else being
equal - but all else is very seldom equal. To me, it is preferable to
say clearly and explicitly "this is undefined behaviour" than to leave
the C programmer to combine several parts of the standard to figure out
that the construct might do something defined but unspecified, or might
do something bad (a trap), or might be defined or undefined depending on
other mostly unrelated code.

I am a fan of clear undefined behaviour when there is no good definition
of what the behaviour should be - I'd rather have UB than badly defined
behaviour. But I strongly prefer it to be explicit and clear.

>
>> Alternatively, it could have said that the value is unspecified in
>> every case. Then on the IA-64, the compiler would have to ensure that
>> registers do not have their NaT bit set even if they are not
>> initialised - this would not be a difficult task. Enabling use of the
>> NaT bit for detection of bugs could then be a compiler option if
>> implementations wanted to provide that feature.
>
> The whole point of the NaT bit is to detect accesses to uninitialized
> values. Requiring the compiler to arbitrarily clear that bit
> doesn't strike me as a good idea.
>

I think it would be fine to do that - as long as tools also provide
modes that don't clear the bit so that you can use the feature for
debugging or run-time checks. If the behaviour here was to make the
variable an unspecified value, then in fully compliant modes the
compiler would have to clear the NaT bit, so the compiler mode making
use of the NaT bit would be marginally non-compliant. But I see no
problem with that - developers can happily use slightly non-compliant
mode in order to get more features (language extensions, faster
execution, better debugging - whatever suits the user and the compiler
implementer).

Still, leaving it undefined behaviour would be even better, because then
compilers could have flags for using the NaT bit, or clearing the
variable to 0, or giving a compile-time error - whatever they did it
would still be compliant.

> I dislike the way that wording was added to the standard specifically
> to cater to one specific CPU (which happens to have been discontinued
> later). I would have been happier with a more general solution.
> I think that making accessing the value of an uninitialized automatic
> object UB would have been much cleaner, and it would have allowed for
> sensible use of NaT by IA-64 compilers. But without knowing *why*
> the committee removed that UB between C90 and C99, I'm hesitant to
> say it was a mistake.
>
> Meanwhile, I will in effect assume that accessing uninitialized objects
> is UB, i.e., I'll carefully avoid doing so.
>

That, I think, is the best way to handle this.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2025-03-22 13:59 -0700 |
| Subject | Re: int a = a |
| Message-ID | <86a59clr3a.fsf@linuxsc.com> |
| In reply to | #391480 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> David Brown <david.brown@hesbynett.no> writes:
>
>> [...]I believe it would be much simpler and clearer if attempting
>> to read an uninitialised and unassigned local variable were
>> undefined behaviour in every case.
>
> I probably agree (I haven't given it all that much thought), but
> the committee made a specific decision between C90 and C99 to say
> that reading an uninitialized automatic object is *not* undefined
> behavior. I don't know why they did that (though, all else
> being equal, reducing the number of instances of undefined
> behavior is a good thing), but reversing that decision for this
> one issue is not something they decided to do.
Your description of what was done is wrong. It is still the case in
C99 that trying to access an uninitialized object is undefined
behavior, at least potentially, except for accesses using a type
that either is a character type or has no trap representations (and
all types other than unsigned char may have trap representations,
depending on the implementation). A statement like

    int a = a;

may still be given a warning as potential undefined behavior, even
in C99.
>> Alternatively, it could have said that the value is unspecified
>> in every case. Then on the IA-64, the compiler would have to
>> ensure that registers do not have their NaT bit set even if they
>> are not initialised - this would not be a difficult task.
>> Enabling use of the NaT bit for detection of bugs could then be a
>> compiler option if implementations wanted to provide that
>> feature.
>
> The whole point of the NaT bit is to detect accesses to
> uninitialized values. Requiring the compiler to arbitrarily clear
> that bit doesn't strike me as a good idea.
>
> I dislike the way that wording was added to the standard
> specifically to cater to one specific CPU (which happens to have
> been discontinued later). I would have been happier with a more
> general solution. I think that making accessing the value of an
> uninitialized automatic object UB would have been much cleaner,
> and it would have allowed for sensible use of NaT by IA-64
> compilers.
I think you may be missing a key point. In both C99 and C11,
accessing an uninitialized object using a character type is defined
(albeit unspecified) behavior. But in C11, because of the changes
in 6.3.2.1, even character types are subject to undefined behavior
when uninitialized objects are accessed (provided of course that
they don't fall under an exception because their address was taken).
The C11 rule does more than allowing undefined behavior just for
non-character types; it extends the possibility of undefined
behavior to character types as well.
> But without knowing *why* the committee removed that UB between
> C90 and C99, I'm hesitant to say it was a mistake.
The mistake is thinking that UB for uninitialized access was
removed in C99. It wasn't. Narrowed, yes; removed, no. And
later C11 simply widened it back a bit, recovering some of the
territory that had been taken away in C99. The Itanium may
have been what prompted the change, but the change that was made
is one well worth making.
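To make the "could have been declared with the register storage class" wording concrete, here is a small sketch; the function names are made up for illustration. C does not allow unary & to be applied to an object declared register, so "could have been declared register" is another way of saying that the address is never taken. Both functions below give their variables a value before reading them, so neither depends on the uninitialised-access rules under discussion.

```c
/* total and i could be (and here are) declared register:
 * nothing ever applies & to them. */
int sum_to(int n)
{
    register int total = 0;

    for (register int i = 1; i <= n; i++)
        total += i;
    return total;
}

/* Hypothetical helper that writes through a pointer. */
static void store(int *out, int value)
{
    *out = value;
}

/* x could NOT be declared register, because &x is taken below.
 * It is assigned (through the pointer) before it is read. */
int via_pointer(int value)
{
    int x;

    store(&x, value);
    return x;
}
```

Adding register to the declaration of x in via_pointer would make the &x a constraint violation, which is the distinction the 6.3.2.1 wording relies on.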
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2025-03-22 15:37 -0700 |
| Subject | Re: int a = a |
| Message-ID | <87ldsw8zer.fsf@nosuchdomain.example.com> |
| In reply to | #391524 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> David Brown <david.brown@hesbynett.no> writes:
>>> [...]I believe it would be much simpler and clearer if attempting
>>> to read an uninitialised and unassigned local variable were
>>> undefined behaviour in every case.
>>
>> I probably agree (I haven't given it all that much thought), but
>> the committee made a specific decision between C90 and C99 to say
>> that reading an uninitialized automatic object is *not* undefined
>> behavior. I don't know why they did that (though, all else
>> being equal, reducing the number of instances of undefined
>> behavior is a good thing), but reversing that decision for this
>> one issue is not something they decided to do.
>
> Your description of what was done is wrong. It is still the case in
> C99 that trying to access an uninitialized object is undefined
> behavior, at least potentially, except for accesses using a type
> that either is a character type or has no trap representations (and
> all types other than unsigned char may have trap representations,
> depending on the implementation). A statement like
>
> int a = a;
>
> may still be given a warning as potential undefined behavior, even
> in C99.
I had already mentioned that distinction earlier in the thread.
[...]
> The mistake is thinking that UB for uninitialized access was
> removed in C99. It wasn't. Narrowed, yes; removed, no.
Acknowledged.
[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2025-04-28 09:39 -0700 |
| Subject | Re: int a = a |
| Message-ID | <86frhs8ckt.fsf@linuxsc.com> |
| In reply to | #391529 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>
>>> David Brown <david.brown@hesbynett.no> writes:
>>>
>>>> [...]I believe it would be much simpler and clearer if attempting
>>>> to read an uninitialised and unassigned local variable were
>>>> undefined behaviour in every case.
>>>
>>> I probably agree (I haven't given it all that much thought), but
>>> the committee made a specific decision between C90 and C99 to say
>>> that reading an uninitialized automatic object is *not* undefined
>>> behavior. I don't know why they did that (though, all else
>>> being equal, reducing the number of instances of undefined
>>> behavior is a good thing), but reversing that decision for this
>>> one issue is not something they decided to do.
>>
>> Your description of what was done is wrong. It is still the case in
>> C99 that trying to access an uninitialized object is undefined
>> behavior, at least potentially, except for accesses using a type
>> that either is a character type or has no trap representations (and
>> all types other than unsigned char may have trap representations,
>> depending on the implementation). A statement like
>>
>>     int a = a;
>>
>> may still be given a warning as potential undefined behavior, even
>> in C99.
>
> I had already mentioned that distinction earlier in the thread.

Oh, I must have missed that. I don't remember seeing it in the
message I was replying to.

>> The mistake is thinking that UB for uninitialized access was
>> removed in C99. It wasn't. Narrowed, yes; removed, no.
>
> Acknowledged.

Good deal.
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2025-04-29 13:12 -0700 |
| Subject | Re: int a = a |
| Message-ID | <86bjse917i.fsf@linuxsc.com> |
| In reply to | #391391 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> [how to indicate a variable not being used is okay]
>> [some quoted text rearranged]
>>
>>> Unless I'm missing something, `(void)x` also has undefined behavior
>>> if x is uninitialized,
>>
>> Right. Using (void)&x is better.
>
> I'm not convinced -- and it's far less idiomatic.

Both phrases are idiomatic. What you mean is one phrase is more
common than the other. More common doesn't mean better. Recall
Dijkstra's dictum, not to conclude that something is more convenient
just because it's more conventional.

> I don't think
> I've ever seen (void)&x in code, and if I did I'd wonder what the
> author's intent was.

The same is true for any construction seen for the first time, and
like other such cases either you would figure it out or look/ask
around to find out. And then you'd know. Furthermore, having
gotten the benefit of this discussion, you wouldn't have to do that,
because you've seen it already.

> (void)x is a common idiom for hinting to the compiler that it
> doesn't need to complain about x being unused. (void)&x doesn't
> tell the compiler that the *value* of x is used. I'm not sure how
> much difference that makes.

Both have the effect of getting rid of the warning even if placed
after a 'return' statement so as not to be executed.

> Even with (void)x and/or (void)&x, a compiler *could* still warn
> about x being unused, or about the programmer's use of an ugly font.

Yes, it could. At such time that it happens I expect I would react
and adapt accordingly, the same as with all questionable compiler
behaviors.

>>> though it's very likely to do nothing in practice.
>>
>> Unless x is volatile qualified, in which case there must be an
>> access to x in the generated code.
>>
>>> The behavior [of int a = a;] is undefined. In C11 and later
>>> (N1570 6.3.2.1p2):
>>>
>>>     Except when [...] an lvalue that does not have array type is
>>>     converted to the value stored in the designated object (and is
>>>     no longer an lvalue); this is called lvalue conversion.
>>>     [...]
>>>     If the lvalue designates an object of automatic storage
>>>     duration that could have been declared with the register
>>>     storage class (never had its address taken), and that object
>>>     is uninitialized (not declared with an initializer and no
>>>     assignment to it has been performed prior to use), the
>>>     behavior is undefined.
>>>
>>> Long digression follows.
>>>
>>> The "could have been declared with the register storage class"
>>> seems quite odd. And in fact it is quite odd.
>>
>> I don't have the same reaction. The point of this phrase is that
>> undefined behavior occurs only for variables that don't have
>> their address taken. The phrase used describes that nicely.
>> Any questions related to "registerness" can be ignored, because
>> 'register' in C really has nothing to do with hardware registers,
>> despite the name.
>
> DR 338 is explicitly motivated by an IA-64 feature that applies only to
> CPU registers. An object whose address is taken can't be stored (only)
> in a register, so it can't have a NaT representation.
>
> The phrase used is "could have been declared with register storage class
> (never had its address taken)". Surely "never had its address taken"
> would have been clear enough if CPU registers weren't a big part of the
> motivation.

I'm surprised you would say this. The phrase "never had its
address taken" doesn't satisfy the careful language threshold
observed in the ISO C standard. Do you really not understand this?

>>> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>>>
>>> So the "could have been declared with the register storage class"
>>> wording was added in C11 specifically to cater to the IA64. This
>>> change would have been superfluous in C90, where the behavior was
>>> undefined anyway, but is a semantically significant change between
>>> C99 and C11. (If some future CPU has something like NaT that can
>>> be stored in memory, the wording might need to be updated yet again.)
>>>
>>> My takeaway is that if it requires this much research to determine
>>> whether accessing the value of an uninitialized object has undefined
>>> behavior (in which circumstances and which edition of the standard),
>>> I'll just avoid doing so altogether. I'll initialize objects
>>> when they're defined whenever practical. If it's not practical
>>> for some reason, I won't initialize it with some dummy value; I'll
>>> leave it uninitialized so the compiler has a chance to warn me if
>>> I accidentally use it before assigning a value to it.
>>
>> I think you are overthinking the question. In cases where it's
>> important to give an initial value to a variable, and can be done
>> so at the point of its declaration, use an initializer; otherwise
>> don't.
>
> My overthinking led me to essentially the same conclusion, so I don't
> see the problem. And I also found it to be an interesting exploration
> of how certain aspects of the C standard have evolved over time.

Doing more thinking than is needed is a waste of effort. I can only
hope that you have better things to do with your time. Furthermore
spending any time dwelling on the Itanium being the motivation for
the change is just a distraction. It was interesting to learn, but
having learned it there is no need to consider it further.

>> We don't have to read several different C standards, or
>> even only one, to reach that conclusion.
>
> No, but we do have to read one or more C standards to counter an
> argument that `int a = a;` is well defined.

Only if one feels it necessary to convince someone who holds such
an uneducated view. I don't mind pointing someone in the right
direction, but it's not my job to convince them.

>> If someone wants to know
>> exactly which border cases are safe and which cases are not, then
>> reading the relevant version(s) of the C standard is needed, but
>> in most situations it isn't. It's important for the C standard to
>> be precise about what it prescribes, but as far as initialization
>> goes it's easy to write code that doesn't need that level of
>> detail. Compiler writers need to know such things; in the
>> particular case of when and where to initialize, most developers
>> don't.
>
> Most developers don't read this newsgroup.

Probably true, but there are plenty of places where one can find out
these things besides comp.lang.c.
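A short sketch of the two idioms discussed above for silencing "unused" warnings; the function and variable names are hypothetical. (void)context is fine for a parameter, which always has a value. For a local that may never have been assigned, (void)&scratch mentions the variable without reading it, so no lvalue conversion of an uninitialised object takes place. Whether either form actually suppresses a given diagnostic is up to the compiler.

```c
/* Callback whose signature is fixed by some (hypothetical) interface;
 * this implementation has no use for 'context'. */
int on_event(int code, void *context)
{
    (void)context;   /* a parameter always has a value, so this is harmless */
    return code * 2;
}

int compute(int input)
{
    int scratch;     /* reserved for later use; never assigned so far */

    (void)&scratch;  /* takes the address only; the value is not read */
    return input + 1;
}
```

Neither cast has any effect on the results: on_event(21, 0) returns 42 and compute(1) returns 2.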
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2025-04-29 13:34 -0700 |
| Subject | Re: int a = a |
| Message-ID | <87plguzozg.fsf@nosuchdomain.example.com> |
| In reply to | #393053 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>> [how to indicate a variable not being used is okay]
>>> [some quoted text rearranged]
>>>
>>>> Unless I'm missing something, `(void)x` also has undefined behavior
>>>> if x is uninitialized,
>>>
>>> Right. Using (void)&x is better.
>>
>> I'm not convinced -- and it's far less idiomatic.
>
> Both phrases are idiomatic. What you mean is one phrase is more
> common than the other. More common doesn't mean better. Recall
> Dijkstra's dictum, not to conclude that something is more convenient
> just because it's more conventional.
[...]
Just so you're aware, I've read your post and I have nothing more
to say about it.
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2025-03-20 15:42 +0100 |
| Subject | Re: int a = a |
| Message-ID | <vrh9fu$3e7sn$1@dont-email.me> |
| In reply to | #391374 |
On 19/03/2025 21:34, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
> [...]
>> As far as I understand it (and I hope to be corrected if I am wrong),
>
> Your hope is about to be fulfilled.
>
>> "int a = a;" is not undefined behaviour as long as the implementation
>> does not have trap values for "int". It simply leaves "a" as an
>> unspecified value - just like "int a;" does. Thus it is not in any
>> way "worse" than "int a;" as far as C semantics are concerned. Any
>> difference is a matter of implementation - and the usual
>> implementation effect is to disable "not initialised" warnings.
>
> The behavior is undefined. In C11 and later (N1570 6.3.2.1p2):
>
>     Except when [...] an lvalue that does not have array type is
>     converted to the value stored in the designated object (and is no
>     longer an lvalue); this is called lvalue conversion.
>     [...]
>     If the lvalue designates an object of automatic storage duration that
>     could have been declared with the register storage class (never had
>     its address taken), and that object is uninitialized (not declared
>     with an initializer and no assignment to it has been performed prior
>     to use), the behavior is undefined.
>

OK. I had missed that for some reason. Elsewhere (6.7.9p10, under
"initialization") the standard says the value is "indeterminate", which
is defined as an "unspecified or trap" value.

>> It is in much the same category as "(void) x;", which is an idiom for
>> skipping an "unused variable" or "unused parameter" warning.
>
> Unless I'm missing something, `(void)x` also has undefined behavior
> if x is uninitialized, though it's very likely to do nothing in
> practice.

The situation where "(void) x;" is most useful is, I would say, unused
parameters. So there is no undefined behaviour there. And for other
variables it is most likely in situations where you have assigned to the
variable but then don't use it (perhaps you plan to use it later).
Maybe you have "status = do_something();", and then don't actually make
use of "status" - casting it to void tells both the compiler and the
reader that you know "do_something()" is returning a status indicator,
but that you are then ignoring it.

If you are simply declaring a variable without initialising it and you
don't want to use it and don't want to be warned about it, it's probably
just as easy (and definitely avoids UB) to remove the declaration.

>
> Long digression follows.
>
> The "could have been declared with the register storage class" seems
> quite odd. And in fact it is quite odd.
>
> It's tempting to assume that `int n = n;` did not have undefined
> behavior prior to C11, or that accessing an automatic object whose
> address has not been taken does not have undefined behavior even
> in C11 or later, but it's not that simple.
>
> In C90, the non-normative Annex G (renamed to Annex J in later
> editions) says:
>
>     The behavior in the following circumstances is undefined:
>     [...]
>     - The value of an uninitialized object that has automatic storage
>       duration is used before a value is assigned (6.5.7).
>
> 6.5.7 discusses initialization, and says that "If an object that
> has automatic storage duration is not initialized explicitly, its
> value is indeterminate", and C90's definition of "undefined behavior"
> explicitly refers to use of indeterminately valued objects, though
> it's not 100% clear that using an indeterminate value *always*
> has undefined behavior.
>
> So in C90, `int n = n;` explicitly had undefined behavior, even if
> all possible bit representations for an object of type int correspond
> to valid values (C90 didn't mention "trap representations").
>
> C99 added a definition for "indeterminate value": "either an
> unspecified value or a trap representation", and drops the mention
> of indeterminate values in the definition of "undefined behavior".
> It dropped the reference to uninitialized objects in Annex G/J.
> I believe that in C99, `int n = n;` is well defined *if* int
> has no trap representations, or if the representation stored in
> the memory occupied by n happens not to be a trap representation.
> If int has trap representations, and that memory happens to contain
> such a representation, the behavior is undefined.
>
> I found a discussion in comp.std.c from 2023, subject "Does reading
> an uninitialized object have undefined behavior?".
>
> The discontinued IA-64/Itanium processor had something called
> "NaT", "Not a Thing". NaT representations exist only in CPU
> registers, not in memory. (Imagine an extra bit for each register
> indicating whether the register contains a "thing".) A NaT allows
> for representations that act like C trap representations (called
> non-value representations in C23) even for types with no trap
> representations (for example where all 2**N possible representations
> correspond to valid values) -- but again, only in CPU registers.
>
> https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm
>
> So the "could have been declared with the register storage class"
> wording was added in C11 specifically to cater to the IA64. This
> change would have been superfluous in C90, where the behavior was
> undefined anyway, but is a semantically significant change between
> C99 and C11. (If some future CPU has something like NaT that can
> be stored in memory, the wording might need to be updated yet again.)
>
> My takeaway is that if it requires this much research to determine
> whether accessing the value of an uninitialized object has undefined
> behavior (in which circumstances and which edition of the standard),
> I'll just avoid doing so altogether. I'll initialize objects
> when they're defined whenever practical. If it's not practical
> for some reason, I won't initialize it with some dummy value; I'll
> leave it uninitialized so the compiler has a chance to warn me if
> I accidentally use it before assigning a value to it.
>

Thanks for that explanation.

My opinions here match your "takeaway" entirely. Just because I have
seen "int a = a;", and know how gcc (and perhaps other compilers) handle
it, does not mean I think it is a good thing to write!
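The "takeaway" above, sketched in code; names such as parse_digit and sign_of are invented for the example. The first function declares its variable only once there is a real value to put in it. The second leaves the variable without a dummy initialiser because every path assigns it before the read, so a compiler that tracks this has a chance to warn if a path is later added that forgets the assignment.

```c
/* Hypothetical helper: returns 0..9 for a decimal digit, -1 otherwise. */
int parse_digit(char c)
{
    if (c < '0' || c > '9')
        return -1;

    int value = c - '0';   /* declared where a real initialiser exists */
    return value;
}

int sign_of(int n)
{
    int sign;              /* no dummy value: each branch below assigns it */

    if (n > 0)
        sign = 1;
    else if (n < 0)
        sign = -1;
    else
        sign = 0;
    return sign;
}
```

Writing "int sign = 0;" instead would also work, but it would hide a forgotten branch from any "used uninitialised" diagnostic the compiler might otherwise give.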
| From | scott@slp53.sl.home (Scott Lurndal) |
|---|---|
| Date | 2025-03-18 19:37 +0000 |
| Subject | Re: int a = a (Was: Bart's Language) |
| Message-ID | <5YjCP.354514$8sk5.142044@fx02.iad> |
| In reply to | #391310 |
gazelle@shell.xmission.com (Kenny McCormack) writes:
>In article <vrc75b$2r4lt$1@dont-email.me>,
>David Brown <david.brown@hesbynett.no> wrote:
>...
>>> gcc won't warn until you say '-Wextra', and then only for:
>>>
>>>     int a = a + 1;
>>
>>People would not normally write "int a = a;". It is used as a common
>>idiom meaning "I know it is not clear to the compiler that the variable
>>is always initialised before use, but /I/ know it is - so disable the
>>use-without-initialisation warnings for this variable". So it makes
>>perfect sense for the compiler not to warn about it!
>
>Wouldn't it just be easier and clearer to write: int a = 0;
>and be done with it?

Would cost an additional instruction at least...

I've never seen the construct 'int a = a;' ever used, myself. I'll pay
the extra instruction for a deterministic value.
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2025-03-18 20:51 +0100 |
| Subject | Re: int a = a (Was: Bart's Language) |
| Message-ID | <vrcisg$35ffo$1@dont-email.me> |
| In reply to | #391310 |
On 18/03/2025 19:04, Kenny McCormack wrote:
> In article <vrc75b$2r4lt$1@dont-email.me>,
> David Brown <david.brown@hesbynett.no> wrote:
> ...
>>> gcc won't warn until you say '-Wextra', and then only for:
>>>
>>>     int a = a + 1;
>>
>> People would not normally write "int a = a;". It is used as a common
>> idiom meaning "I know it is not clear to the compiler that the variable
>> is always initialised before use, but /I/ know it is - so disable the
>> use-without-initialisation warnings for this variable". So it makes
>> perfect sense for the compiler not to warn about it!
>
> Wouldn't it just be easier and clearer to write: int a = 0;
> and be done with it?

Write that if that's what you want.

I don't think I have ever actually written "int a = a;" in my own code -
but I know the idiom. In almost all cases, I don't declare a variable
until I have something to put in it, so I have a real initialiser. And
if I don't, I prefer to leave it uninitialised - then the compiler can
tell me if I haven't initialised it when I use it. "int a = a;" would
only be useful in fairly niche cases.

>
>> "int a = a + 1;", on the other hand, clearly attempts to read the value
>> of "a" before it is initialised, and a warning is issued if
>> "-Wuninitialized" is enabled. This warning is part of "-Wall".
>
> How is: int a = a + 1;
> conceptually different from: int a = a;
>
> Both are expressions involving 'a'.
> Isn't 'a' being used un-initialised in both cases?
>

The case "int a = a;" is recognised as an idiom by a number of compilers
(such as gcc). A brief check suggests that gcc will generate code as it
would for "int a = 0;", but it is certainly possible for a compiler to
avoid any kind of initialisation here and let the register or stack slot
used for "a" stay as it was. That would be a pretty minor efficiency
improvement, but optimised code is mostly the sum of lots of tiny
improvements.

> (You have to know the value of 'a' in order to evaluate the expression: a)
>
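The sort of code where a "may be used uninitialised" warning can come up even though every read is guarded is sketched below, with hypothetical names. Instead of reaching for "int a = a;", this version leaves the variable uninitialised and guards the read with a flag. If a particular compiler cannot follow the logic and warns, "int first = 0;" is the straightforward alternative suggested in the thread, at the possible cost of one extra instruction.

```c
#include <stddef.h>

/* Returns the first negative element of v[0..n-1], or fallback if none. */
int first_negative(const int *v, size_t n, int fallback)
{
    int first;          /* only assigned when found is set */
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0) {
            first = v[i];
            found = 1;
            break;
        }
    }
    return found ? first : fallback;   /* first is read only if assigned */
}
```

Whether a warning appears here at all depends on the compiler, its version and the optimisation level, so treat this as an illustration of the pattern rather than a reproducible diagnostic.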
| From | Janis Papanagnou <janis_papanagnou+ng@hotmail.com> |
|---|---|
| Date | 2025-03-18 23:27 +0100 |
| Subject | Re: int a = a (Was: Bart's Language) |
| Message-ID | <vrcs09$3ejvg$1@dont-email.me> |
| In reply to | #391315 |
On 18.03.2025 20:51, David Brown wrote:
> [...] A brief check suggests that gcc will generate code as it
> would for "int a = 0;", but it is certainly possible for a compiler to
> avoid any kind of initialisation here and let the register or stack slot
> used for "a" stay as it was. That would be a pretty minor efficiency
> improvement,
> but optimised code is mostly the sum of lots of tiny improvements.

Interesting view. I've learned that such Peephole Optimizations were not what contribute to optimizations most. It's rather transformations of structure of various forms that is what "mostly" matters. - That's what I've learned many decades ago, of course. - So I'm curious where you've got that view from. (Some reference, maybe? Or was that just a personal opinion?)

Janis, wondering
| From | David Brown <david.brown@hesbynett.no> |
|---|---|
| Date | 2025-03-19 11:40 +0100 |
| Subject | Re: int a = a (Was: Bart's Language) |
| Message-ID | <vre6v0$lb74$1@dont-email.me> |
| In reply to | #391318 |
On 18/03/2025 23:27, Janis Papanagnou wrote:
> On 18.03.2025 20:51, David Brown wrote:
>> [...] A brief check suggests that gcc will generate code as it
>> would for "int a = 0;", but it is certainly possible for a compiler to
>> avoid any kind of initialisation here and let the register or stack slot
>> used for "a" stay as it was. That would be a pretty minor efficiency
>> improvement,
>
>> but optimised code is mostly the sum of lots of tiny improvements.
>
> Interesting view. I've learned that such Peephole Optimizations were
> not what contribute to optimizations most. It's rather transformations
> of structure of various forms that is what "mostly" matters. - That's
> what I've learned many decades ago, of course. - So I'm curious where
> you've got that view from. (Some reference, maybe? Or was that just a
> personal opinion?)
>

Some optimisations have big effects, certainly - good register allocation and lifetime analysis, and optimisations that move code around (loop transformations, inlining, etc.) are the big factors. However, in modern compilers there are lots of minor optimisations that only apply in a few cases and only help a few percent in those cases. Each does little on its own, but in sum the results can be significant.

But you are right that it is not really fair to say that optimisation is "mostly" the sum of tiny improvements - it's a small number of big and important transforms, and /then/ the sum of a large number of small ones.

One complicating factor about these small optimisations is that the observable effect on code speed is highly dependent on the rest of the code and the type of processor involved. A peephole optimisation that removes an extra register-to-register move will save a cycle on a microcontroller, but on an x86 system such a move might be merged in the register renaming hardware of the cpu's prefetch queues and thus disappear entirely.

Reducing instruction cycles matters a lot on microcontrollers, while on a big processor they might not make any difference if the cpu is waiting for memory.
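To make the "peephole optimisation that removes an extra register-to-register move" concrete, here is a minimal sketch in C over a toy instruction list. The instruction set, the `struct insn` layout and the function `peephole_moves` are all assumptions made up for this illustration; real compilers work on much richer intermediate representations.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction set, invented for this sketch only. */
enum op { OP_MOV, OP_ADD };

struct insn {
    enum op op;
    int dst;   /* destination register number */
    int src;   /* source register number */
};

/* Remove two kinds of redundant move, compacting the array in place:
 *   mov rX, rX                (copies a register to itself)
 *   mov rA, rB ; mov rB, rA   (second move stores a value already there)
 * Returns the new instruction count. */
static size_t peephole_moves(struct insn *code, size_t n)
{
    size_t out = 0;

    for (size_t i = 0; i < n; i++) {
        struct insn cur = code[i];

        if (cur.op == OP_MOV && cur.dst == cur.src)
            continue;                       /* self-move: drop it */

        if (out > 0 && cur.op == OP_MOV && code[out - 1].op == OP_MOV &&
            code[out - 1].dst == cur.src && code[out - 1].src == cur.dst)
            continue;                       /* copies back an equal value: drop it */

        code[out++] = cur;
    }
    return out;
}

/* Small self-check: four instructions shrink to two. */
static void check_peephole(void)
{
    struct insn code[] = {
        { OP_MOV, 1, 2 },   /* r1 = r2            kept    */
        { OP_MOV, 2, 1 },   /* r2 = r1            dropped */
        { OP_MOV, 3, 3 },   /* r3 = r3            dropped */
        { OP_ADD, 1, 3 },   /* r1 = r1 + r3       kept    */
    };
    size_t n = peephole_moves(code, sizeof code / sizeof code[0]);

    assert(n == 2);
    assert(code[0].op == OP_MOV && code[0].dst == 1 && code[0].src == 2);
    assert(code[1].op == OP_ADD && code[1].dst == 1 && code[1].src == 3);
}
```

The pass only looks at one instruction and its immediate predecessor, which is what makes it a peephole. Whether the removed moves save any time depends on the target, as the post notes.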
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2025-03-18 23:52 -0700 |
| Subject | Re: int a = a (Was: Bart's Language) |
| Message-ID | <86zfhhpl5n.fsf@linuxsc.com> |
| In reply to | #391310 |
gazelle@shell.xmission.com (Kenny McCormack) writes:
> In article <vrc75b$2r4lt$1@dont-email.me>,
> David Brown <david.brown@hesbynett.no> wrote:
> ...
>
>>> gcc won't warn until you say '-Wextra', and then only for:
>>>
>>> int a = a + 1;
>>
>> People would not normally write "int a = a;". It is used as a
>> common idiom meaning "I know it is not clear to the compiler that
>> the variable is always initialised before use, but /I/ know it is -
>> so disable the use-without-initialisation warnings for this
>> variable". So it makes perfect sense for the compiler not to warn
>> about it!

An addle-brained view. Anyone who thinks that should be forcibly removed from any activity involving software development.

> Wouldn't it just be easier and clearer to write: int a = 0;
> and be done with it?

There are two problems: one, the semantics are different; and two, the impression given of the author's intent is different. It's kind of like saying "isn't it just easier and clearer to write 'red' rather than 'yellow'?" Writing 'int a = 0;' might be better or it might be worse, depending on one's point of view, but it shouldn't be considered either more clear or less clear, because it isn't saying the same thing.
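One possible reading of the semantic difference, which the post itself does not spell out, is that `int a = 0;` gives the variable a real value, so a forgotten assignment becomes a well-defined wrong answer rather than something a compiler can flag. The sketch below is an assumption-laden illustration in C; `weight_zero_init` and `check_weight` are hypothetical names, and the missing branch is deliberate.

```c
#include <assert.h>

/* Hypothetical function with a deliberate bug: mode 2 is never handled. */
static int weight_zero_init(int mode)
{
    int w = 0;          /* placeholder value, chosen only to have "something" */

    if (mode == 0)
        w = 5;
    else if (mode == 1)
        w = 7;
    /* missing: else if (mode == 2) w = 9; */

    return w;           /* mode 2 returns 0: well defined, but not intended */
}

/* Small self-check showing that the bug passes silently. */
static void check_weight(void)
{
    assert(weight_zero_init(0) == 5);
    assert(weight_zero_init(1) == 7);
    assert(weight_zero_init(2) == 0);   /* wrong answer, no diagnostic */
}
```

A reader of this code also cannot tell whether 0 is a meaningful default or just a placeholder, which may be what is meant by the impression of the author's intent being different.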
| From | Keith Thompson <Keith.S.Thompson+u@gmail.com> |
|---|---|
| Date | 2025-03-19 01:55 -0700 |
| Subject | Re: int a = a |
| Message-ID | <871putv1qr.fsf@nosuchdomain.example.com> |
| In reply to | #391334 |
Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
[...]
> An addle-brained view. Anyone who thinks that should be forcibly
> removed from any activity involving software development.
Be less rude.
[...]
--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
void Void(void) { Void(); } /* The recursive call of the void */
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Date | 2025-04-27 13:41 -0700 |
| Subject | Re: int a = a |
| Message-ID | <86jz758hh4.fsf@linuxsc.com> |
| In reply to | #391338 |
Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
> [...]
>
>> An addle-brained view. Anyone who thinks that should be forcibly
>> removed from any activity involving software development.
>
> Be less rude.

My comment was a statement about content. If I had said "David Brown is an addle-brained fool" that would be a statement about a person. What I did say was not.

If you want to disagree with my statements about content, you are welcome to express an opposing view. As long as a statement is about content, rather than about a person, there is nothing wrong with expressing it in any degree of strong language.

For the record, there are plenty of behaviors that you engage in that I find offensive, insulting, or rude, including statements made about people. People who live in glass houses shouldn't throw stones.

I note with amusement that you over-snipped the context to which I was replying.