Path: csiph.com!news.mixmin.net!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Keith Thompson
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Fri, 21 Jul 2023 11:56:00 -0700
Organization: None to speak of
Lines: 109
Message-ID: <874jlxozzz.fsf@nosuchdomain.example.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="2370f913b850030e0527dd0f7396627d"; logging-data="3568820"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX190DnGRFUDHJYokwy2xZ5u4"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:EF5yEUeiy2UdIKC6LxdfLFEHwbw= sha1:7roxsQe7taKkdqyaBvW8OXYiPds=
Xref: csiph.com comp.std.c:6509

Ben Bacarisse writes:
> Keith Thompson writes:
>> N3096 is the last public draft of the upcoming C23 standard.
>>
>> N3096 J.2 says:
>>
>>     The behavior is undefined in the following circumstances:
>>     [...]
>>     (11) The value of an object with automatic storage duration is
>>          used while the object has an indeterminate representation
>>          (6.2.4, 6.7.10, 6.8).
>>
>> I'll use an `int` object in my example.
>>
>> Reading an object that holds a non-value representation has undefined
>> behavior, but not all integer types have non-value representations
>> -- and if an implementation has certain characteristics, we can
>> reliably infer that int has no non-value representations (called
>> "trap representations" in C99, C11, and C17).
>>
>> Consider this program:
>> ```
>> #include <limits.h>
>> int main(void) {
>>     int foo;
>>     if (sizeof (int) == 4 &&
>>         CHAR_BIT == 8 &&
>>         INT_MAX == 2147483647 &&
>>         INT_MIN == -INT_MAX-1)
>>     {
>>         int bar = foo;
>>     }
>> }
>> ```
>>
>> If the condition is true (as it is for many real-world
>> implementations), then int has no padding bits and no trap
>> representations.
>> The object `foo` has an indeterminate representation
>> when it's used to initialize `bar`.  Since it cannot have a non-value
>> representation, it has an unspecified value.
>>
>> If J.2(11) is correct, then the use of the value results in undefined
>> behavior.
>>
>> But Annex J is non-normative, and as far as I can tell there is no
>> normative text in the standard that says the behavior is undefined.
>
> 6.3.2.1 p2:
>
>   "[...] If the lvalue designates an object of automatic storage
>   duration that could have been declared with the register storage class
>   (never had its address taken), and that object is uninitialized (not
>   declared with an initializer and no assignment to it has been
>   performed prior to use), the behavior is undefined."
>
> seems to cover it.  The restriction on not having its address taken
> seems odd.

Good find.

That sentence was added in C11 (it doesn't appear in C99 or in N1256,
which consists of C99 plus the three Technical Corrigenda) in response
to DR #338.  Since the wording in Annex J goes back to C99 in its
current form, and to C90 in a slightly different form, that can't be
what Annex J is referring to.  And the statement in Annex J is more
general, so we can't quite use 6.3.2.1p2 as a retroactive justification.

https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_338.htm

Yes, that restriction does seem strange.  It was inspired by the IA64
(Itanium) architecture, which has an extra trap bit in each CPU register
(NaT, "not a thing").  The "could have been declared with the register
storage class" wording is there because the IA64 NaT bit exists only in
CPU registers, not in memory.  An object with automatic storage
duration might be stored in an IA64 CPU register.  If the object is not
initialized, the register's NaT bit might be set, and any attempt to
read it would then cause a trap.  Writing it would clear the NaT bit.

Which means that a hypothetical CPU with something like a NaT bit on
each word of memory (iAPX 432?  i960?)
might cause a trap in circumstances not covered by that wording -- but
it *is* covered by the wording in Annex J.

(Normally, an object whose address is taken can still be stored in a CPU
register for part of its lifetime.  The effect is to forbid certain
optimizations on IA64-like systems.)

It's tempting to conclude that reading an uninitialized automatic object
whose address is taken is *not* undefined behavior
(https://en.wikipedia.org/wiki/Exception_that_proves_the_rule), but the
standard doesn't say so.

C90's Annex G (renamed to Annex J in later editions) says:

    The behavior in the following circumstances is undefined:
    [...]
    - The value of an uninitialized object that has automatic storage
      duration is used before a value is assigned (6.5.7).

6.5.7 discusses initialization, but doesn't say that reading an
uninitialized object has undefined behavior, so the issue is an old one.

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */
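
The inference in the quoted program (no padding bits plus
INT_MIN == -INT_MAX-1 means every bit pattern of int is a value) can be
sketched as a standalone check that reads no uninitialized object.  This
is only an illustration: the function names are hypothetical, and it
generalizes the hard-coded 32-bit condition to any width, which is my
assumption rather than something the original program does.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: count the value bits of int.
   INT_MAX always has the form 2^N - 1, so counting its bits gives N. */
int int_value_bits(void) {
    int bits = 0;
    for (unsigned int max = INT_MAX; max != 0; max >>= 1) {
        bits++;
    }
    /* The standard requires INT_MAX >= 32767, i.e. at least 15 value bits. */
    assert(bits >= 15);
    return bits;
}

/* Hypothetical helper: true if the object representation of int consists
   of exactly the value bits plus one sign bit, i.e. no padding bits. */
bool int_has_no_padding_bits(void) {
    return int_value_bits() + 1 == (int)(sizeof(int) * CHAR_BIT);
}

/* Hypothetical helper: with no padding bits there are 2^(N+1) bit
   patterns.  If INT_MIN == -INT_MAX - 1, the range INT_MIN..INT_MAX also
   holds 2^(N+1) values, so no pattern is left over to be a non-value
   (trap) representation. */
bool int_has_no_non_value_representations(void) {
    return int_has_no_padding_bits() && INT_MIN == -INT_MAX - 1;
}
```

On an implementation where the condition in the quoted program holds,
int_has_no_non_value_representations() should return true.  That is the
case in which `foo` can only hold an unspecified value, and in which the
question of whether reading it is undefined actually matters.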