From: Keith Thompson
Newsgroups: comp.std.c
Subject: Re: Does reading an uninitialized object have undefined behavior?
Date: Fri, 21 Jul 2023 14:26:20 -0700
Organization: None to speak of
Message-ID: <87a5vpnegz.fsf@nosuchdomain.example.com>
References: <87zg3pq1ym.fsf@nosuchdomain.example.com> <87zg3pnuse.fsf@bsb.me.uk> <874jlxozzz.fsf@nosuchdomain.example.com> <87fs5hnipv.fsf@bsb.me.uk>

Ben Bacarisse writes:
> Keith Thompson writes:
[...]
> There are three relevant clauses in Annex J, and I think we should keep
> them all in mind.  Sadly, they are not numbered (until C23) so I've
> given them 'UB' numbers taken from the similar wording in C23.
>
> — The value of an object with automatic storage duration is used while
>   it is indeterminate (6.2.4, 6.7.9, 6.8). [UB-11]
>
> — A trap representation is read by an lvalue expression that does not
>   have character type (6.2.6.1). [UB-12]
>
> — An lvalue designating an object of automatic storage duration that
>   could have been declared with the register storage class is used in
>   a context that requires the value of the designated object, but the
>   object is uninitialized. (6.3.2.1). [UB-20]
[...]
>> An object with automatic storage duration might be stored in an IA64
>> CPU register.  If the object is not initialized, the register's
>> NaT bit would be set.  Any attempt to read it would cause a trap.
>> Writing it would clear the NaT bit.
>>
>> Which means that a hypothetical CPU with something like a NaT bit
>> on each word of memory (iAPX 432? i960?) might cause a trap in
>> circumstances not covered by that wording -- but it *is* covered
>> by the wording in Annex J.
>
> It's covered by UB-12 and that's backed up by normative text,
> specifically paragraph 5 of the section cited in UB-12.

I don't think so.  A "non-value representation" (formerly a "trap
representation") is determined by the bits making up the representation
of an object.  For an integer type, such a representation can occur
only if the type has padding bits.

The IA64 NaT bit is not part of the representation; it's neither
a value bit nor a padding bit.  For a 64-bit integer type, given
CHAR_BIT==8, its *representation* is defined as a set of 8 bytes
that can be copied into an object of type `unsigned char[8]`.
The NaT bit does not contribute to the size of the object.

I think the right way for C to permit NaT-like bits is, as Kaz
suggested, to define "indeterminate value" in terms of provenance,
not just the bits that make up its current representation.
An automatic object with no initialization, or a malloc()ed object,
starts with an indeterminate value, and accessing that value (other
than as an array of characters) has undefined behavior.  (This is
a proposal, not what the standard currently says.)

IA64 happens to have a way of (partially) representing that
provenance in hardware, outside the object in question.  Other or
future architectures might do a more complete job.

[...]

-- 
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */
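
A minimal C sketch of the representation point made above, assuming
CHAR_BIT == 8 and an implementation that provides uint64_t.  The
function names repr_bits and round_trips are made up for illustration.
It only shows that the object representation of a 64-bit integer is
exactly 8 bytes that can be copied into an unsigned char array and
back; a NaT-like bit has nowhere to live inside those bytes.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: 8-bit bytes, as in the text above. */
static_assert(CHAR_BIT == 8, "sketch assumes CHAR_BIT == 8");

/* Number of bits in the object representation of uint64_t.
   uint64_t has no padding bits, so every one of them is a value bit. */
size_t repr_bits(void) {
    return sizeof(uint64_t) * CHAR_BIT;
}

/* Copy the object representation into an unsigned char array and
   back again; returns 1 if the value survives the round trip. */
int round_trips(uint64_t value) {
    unsigned char bytes[sizeof value];
    uint64_t copy;

    memcpy(bytes, &value, sizeof value);  /* the whole representation */
    memcpy(&copy, bytes, sizeof copy);    /* rebuild from bytes alone */
    return copy == value;
}
```

Under those assumptions, round_trips should return 1 for every
initialized value, since the bytes carry the entire representation.
Any hardware flag kept outside the object, such as the IA64 NaT bit,
is by construction invisible to this kind of byte-wise copy.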