From: Kaz Kylheku <480-992-1380@kylheku.com>
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Fri, 26 Nov 2021 18:06:37 -0000 (UTC)
Organization: A noiseless patient Spider
Message-ID: <21-11-006@comp.compilers>
References: <21-11-004@comp.compilers>
Keywords: C, standards

On 2021-11-25, Hans-Peter Diettrich wrote:
> Can somebody explain why the access to members of a union is "undefined"
> except for the most recently written member?

I don't think that is true; if two members of a union foo, x and y,
have the same type, then it's possible to write to foo.x and then read
foo.y.

> What can be undefined in a union of data types of the same typesize end
> alignment?

The representation. Same size and alignment are not sufficient
determiners of type. For instance, you may find that int and float are
of the same size on the compiler you're using.

If the language does not define what it means to access a float object
through an int lvalue, that allows aggressive optimizations based on
the assumption that type aliasing is absent in the program.

For instance suppose that you have

  struct s {
    float *pflo;
    int ival;
  };

and a (nonsensical example) function working with a struct s *ptr
parameter:

  int fun(struct s *ptr)
  {
    ptr->ival++;
    *ptr->pflo = 0;
    return ptr->ival;
  }

Under the assumption that objects of different types are not aliased
by the program, the compiler can emit code which:

  1. reads ptr->ival
  2. stores the incremented value back into ptr->ival
  3. stores 0.0 through *ptr->pflo
  4. returns the previously incremented value.

Now suppose that aliasing is allowed among any types, like int and
float. The compiler has no idea what ptr->pflo points to. The caller
could easily have set it like this:

  ptr->pflo = (float *) &ptr->ival;

So if that is allowed, we cannot emit the code like above. We must do
this:

  1. read ptr->ival
  2. store the incremented value back into ptr->ival
  3. store 0.0 through *ptr->pflo
  4. NEW: re-read ptr->ival in case it was changed by 3.
  5. return the re-read value.

Now that's just one problem. The other problem is that writing a value
as one type and reading it as another, even if required to be defined
in terms of bits or whatever, is going to be entirely nonportable
nonetheless. The language standard cannot define it completely to the
point that you can rely on the value being the same when the program
is ported.

At best the standard could say that it's implementation-defined
behavior to read through a differently-typed union member.
Implementation-defined is basically "almost-undefined, except the
situation must be documented by the implementor and cannot blow up".

If certain behavior of unions is valuable to the users of a compiler,
they can always negotiate that with their compiler vendor; the
standard doesn't have to be involved in everything that is defined
between the implementor and programmer.

-- 
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
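
Addendum: a minimal sketch of the struct s / fun example from the post,
arranged so that it compiles and runs without relying on aliasing. The
names float_bits and run_demo are hypothetical helpers introduced only
for illustration; they are not from the original post. The sketch
assumes a 32-bit float, and the bit pattern it checks for 1.0f assumes
IEEE 754 single precision, which the language standard does not
guarantee. The memcpy approach is one commonly used way to look at the
bits of an object of another type, not something the post itself
proposes.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct s {
    float *pflo;
    int ival;
};

/* The function from the post. A compiler that assumes int and float
   objects never alias may return the incremented value without
   re-reading ptr->ival after the store through ptr->pflo. */
int fun(struct s *ptr)
{
    ptr->ival++;
    *ptr->pflo = 0;
    return ptr->ival;
}

/* Hypothetical helper: copy the bytes of a float into an integer of
   the same size. memcpy works on the object representation, so no
   float object is ever accessed through an int lvalue. */
uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);   /* assumption: float is 32 bits */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical driver: exercises both functions and returns 0. */
int run_demo(void)
{
    float target = 1.5f;
    struct s obj = { &target, 41 };

    /* pflo points at a genuine float object, so nothing is aliased
       and the short and the long instruction sequences agree. */
    int result = fun(&obj);
    assert(result == 42);
    assert(target == 0.0f);

    /* Deliberately NOT done here:
           obj.pflo = (float *) &obj.ival;
           fun(&obj);
       That is the aliasing case from the post; its behavior is
       undefined, so the result would depend on the compiler. */

    /* Assumption: IEEE 754 single precision encodes 1.0f this way. */
    assert(float_bits(1.0f) == UINT32_C(0x3F800000));
    return 0;
}
```

Calling run_demo() from main is enough to exercise the sketch. The
value float_bits returns is exactly the kind of nonportable result the
post describes: it is well defined on a given implementation but may
differ when the program is ported.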