From: David Brown
Newsgroups: comp.compilers
Subject: Re: Bounds checking, Optimization techniques and undefined behavior
Date: Tue, 7 May 2019 15:05:23 +0200
Organization: A noiseless patient Spider
Message-ID: <19-05-047@comp.compilers>
References: <19-04-021@comp.compilers> <19-04-023@comp.compilers> <19-04-037@comp.compilers> <19-04-039@comp.compilers> <19-04-042@comp.compilers> <19-04-044@comp.compilers> <19-04-047@comp.compilers> <19-05-004@comp.compilers> <19-05-006@comp.compilers> <19-05-016@comp.compilers> <19-05-020@comp.compilers> <19-05-024@comp.compilers> <19-05-025@comp.compilers> <19-05-028@comp.compilers> <19-05-032@comp.compilers> <19-05-039@comp.compilers>
Keywords: C, errors
Posted-Date: 07 May 2019 18:40:08 EDT

On 06/05/2019 15:40, Andy Walker wrote:
> On 06/05/2019 01:15, our esteemed moderator wrote:
>> [In the struct { int a,b,c,d; } S example it is my understanding that
>> &S and &S.a have to be the same,
>
> Not "the same", as they have different types; but yes, they must
> compare equal.

No they don't - not in C. They are incompatible pointers, and comparing
them is a constraint violation. That means a compiler has to complain
about trying to evaluate "&S == &S.a".
[My draft of C11 says in section 6.7.2.1: "A pointer to a structure
object, suitably converted, points to its initial member (or if that
member is a bit-field, then to the unit in which it resides), and vice
versa. There may be unnamed padding within a structure object, but not
at its beginning." -John]

It also means that even though the values of the pointers may be the
same, using one to access the data of the other is not valid - and the
compiler can assume it does not happen.

If you don't like that behaviour, then maybe standard C is not the
language you want. There are plenty of C compilers that will let you
mix and match pointers and accesses despite incompatible types (you
still need appropriate casts) - such as gcc with the
"-fno-strict-aliasing" flag. But you have to actively choose the
non-standard C.

>> the four ints have to be in the order declared,
>
> Yes, and they must be "sequentially allocated". Whether that means
> the same as "contiguously allocated" [like array members] is not entirely
> clear, as the C Standard doesn't define these terms. Structures can
> contain padding, in general, but [AFAICT] not in a way that affects
> this debate.

There are no restrictions on the padding that can be added between
fields in a struct. A conforming compiler /can/ add extra padding, and
it can be inconsistent between different parts. So for the four-int
struct here, "a" could be at offset 0, "b" at offset 4 (assuming 4-byte
int), "c" at offset 12, and "d" at offset 60. Clearly such an
arrangement would be highly inefficient, and it would take a
particularly perverse compiler writer to do anything other than the
obvious 4 ints in a row. But the only requirements of C are that the
first field is at offset 0, and subsequent fields are at increasing
offsets.
A conceivable case is that a compiler could add padding in a struct
beyond the requirements of alignment if cache line alignment made the
results faster. Such padding can't be added in arrays.

> At least it seems to be the case that &S.a+1 must compare equal to &S.b,
> see N1570, section 6.5.9 para 6, and footnote 109; but I can't quite make
> even this case watertight.

It is 6.5.9p7 that makes this comparison legal. The wording here is a
bit odd (no one claims the C standards are always clear), but it means
that S.a and S.b can each be viewed as a single-element array of 1 int.
And thus &S.a + 1 is a pointer to just beyond the end of S.a, and will
compare equal to &S.b if there is no padding. (There /could/ be
padding, but it is quite unlikely.)

And even if "&S.a + 1 == &S.b" is true, and "&S.b + 1 == &S.c" is true,
evaluating "&S.a + 2 == &S.c" is undefined behaviour, because
"&S.a + 2" points more than one element past the end of the
single-element array that S.a is treated as.

You might not agree that these rules are good in a language, but these
are the rules of C.