From: gah4
Newsgroups: comp.compilers
Subject: Re: Undefined Behavior Optimizations in C
Date: Wed, 11 Jan 2023 16:09:32 -0800 (PST)
Organization: Compilers Central
Message-ID: <23-01-043@comp.compilers>
References: <23-01-009@comp.compilers> <23-01-011@comp.compilers> <23-01-012@comp.compilers> <23-01-017@comp.compilers> <23-01-027@comp.compilers> <23-01-032@comp.compilers> <23-01-035@comp.compilers> <23-01-042@comp.compilers>
In-Reply-To: <23-01-042@comp.compilers>
Keywords: optimize, history
Posted-Date: 11 Jan 2023 19:21:25 EST

On Wednesday, January 11, 2023 at 3:13:08 PM UTC-8, David Brown wrote:

(snip)

> > Much also assumes ASCII code. Don't tell IBM about that.

> There is usually no requirement for a given piece of C code to be fully
> portable to all conforming compilers. Most C code is quite limited in
> the scope of its use, and it is quite reasonable to assume two's
> complement representation (though /not/ two's complement wrapping on
> overflow), 8-bit char, ASCII characters, etc. It may be fine to assume
> 32-bit int, and perhaps little-endian ordering. These are /warranted/
> assumptions, not unwarranted ones - they are typically reasonable to
> make, and if you want to you can sometimes put compile-time checks so
> that if someone does try to use it out of context, they get an error
> message.

As I noted with the Fortran optimization example from 50 years ago, this is not a new problem.
But IBM documented it (so, as you note, it is an extension and not UB), and programmers learned to expect it. And yes, I ignored the distinction between implementation-defined and undefined behavior. But that is exactly what happens when actual people (as opposed to other programs) write programs. People learn what works and what doesn't, and when they need to worry about endianness, ASCII, or two's complement.

Much of the implementation-defined, or undefined, behavior of early Fortran compilers later made it into the standard. And much didn't, but enough people believe it did.

If compiler writers remember that they are writing for use by actual people, who are sometimes imperfect, then it should be fine. Try to make the changes not too surprising to users. I don't remember an exact example of surprising UB, but they are out there.

In any case, I am all for the original project of studying compilers, UB, and optimization. If an optimization doesn't really help speed up programs, but does slow down debugging, maybe it isn't worth doing.
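
For what it's worth, the compile-time checks mentioned in the quoted text might look something like the following. This is only a rough sketch, assuming a C11 compiler (for _Static_assert). The function names is_little_endian and require_little_endian are made up for illustration, and the byte-order test is done at run time because standard C has no portable compile-time test for it.

```c
#include <assert.h>
#include <limits.h>

/* Assumptions stated in the quoted text, checked at compile time.
   Building on a platform that violates one fails with the message. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit char");
_Static_assert(INT_MAX == 2147483647, "this code assumes 32-bit int");
_Static_assert(INT_MIN == -INT_MAX - 1,
               "this code assumes two's complement int");
_Static_assert('A' == 65 && 'a' == 97 && '0' == 48,
               "this code assumes ASCII characters");

/* Hypothetical helper: returns 1 if the least significant byte of an
   unsigned int is stored first, 0 otherwise. Reading an object through
   an unsigned char pointer is well defined in C. */
static int is_little_endian(void)
{
    const unsigned int probe = 1u;
    return *(const unsigned char *)&probe == 1;
}

/* Hypothetical helper: stops a debug build early on a big-endian
   machine. Compiled out when NDEBUG is defined. */
static void require_little_endian(void)
{
    assert(is_little_endian() && "this code assumes little-endian order");
}
```

A program that depends on byte order could call require_little_endian() once at startup. Note that none of this checks for wrapping on signed overflow, which stays undefined even on two's complement machines.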