From: Bart
Newsgroups: comp.compilers
Subject: Re: Optimization techniques and undefined behavior
Date: Mon, 29 Apr 2019 00:31:40 +0100
Organization: virginmedia.com
Message-ID: <19-04-039@comp.compilers>
References: <72d208c9-169f-155c-5e73-9ca74f78e390@gkc.org.uk> <19-04-021@comp.compilers> <19-04-023@comp.compilers> <19-04-037@comp.compilers>
Keywords: optimize, standards, errors

On 28/04/2019 22:49, David Brown wrote:
> On 26/04/2019 02:18, Kaz Kylheku wrote:
>> Problem is, all these propositions are not safe; they are based on
>> "the program has no mistake".
>
> /All/ programming is based on the principle of "the program has no
> mistake".  It is absurd to single out something like this as though it
> is a special case.
>
> If I want to double a number, and write "x * 3" by mistake, it is a bug.

Yes, a bug that will probably give the wrong answer. But I want a
predictably wrong answer - see below.

> If I do arithmetic on signed integers, and they overflow, it is a bug.
> The compiler is allowed to assume that when I write "x * 3", I mean
> what the C language means - multiply x by 3.
> It is also allowed to
> assume that when I write some signed arithmetic, I mean what the C
> language means - give me the correct results when I give valid inputs,
> otherwise it is "garbage in, garbage out".
>
> Ask yourself, when would "x * 2 / 2" /not/ be equal to x, given two's
> complement wrapping overflows?  It would happen when "x * 2" overflows.
> For example (assuming 32-bit ints), take x = 1,500,000,000.  When you
> write "x * 2 / 2" with an optimising C compiler, the result is most
> likely 1,500,000,000 (but there are no guarantees).

If you write x=(x*2)/2 expecting a result consistent with:

     int y=(x*2);
     x=y/2;

then you don't want the compiler being clever about overflow. If
overflow has occurred, then you want to know about it, by getting a
result consistent with two's complement integer overflow. You DON'T want
the compiler giving you what looks like the right answer and brushing
the overflow under the carpet.

You also want a result that is portable across compilers, but I've just
tried 20 or so combinations of compilers and optimise flags: all give a
result of -647483648, except gcc, which gave 1500000000. And even gcc
gave -647483648 with some versions and some optimisation levels.

-------------------------
#include <stdio.h>

int main(void) {
     int x=1500000000;
     x=(x*2)/2;
     printf("%d\n",x);
}
-------------------------

> When you use a
> "two's complement signed overflow" compiler, you get -647,483,648.  Tell
> me, in what world is that a correct answer?

It's not a correct answer when you are trying to do pure arithmetic. It
CAN be correct when doing it via a computer ALU that uses a 32-bit two's
complement binary representation. It is certainly what you might expect
on such hardware.

> Why do you think a guaranteed wrong and meaningless answer is
> better than undefined behaviour?

Is it really meaningless? Try the above using x=x*2. It will still
overflow and produce a result of -1294967296, apparently incorrect.
But print the same bit-pattern using "%u" format, and you get
3000000000 - the right answer.

You can predict what's going to happen, /if/ you can predict what a
compiler is going to do. Unfortunately with ones like gcc, you can't.

> Ask yourself, when would "x + 1 > x" not be true?  When x is INT_MAX and
> you have wrapping, x + 1 is INT_MIN.  Ask yourself, is that the clearest
> and best way to check for that condition - rather than writing "x ==
> INT_MAX" ?  When does it make sense to take a large positive integer,
> add 1, and get a large /negative/ integer?

You get funny things happening with unsigned integers too; try:

-------------------
#include <stdio.h>

int main(void) {
     unsigned int x=1500000000;
     x=(x*4)/4;
     printf("%u\n",x);
}
-------------------

This displays 426258176 rather than 1500000000. So why is a 'wrong and
meaningless answer' OK in this case, just because C deems it to be
defined behaviour?

This marginalisation of signed integer overflow is outdated, now that
every relevant machine is going to behave the same way.