From: Bart <bc@freeuk.com>
Newsgroups: comp.compilers
Subject: Re: Optimization techniques and undefined behavior
Date: Mon, 29 Apr 2019 00:31:40 +0100
Organization: virginmedia.com
Lines: 107
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <19-04-039@comp.compilers>
References: <72d208c9-169f-155c-5e73-9ca74f78e390@gkc.org.uk> <19-04-021@comp.compilers> <19-04-023@comp.compilers> <19-04-037@comp.compilers>
Keywords: optimize, standards, errors
Posted-Date: 28 Apr 2019 22:44:43 EDT
In-Reply-To: <19-04-037@comp.compilers>
Content-Language: en-GB
Xref: csiph.com comp.compilers:2223

On 28/04/2019 22:49, David Brown wrote:
> On 26/04/2019 02:18, Kaz Kylheku wrote:

>> Problem is, all these propositions are not safe; they are based on
>> "the program has no mistake".
>
>
> /All/ programming is based on the principle of "the program has no
> mistake".  It is absurd to single out something like this as though it
> is a special case.
>
> If I want to double a number, and write "x * 3" by mistake, it is a bug.

Yes, a bug that will probably give the wrong answer. But I want a
predictably wrong answer - see below.


>   If I do arithmetic on signed integers, and they overflow, it is a bug.
>   The compiler is allowed to assume that when I write "x * 3", I mean
> what the C language means - multiply x by 3.  It is also allowed to
> assume that when I write some signed arithmetic, I mean what the C
> language means - give me the correct results when I give valid inputs,
> otherwise it is "garbage in, garbage out".
>
> Ask yourself, when would "x * 2 / 2" /not/ be equal to x, given two's
> complement wrapping overflows?  It would happen when "x * 2" overflows.
>   For example (assuming 32-bit ints), take x = 1,500,000,000.  When you
> write "x * 2 / 2" with an optimising C compiler, the result is most
> likely 1,500,000,000 (but there are no guarantees).

If you write x=(x*2)/2 expecting a result consistent with:

    int y=(x*2); x=y/2;

then you don't want the compiler being clever about overflow. If
overflow has occurred, you want to know about it: the result should be
the one that two's complement integer overflow would give.

You DON'T want the compiler giving you what looks like the right answer
and brushing the overflow under the carpet.

You also want a result that is portable across compilers, but I've just
tried 20 or so combinations of compilers and optimisation flags, and all
give a result of -647483648, except gcc, which gave 1500000000. And even
gcc gave -647483648 with some versions and some optimisation levels.

-------------------------
#include <stdio.h>

int main(void) {
   int x=1500000000;

   x=(x*2)/2;

   printf("%d\n",x);

}
-------------------------
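
For anyone who wants the wrapped result from every compiler regardless of
optimisation level, one possible approach is to do the multiply in
unsigned arithmetic, where C defines wraparound modulo 2^32, and convert
back afterwards. This is only a sketch: the helper names wrap_mul and
double_then_halve are made up for illustration, and the conversion back to
a signed type is implementation-defined (mainstream compilers wrap). gcc
also has a -fwrapv option that asks for wrapping signed overflow directly.

```c
#include <stdint.h>

/* Hypothetical helper: multiply with 32-bit two's complement wraparound.
   The product is formed in unsigned arithmetic, which C defines as
   modulo 2^32, so no signed overflow ever takes place. */
static int32_t wrap_mul(int32_t a, int32_t b)
{
    uint32_t p = (uint32_t)a * (uint32_t)b;

    /* Converting an out-of-range value to int32_t is
       implementation-defined; assumed here to wrap. */
    return (int32_t)p;
}

/* Hypothetical helper: the (x*2)/2 from the listing above, with the
   multiply forced to wrap. The division itself cannot overflow. */
static int32_t double_then_halve(int32_t x)
{
    return wrap_mul(x, 2) / 2;
}
```

Under those assumptions, double_then_halve(1500000000) gives -647483648,
the same figure most of the compilers above produced, while small inputs
come back unchanged.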


>  When you use a
> "two's complement signed overflow" compiler, you get -647,483,648.  Tell
> me, in what world is that a correct answer?

It's not a correct answer when you are trying to do pure arithmetic. It
CAN be correct when the arithmetic is done by a computer ALU that uses a
32-bit two's complement binary representation.

It is certainly what you might expect on such hardware.

> Why do you think a guaranteed wrong and meaningless answer is
> better than undefined behaviour?

Is it really meaningless? Try the above using x=x*2. It will still
overflow and produce a result of -1294967296, apparently incorrect. But
print the same bit-pattern using "%u" format, and you get 3000000000 -
the right answer. You can predict what's going to happen, /if/ you can
predict what the compiler is going to do. Unfortunately, with compilers
like gcc, you can't.
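
The "same bit-pattern, two readings" point can be shown without relying on
signed overflow at all. In this sketch (the function names are invented
for the example) the doubling is done in unsigned arithmetic, and memcpy
copies the 32 bits into an int32_t, which the C standard requires to be
two's complement with no padding bits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: double x modulo 2^32 and return the raw bits. */
static uint32_t doubled_bits(int32_t x)
{
    return (uint32_t)x * 2u;
}

/* Hypothetical helper: read the same 32 bits as a signed value.
   memcpy copies the object representation unchanged. */
static int32_t bits_as_signed(uint32_t bits)
{
    int32_t s;
    memcpy(&s, &bits, sizeof s);
    return s;
}
```

For x = 1500000000 the bits read as 3000000000 when unsigned and as
-1294967296 when signed: one pattern, two interpretations.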

> Ask yourself, when would "x + 1 > x" not be true?  When x is INT_MAX and
> you have wrapping, x + 1 is INT_MIN.  Ask yourself, is that the clearest
> and best way to check for that condition - rather than writing "x ==
> INT_MAX" ?  When does it make sense to take a large positive integer,
> add 1, and get a large /negative/ integer?

You get funny things happening with unsigned integers too; try:

-------------------
   #include <stdio.h>
   int main(void) {
     unsigned int x=1500000000;

     x=(x*4)/4;

     printf("%u\n",x);
   }
-------------------

This displays 426258176 rather than 1500000000.
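
The arithmetic behind that figure, spelled out as a sketch: the exact
product is 6000000000, which reduced modulo 2^32 is 1705032704, and a
quarter of that is 426258176. The function name is invented for the
example, and the 64-bit intermediate is there only to make the reduction
visible; plain (x*4)/4 on a 32-bit unsigned gives the same value.

```c
#include <stdint.h>

/* Hypothetical helper: (x*4)/4 in 32-bit unsigned arithmetic, with the
   modulo-2^32 reduction written out explicitly. */
static uint32_t quad_then_quarter(uint32_t x)
{
    uint64_t exact = (uint64_t)x * 4;      /* 6000000000 for x = 1500000000 */
    uint32_t wrapped =
        (uint32_t)(exact % 4294967296ULL); /* 1705032704 after reduction */

    return wrapped / 4;                    /* 426258176 */
}
```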

So why is a 'wrong and meaningless answer' OK, in this case, just
because C deems it to be defined behaviour?

This marginalisation of signed integer overflow is outdated, now that
every relevant machine is going to behave the same way.