From: Bart <bc@freeuk.com>
Newsgroups: comp.compilers
Subject: Re: Optimization techniques and undefined behavior
Date: Fri, 3 May 2019 00:48:28 +0100
Organization: virginmedia.com
Approved: comp.compilers@iecc.com
Message-ID: <19-05-016@comp.compilers>
References: <19-04-021@comp.compilers> <19-04-023@comp.compilers> <19-04-037@comp.compilers> <19-04-039@comp.compilers> <19-04-042@comp.compilers> <19-04-044@comp.compilers> <19-04-047@comp.compilers> <19-05-004@comp.compilers> <19-05-006@comp.compilers>
Keywords: arithmetic, errors, design, comment
Posted-Date: 02 May 2019 21:39:25 EDT

On 02/05/2019 11:29, Andy Walker wrote:
> On 01/05/2019 13:53, Bart wrote:
>> If you have two unknown values A and B, and need to multiply, you won't
>> know if the result will overflow.
>
>     int A := ..., B := ...;
>     int C := ( A = 0 | 0 |: abs B <= maxint % abs A | A*B | error (...); 0 );

That sounds unmanageable (think about applying such checks to
a*a+b*(c*d), and how you would proceed after the error if not aborting),
and inefficient too if one multiply turns into half a dozen operations
including a divide.

If such overflow checks need to be routinely done, then it really needs
language support. Otherwise it's probably best handled by a guard
function similar to what someone posted:

     if multiply_overflow(a,b) then
          print "Overflow"
     else
          c := a * b        # requires that a and b haven't changed
     end

This at least isolates the actual expression we want and leaves it readable.
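
For illustration, a minimal sketch in C of what such a guard could look
like. The name multiply_overflow comes from the snippet above; the
64-bit signed operand type and the division-based test are assumptions
made for the example, not a description of any particular language's
implementation. It never performs the multiply itself, so it stays
within defined behaviour.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical guard: returns true if a * b would not fit in int64_t.
   Each branch compares against a quotient, so no overflowing
   multiply (and no INT64_MIN / -1) is ever evaluated. */
bool multiply_overflow(int64_t a, int64_t b)
{
    if (a == 0 || b == 0)
        return false;                  /* product is 0 */
    if (a > 0) {
        if (b > 0)
            return a > INT64_MAX / b;  /* positive product too large */
        return b < INT64_MIN / a;      /* negative product too small */
    }
    if (b > 0)
        return a < INT64_MIN / b;      /* negative product too small */
    return b < INT64_MAX / a;          /* both negative, product positive */
}
```

The caller then reads much like the pseudocode: test
multiply_overflow(a, b) first, and only compute a * b in the else
branch. GCC and Clang also offer __builtin_mul_overflow, which does the
check and the multiply in one step and avoids the divide.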

>      Of course, in the old days, compilers used to build in range
> checks on array indices and overflow checks on all arithmetic.

Did they? Certainly not any of mine (not really practical on tiny 8-bit
computers on which both the compiler and the program being developed had
to run).

> A few
> still do, esp interpreters, but the God of Speed dictates that most
> languages in most circumstances don't.  We see the results in the huge
> amount of malware that exploits that failure.


I don't know how much that would help. And I think that if a program can
go seriously wrong through unchecked input, then that's a failure in
proper validation. It's rather sloppy to rely on a runtime check put
there by a compiler.

During development, yes, but perhaps not after a program is supposed to
be working.

(My interpreters (attached to apps) did do a number of runtime checks,
but one difference there is that users could write their own programs.
So that constitutes user input and therefore can't be fully trusted.)

>> In the example posed, you have the additional problem that the input can
>> be this:
>>     P5
>>     389000000000000000000000000000 9200000000000000000000000000
>> with both dimensions exceeding int64.
>
>      My own favourite language will throw an "on value error" exception
> if you try to read those values [or any other unsuitable strings] into an
> integer variable.  By default, that will terminate your program with
> suitable error messages/diagnostics, but you can substitute your own
> "on value error" procedure if you want to print a "Don't be daft, please
> type sensible values" message and try again.

My favourite language would actually return those numbers as a type that
doesn't have any meaningful 'intmax' value:

     sreadln(
      "389000000000000000000000000000 9200000000000000000000000000")
     read a, b
     println a, a.type
     println b, b.type

Output is:

     389000000000000000000000000000 <longint>
     9200000000000000000000000000 <longint>

So it gets past that hurdle, but then might be obliged to try creating
an image with 3578800000000000000000000000000000000000000000000000000000
pixels.
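
With fixed-width integers, the validation argued for earlier might be
sketched in C as below. The helper names read_dimension and
read_image_size, and the max_pixels limit, are hypothetical and chosen
for the example. The sketch rejects dimensions that do not fit in a
long long, and then rejects a pixel count above the caller's limit
without ever forming the product.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse one positive decimal dimension from *s and
   advance *s past it. Fails on non-numeric text, on values <= 0, and on
   values outside the range of long long (strtoll sets errno to ERANGE). */
static bool read_dimension(const char **s, long long *out)
{
    char *end;
    errno = 0;
    long long v = strtoll(*s, &end, 10);
    if (end == *s || errno == ERANGE || v <= 0)
        return false;
    *out = v;
    *s = end;
    return true;
}

/* Hypothetical helper: true only if both dimensions parse and
   width * height <= max_pixels. The comparison uses a quotient, so the
   product itself is never computed. Trailing text is not checked. */
bool read_image_size(const char *line, long long max_pixels,
                     long long *w, long long *h)
{
    if (!read_dimension(&line, w) || !read_dimension(&line, h))
        return false;
    return *w <= max_pixels / *h;
}
```

Given the two 30-ish digit dimensions from the example, this returns
false at the first strtoll call, so the program never gets as far as
trying to allocate the image.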

(I'm in the process of incorporating such a 'bignum' type into my main
language. It's handy for UI code where performance doesn't matter.)

[There have been plenty of compilers that did bounds checking.  Back in
the 1960s and 1970s the WATFOR Fortran compilers, originally for the
7040 and later for the IBM 360/370 and the PDP-11, did bounds checking
and also checked for uninitialized variables.

IBM had two PL/I compilers, the checkout compiler that generated
interpreted code with extensive runtime checks and the optimizing
compiler that generated fast machine code.  It was possible if painful
to compile part of your program with one and part with the other and
link the code together.  Oh, and each compiler ran in 44K bytes of
RAM.  Take that, 8-bit micros. -John]