Path: csiph.com!xmission!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: David Brown <david.brown@hesbynett.no>
Newsgroups: comp.compilers
Subject: Re: Optimization techniques
Date: Tue, 23 Apr 2019 09:38:42 +0200
Organization: A noiseless patient Spider
Lines: 86
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <19-04-017@comp.compilers>
References: <19-04-004@comp.compilers> <19-04-009@comp.compilers> <19-04-010@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="72782"; mail-complaints-to="abuse@iecc.com"
Keywords: optimize
Posted-Date: 24 Apr 2019 10:19:37 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Content-Language: en-GB
Xref: csiph.com comp.compilers:2201

On 19/04/2019 17:48, Rick C. Hodgin wrote:
> On 4/19/2019 4:49 AM, Kaz Kylheku wrote:
>> If you can make your language "C/C++ like" without all the undefined
>> behavior looming at every corner (i.e. not actually "C/C++ like" at all
>> in a significant regard), then you've dodged what is probably the
>> number one
>> optimization pitfall.
>
> I think UB is unavoidable in any C/C++ language.  It is too low-level
> for speed purposes to be bogged down with things that would prevent UB.
> Even something simple like pointer use after free.  It can be difficult
> to catch statically.
>
>> For instance, don't have it so that the order of evaluation of
>> function or operator arguments is unspecified. If you allow side-effects
>> in expressions, specify when those effects take place in relation to
>> everything else in the expression.
>>
>> If you have clear ordering rules, then you honor them when optimizing:
>> you rearrange the user's calculation order only when it can't possibly
>> make a difference to the result that is required by the defined
>> order.
>
> I do have very clear ordering rules.  I'm actually surprised to learn
> that other systems do not.  I would've thought it would be absolutely
> essential.
>

No, it is not.  In C, there is the concept of "sequence points".  This
has got more complicated in C11 with the handling of multiple threads,
but in C99 (and single-threaded C11) these are points which define the
order of operations.  (Just the logical order - the actual generated
code order can vary, according to the "as if" rule.)

But between sequence points, operations can be carried out in any order
that suits the compiler.  Given "x = A + B;", the expressions "A" and
"B" can be evaluated in any order.

If you have this code:

#include <stdio.h>

int x = 0;
int nextX(void) {
	x++;
	return x;
}

void foo(void) {
	/* The three calls to nextX() may be evaluated in any order. */
	printf("%i, %i, %i\n", nextX(), nextX(), nextX());
}

then the compiler can give "1, 2, 3", or "3, 2, 1", or (less likely, but
legal) other orders such as "1, 3, 2".

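(C++17 has tightened some of these rules.  Function arguments are now
indeterminately sequenced, so their evaluations cannot interleave, but
the order among them is still unspecified.  It did fix the order for
some operators, such as "<<" and assignment.  None of this applies to C.)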

Of course, a language following the same basic principles as C could
simply say that the commas between arguments in a function call act as
sequence points - then the evaluation order of the arguments would be
fixed.


>> Reordering arithmetic calculations has pitfalls. There are n! orders
>> for adding together n numbers. Under floating-point, these all
>> potentially return different results even if nothing overflows.
>> You can't blindly rely on arithmetic identities to hold.
>
> I think people who use floating point recognize it is not an exact
> numbering system.

You'd be surprised - many who take floating point seriously are looking
for absolutely repeatable results, consistent across different systems.
There is a difference between knowing the results are not perfect
matches for mathematical "real" numbers, and wanting the floating point
calculations to follow the IEEE rules precisely.  People do take care to
write "a + (b + c)" or "(a + b) + c" because they know that one version
gives more precision in their results.  Compilers can't (usually) get
that same knowledge - they have to trust the programmer.

(Or you can have a "fast fp mode" which allows freer optimisation and
re-arrangements.)

>
> I have introduced native support for arbitrary precision integers
> (bi) and floating point (bfp) to address this, but even then it's
> still limited precision and not exact.