Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #2205 > unrolled thread

Re: Optimization techniques

Started byDavid Brown <david.brown@hesbynett.no>
First post2019-04-25 21:58 +0200
Last post2019-05-07 16:38 +0200
Articles 20 on this page of 87 — 14 participants

Back to article view | Back to comp.compilers

This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: Optimization techniques David Brown <david.brown@hesbynett.no> - 2019-04-25 21:58 +0200
    Re: Optimization techniques Kaz Kylheku <847-115-0292@kylheku.com> - 2019-04-26 00:18 +0000
      Re: Optimization techniques David Brown <david.brown@hesbynett.no> - 2019-04-28 23:49 +0200
        Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-04-29 00:31 +0100
          Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-29 17:08 +0200
            Re: Optimization techniques and undefined behavior Christian Gollwitzer <auriocus@gmx.de> - 2019-04-29 18:10 +0200
              Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-30 14:46 +0200
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-01 13:53 +0100
                  Re: Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-02 11:29 +0100
                    Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 00:48 +0100
                      Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-03 10:52 +0100
                        Re: Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-04 17:44 -0400
                          Re: Bounds checking, Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-05 17:10 -0400
                        Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-06 08:14 +0200
                        Re: Optimization techniques and undefined behavior Gene Wirchenko <genew@telus.net> - 2019-05-11 22:25 -0700
                      Re: not a lot of memory, was Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 12:45 +0100
                      Re: Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-03 13:29 +0100
                        Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 23:10 +0100
                          Re: Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-04 10:45 +0100
                            Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-05 11:14 +0100
                              Re: Bounds checking, Optimization techniques and undefined behavior Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-05 20:44 +0200
                                Re: Bounds checking, Optimization techniques and undefined behavior Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-06 10:15 +0200
                                  Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 11:04 +0200
                                    Re: Bounds checking, Optimization techniques and undefined behavior "Nuno Lopes" <nuno.lopes@ist.utl.pt> - 2019-05-07 22:38 +0100
                                    Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-08 01:14 +0100
                                      Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 09:55 +0200
                                        Re: Bounds checking, Optimization techniques and undefined behavior "Derek M. Jones" <derek@_NOSPAM_knosof.co.uk> - 2019-05-08 19:08 +0100
                                    Re: Bounds checking, Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-08 01:42 +0100
                                      Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 10:16 +0200
                                        Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-09 01:15 +0100
                                          Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-09 21:56 +0200
                                    Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 10:03 +0200
                                      Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-09 09:19 +0200
                                      Re: Bounds checking, Optimization techniques and undefined behavior Kaz Kylheku <847-115-0292@kylheku.com> - 2019-05-10 03:38 +0000
                                    Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-08 14:37 +0100
                                Re: Bounds checking, Optimization techniques and undefined behavior Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2019-05-06 05:05 -0400
                              Re: Bounds checking, Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-05 17:38 -0400
                                Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-06 13:07 +0100
                                  Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 14:01 +0200
                              Re: Bounds checking, Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-06 01:15 +0100
                                Re: Bounds checking, Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-06 14:40 +0100
                                  Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 15:05 +0200
                                    Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 10:18 +0200
                              Re: Bounds checking, Optimization techniques and undefined behavior Jan Ziak <0xe2.0x9a.0x9b@gmail.com> - 2019-05-06 05:39 -0700
                                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 15:42 +0200
                              Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-06 16:32 +0200
                          Re: Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-04 17:59 -0400
                  Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-02 16:51 +0200
                    Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-02 20:04 +0100
                      Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-03 17:23 +0200
                        Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 21:10 +0100
                        Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-06 13:25 +0100
                          Re: Optimization techniques and undefined behavior "Derek M. Jones" <derek@_NOSPAM_knosof.co.uk> - 2019-05-06 16:32 +0100
                          Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:03 +0200
                            Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-08 13:16 +0100
                            Re: Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-08 15:13 -0400
                        Re: Optimization techniques and undefined behavior "Robin Vowels" <robin51@dodo.com.au> - 2019-05-07 01:22 +1000
                          Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:05 +0200
                    Re: Optimization techniques and undefined behavior Christian Gollwitzer <auriocus@gmx.de> - 2019-05-02 22:22 +0200
            Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-04-29 18:15 +0100
              Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-30 15:48 +0200
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-01 12:40 +0100
                  Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-02 17:27 +0200
                    Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-02 18:59 +0100
                      Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:16 +0200
                Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-02 14:54 +0100
        Re: Optimization techniques and runtime checks Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-04-29 22:36 +0200
          Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-07 16:29 +0200
            Re: Optimization techniques and runtime checks Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-08 02:27 +0200
              Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-08 10:31 +0200
                Re: Optimization techniques and runtime checks Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-08 22:50 +0200
                Re: Optimization techniques and runtime checks "Robin Vowels" <robin51@dodo.com.au> - 2019-05-11 19:26 +1000
                Re: Optimization techniques and runtime checks Gene Wirchenko <genew@telus.net> - 2019-05-11 22:43 -0700
                  Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-12 20:17 +0200
            Re: Optimization techniques and runtime checks Bart <bc@freeuk.com> - 2019-05-08 14:58 +0100
              Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-08 23:02 +0200
                Re: Optimization techniques and runtime checks Bart <bc@freeuk.com> - 2019-05-09 18:28 +0100
                  Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-09 22:07 +0200
        Re: Optimization techniques Gene Wirchenko <genew@telus.net> - 2019-04-30 18:24 -0700
          Re: Optimization techniques David Brown <david.brown@hesbynett.no> - 2019-05-01 09:20 +0200
            Re: Optimization techniques Kaz Kylheku <847-115-0292@kylheku.com> - 2019-05-02 17:40 +0000
            Re: Optimization techniques and error detection Gene Wirchenko <genew@telus.net> - 2019-05-03 10:16 -0700
            Re: Optimization techniques "Robin Vowels" <robin51@dodo.com.au> - 2019-05-07 01:42 +1000
    Re: Optimization techniques Kaz Kylheku <847-115-0292@kylheku.com> - 2019-04-26 02:26 +0000
      Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-29 00:12 +0200
        Re: Optimization techniques and undefined behavior Kaz Kylheku <847-115-0292@kylheku.com> - 2019-05-02 17:18 +0000
          Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:38 +0200

Page 4 of 5 — ← Prev page 1 2 3 [4] 5  Next page →


#2233 — Re: Optimization techniques and undefined behavior

FromDavid Brown <david.brown@hesbynett.no>
Date2019-04-30 15:48 +0200
SubjectRe: Optimization techniques and undefined behavior
Message-ID<19-04-049@comp.compilers>
In reply to#2229
On 29/04/2019 19:15, Bart wrote:
> On 29/04/2019 16:08, David Brown wrote:
>> On 29/04/2019 01:31, Bart wrote:
>
>>> then you don't want the compiler being clever about overflow.
>>
>> I /do/ want a result consistent with a single expression, or splitting
>> up the expression.
>
> Then the choice is between both ways giving you 1500000000, or both
> giving you -647483648.

Let me repeat - I do not care what the results are here.  I don't care
if they are consistent with each other.  I don't care if they change
between runs of the compiler.  I don't care if the result is a pink
umbrella.

I care what I get from "x * 2 / 2" when I give a sensible, valid value
of x.  The range of valid values depends on the type of x, of course.

> The former is going to be difficult, since the intermediate 32-bit value
> has lost some information. The latter is very easy, and involves dumping
> the UB nonsense.


Giving "don't care" results is even easier.

>> Questions about what the compiler will do with overflows, like how
>> consistent it will be, are as sensible as asking how many miles per
>> gallon you get from your car when it has no tires.  You would not drive
>> your car without tires - that would be a mistake, a bug in your driving
>> procedure.  I don't write signed integer expressions that overflow -
>> barring bugs in my coding.  And thus I don't care what the compiler does
>> about them, and have no interest in their consistency.
>
> If the gcc people designed cars, either the car wouldn't have an engine
> because, since you're always going to end up at your start point,
> there's no point in driving it; or it wouldn't have any brakes since you
> are never going to have an accident.

No, if the gcc people designed cars they would work according to
specification.  So if they designed a car for a maximum of 5 people,
they would only include 5 seat belts.  Anyone exceeding those
specifications does so on their own responsibility - the car
manufacturer says nothing about what will happen if you try to squeeze
in a sixth person.  Typical results will be "best effort", but it could
also mean the car won't move, or that it will be more dangerous in an
accident.

But if the "signed integers should wrap" crowd built a car for five
people, it would come with a guarantee that squeezing in an extra person
would leave the car with 6 people-shaped holes.  But hey, it's
guaranteed behaviour, and that's clearly more important than common sense.

>
>> I want the compiler to give me the right answer to valid questions - I
>> don't expect it to give me any consistent answer to invalid questions.
>
> What is the question? Hint: it's not the result of 1500000000*2/2, it's
> the result of 1500000000*2/2 when the 1500000000 is represented as a
> 32-bit twos complement binary value, and intermediate calculations are
> done to the same precision.
>

That is not the question - not in C, nor in normal mathematics, nor
common simple arithmetic, nor in modelling almost any situation in the
real world that you might be handling in code.  If that /were/ the
question, then it would not be meaningless and it would have a single
correct answer.  But it is not the question.

>>> but I've just
>>> tried 20 or so combinations of compilers and optimise flags, all give a
>>> result of -647483648 - except gcc which gave 1500000000. And even gcc
>>> gave -647483648 with some versions and some optimisation levels.
>>>
>>
>> Do you understand what "optimising compiler" means?  It means the
>> compiler should try to give you code that runs as efficiently as
>> possible given valid inputs.  C does not impose any requirements on code
>> efficiency, but compiler users certainly do - so a C compiler is not
>> going to go out of its way to give poorer quality code.  So given "x * 2
>> / 2;", a compiler will do one of two things - return "x" unchanged, or
>> carry out the operations using the most obvious assembly instructions.
>> A good compiler will thus give you 1500000000 in this case, as that is
>> the most efficient implementation consistent with the source code.
>
> And so it will be inconsistent with (in my tests) most other compilers.
> My tests were done both with optimisation and without. clang-O3,
> gcc81-O3, gcc81-03, and gcc51-O3 gave 1500000000.
>
> All other compilers I tried, including VC, clang-O0 and gcc51-O0, gave
> the -647483648 figure. As would my own compilers for other languages (if
> using int32, but they now use int64 and the same behaviour is observed
> when x is 6000000000000000000).
>

I have used a number of other compilers that don't consider signed
overflow to be defined behaviour.  And I have used many more that
/sometimes/ do optimisations that depend on it being undefined.  This
is, of course, perfectly reasonable - since it is undefined, there is
nothing hindering it from being wrapping on occasion.  This makes a
mockery of your tests here - just because some compilers sometimes give
a result that implies they have done two's complement wrapping, does not
mean they /always/ do so - unless they specifically say so in their
documentation.

As for your optimisation flags - here is a hint about C programming.  If
optimisation flags affect the results of your code, rather than just the
timing, your code is wrong.

>> It is not a correct answer for standard C signed arithmetic, because
>> there is no correct answer.
>
> This is the nub of the issue: *C* has decided that such arithmetic is
> undefined.

I'm sorry, I thought this discussion was about C.  If the C standards
say this is how integer arithmetic works in C, then that is how signed
integer arithmetic works in C.  It doesn't matter how it works in D, E
or F - those don't affect how it works in C.

> But this is exactly the same 32-bit operation that can be
> done in a dozen other languages, probably on most machines that support
> 32-bit multiply, and most do not make it undefined.
>
> So it is largely a peculiarity of C.
>

Every language has its peculiarities.  That might make them different,
it does not make them wrong.

For many languages, overflowing integers throws some kind of error or
exception - it is /not/ defined as modulo arithmetic.  This is much like
C, except that in C you are expected to make checks for errors yourself,
rather than rely on the language to do it - this principle is a major
reason why C is used when top efficiency is required.  It is also a
reason why C is not a good choice of language for the careless or lazy
programmer, the amateur programmer, the wilfully ignorant programmer who
refuses to learn the language, or for the programmer who wants to get a
lot done with a little development work.  (I have always held that most
C programmers should not be programming in C - and most programs written
in C should not be written in C.)

>
>    It is not a correct answer in normal
>> mathematics, or almost any real-world problem you might want to model.
>> It is, however, correct if you have defined your signed arithmetic to be
>> wrapping.  It is fine - but IMHO almost entirely useless and
>> counter-productive - for a programming language to define signed
>> arithmetic in that way.  C does not define it that way, but other
>> languages (and particular C compilers) can do so.
>
> This is contradictory - so a C compiler can choose to make something
> that C as deemed undefined behaviour, defined?
>

Yes.  How long do you claim to have been working with C ?

When you compile a C program, the behaviour for any part of the code can
be defined by the C standards (whichever one you use), the
implementation, and additional standards or information for the system,
OS, ABI, libraries, APIs, etc., in use.  The fact that the C standards
leave signed integer overflow undefined behaviour means that an
implementation can treat it as undefined - and assume it won't happen,
or that the programmer doesn't care what happens if the UB /does/ occur.
  But it does not in any way limit the C compiler from giving its own
definition - whether that be to wrap results, throw up an error message,
or launch nasal demons.

>>> It is certainly what you might expect on such hardware.
>>>
>>>> Why do you think a guaranteed wrong and meaningless answer is
>>>> better than undefined behaviour?
>>>
>>> Is it really meaningless? Try the above using x=x*2. It will still
>>> overflow and produce a result of -1294967296, apparently incorrect. But
>>> print the same bit-pattern using "%u" format, and you get 3000000000 -
>>> the right answer. You can predict what's going to happen, /if/ you can
>>> predict what a compiler is going to do. Unfortunately with ones like
>>> gcc, you can't.
>>
>> Again, it does not matter if you can predict what value you get if
>> something is nonsensical.  It matters that you can predict what values
>> are valid inputs, of course, but not what the outputs are when the
>> inputs are invalid.  When garbage goes in, I don't care what colour of
>> garbage comes out - if I care what is going to come out, I am careful
>> about what I put in.
>
> Sorry, but to me a result of 1500000000 would be garbage, as it is
> highly misleading. If I didn't intend 1500000000*2/2 to overflow, but
> the result was a perfect 1500000000, how would I know there was a bug?

Sure, 1500000000 is garbage - you've put garbage in, and get garbage
out.  But /any/ result would be garbage.  I'd prefer the compiler to
give the implementation that gave me correct outputs for correct inputs,
rather than wasting cycles to give me rainbow coloured garbage for
inputs I would never use because they are invalid.

[toc] | [prev] | [next] | [standalone]


#2239 — Re: Optimization techniques and undefined behavior

FromBart <bc@freeuk.com>
Date2019-05-01 12:40 +0100
SubjectRe: Optimization techniques and undefined behavior
Message-ID<19-05-003@comp.compilers>
In reply to#2233
On 30/04/2019 14:48, David Brown wrote:
> On 29/04/2019 19:15, Bart wrote:
>> On 29/04/2019 16:08, David Brown wrote:
>>> On 29/04/2019 01:31, Bart wrote:
>>
>>>> then you don't want the compiler being clever about overflow.
>>>
>>> I /do/ want a result consistent with a single expression, or splitting
>>> up the expression.
>>
>> Then the choice is between both ways giving you 1500000000, or both
>> giving you -647483648.
>
> Let me repeat - I do not care what the results are here.  I don't care
> if they are consistent with each other.  I don't care if they change
> between runs of the compiler.  I don't care if the result is a pink
> umbrella.

So, there's a bug in the program, an inadvertent overflow. But rather
than help in discovering that bug (such as giving the wrong results)
gcc (and its clone Clang) conveniently pretend that such bugs cannot
exist, and use that to give their code an unfair advantage:

     #include <stdio.h>

     int main(void) {
         int x,y,z;

         x=1000000000;
         z=0;

         for (int i=0; i<1000000000; ++i) {
             y=x*3/3;
             z+=y;
             ++x;
         }

         printf("%d\n",x);
         printf("%d\n",y);
         printf("%d\n",z);
     }

Optimising compilers that don't take advantage of that undefined
behaviour give timings here of 1.25 seconds (msvc) and 1.9 seconds
(pelles c), with some taking much longer.

The two that do, give timings of 0.22 seconds (gcc) and 0.05 seconds
(clang).

However, there are a couple of problems: (1) they give different
results; (2) they cheated by not doing a billion multiplies and divides.

Note this example also includes UB due to z+=y line, but I'm only
interested in the bottom 32 bits, albeit signed, as a kind of checksum
to compare with other compilers.

For that purpose, I need the final z value to be -101627306, which will
match the same 32-bit arithmetic across languages** and in assembly.

I don't need it to be the mathematically correct 1499999999500000000,
which seems to me what you'd like it to be. gcc/clang-O3 give me
1565039360 which is neither one nor the other.

-------------------------------

(** This is the above program auto-translated from C to one of my
languages, which is sort of interesting, this being a compiler group.

Normally this is just for to help in viewing torturous C source code as
the C semantics are not translated. But tweaked with the int32() cast to
match C's intermediate calculations (usually 64-bit here), this actually
works:

     import clib

     global function main()int32 =
         var int32 x
         var int32 y
         var int32 z
         var int32 i

         x := 1000000000
         z := 0
         i := 0
         while i<1000000000 do
             y := int32(x*3)/3
             z +:= y
             ++x
             ++i
         od
         printf("%d\n",x)
         printf("%d\n",y)
         printf("%d\n",z)
         return 0
     end

     proc start =
         stop main()
     end

The results match those of the non-gcc/non-clang C compilers (apart from
speed which is poor).)

[toc] | [prev] | [next] | [standalone]


#2245 — Re: Optimization techniques and undefined behavior

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-02 17:27 +0200
SubjectRe: Optimization techniques and undefined behavior
Message-ID<19-05-009@comp.compilers>
In reply to#2239
On 01/05/2019 13:40, Bart wrote:
> On 30/04/2019 14:48, David Brown wrote:
>> On 29/04/2019 19:15, Bart wrote:
>>> On 29/04/2019 16:08, David Brown wrote:
>>>> On 29/04/2019 01:31, Bart wrote:
>>>
>>>>> then you don't want the compiler being clever about overflow.
>>>>
>>>> I /do/ want a result consistent with a single expression, or splitting
>>>> up the expression.
>>>
>>> Then the choice is between both ways giving you 1500000000, or both
>>> giving you -647483648.
>>
>> Let me repeat - I do not care what the results are here.  I don't care
>> if they are consistent with each other.  I don't care if they change
>> between runs of the compiler.  I don't care if the result is a pink
>> umbrella.
>
> So, there's a bug in the program, an inadvertent overflow. But rather
> than help in discovering that bug (such as giving the wrong results)
> gcc (and its clone Clang) conveniently pretend that such bugs cannot
> exist, and use that to give their code an unfair advantage:
>
>      #include <stdio.h>
>
>      int main(void) {
>          int x,y,z;
>
>          x=1000000000;
>          z=0;
>
>          for (int i=0; i<1000000000; ++i) {
>              y=x*3/3;
>              z+=y;
>              ++x;
>          }
>
>          printf("%d\n",x);
>          printf("%d\n",y);
>          printf("%d\n",z);
>      }
>
> Optimising compilers that don't take advantage of that undefined
> behaviour give timings here of 1.25 seconds (msvc) and 1.9 seconds
> (pelles c), with some taking much longer.
>
> The two that do, give timings of 0.22 seconds (gcc) and 0.05 seconds
> (clang).
>

Why are you worrying about timings of code with a bug?  If you have a
bug, you want the tools to try to help find the problem - and making it
an illegal operation rather than a legal nonsense operation will allow
tools to help.  Nobody cares how fast buggy code runs.

> However, there are a couple of problems: (1) they give different
> results; (2) they cheated by not doing a billion multiplies and divides.
>

It doesn't matter whether code is defined behaviour or not here - a good
compiler will do at least some of these calculations at compile time.  A
/really/ good compiler - better than gcc - would do it /all/ at compile
time.  Then it would give specific results for x, y and z in -fwrapv
mode, and a warning about overflow in normal UB mode.

And again, who cares about different results in different circumstances
from buggy code?


> Note this example also includes UB due to z+=y line, but I'm only
> interested in the bottom 32 bits, albeit signed, as a kind of checksum
> to compare with other compilers.
>

Then why not write correct code to do that?

> For that purpose, I need the final z value to be -101627306, which will
> match the same 32-bit arithmetic across languages** and in assembly.
>

It won't match many real languages - we have already seen that handling
of signed integer overflow varies a lot, with only a few doing two's
complement wrapping.

> I don't need it to be the mathematically correct 1499999999500000000,
> which seems to me what you'd like it to be. gcc/clang-O3 give me
> 1565039360 which is neither one nor the other.
>

I would not like it to be the mathematically correct answer - because
there is no defined correct answer for this C code.  If a particular C
implementation defines signed integer overflow in some way, then I
expect the answer to be consistent with that definition - if it does
not, then I don't care what the result is.

> -------------------------------
>
> (** This is the above program auto-translated from C to one of my
> languages, which is sort of interesting, this being a compiler group.
>

Sure, your language is absolutely on-topic here.

But if you want to translate your language to C, you need to translate
it to C - not to what you think C ought to be.  Given that you want your
own language to have wrapping semantics on integer overflow (hey, it's
your language - you define it in a crazy way if you want), then you
can't translate the source expressions into exactly the same thing in C.
  You need to write C code that means the same as the original.

If you didn't have such a knee-jerk allergy to the C preprocessor, I
could show you an efficient and safe way to do that.


If you are translating from C to your own language, then you can do it
directly - it is perfectly allowable to implement the undefined
behaviour by any definition that takes your fancy.

What does not make sense, of course, is to run tests in C with undefined
behaviour and expect consistent or testable results.  That is just daft.


> Normally this is just for to help in viewing torturous C source code as
> the C semantics are not translated. But tweaked with the int32() cast to
> match C's intermediate calculations (usually 64-bit here), this actually
> works:
>
>      import clib
>
>      global function main()int32 =
>          var int32 x
>          var int32 y
>          var int32 z
>          var int32 i
>
>          x := 1000000000
>          z := 0
>          i := 0
>          while i<1000000000 do
>              y := int32(x*3)/3
>              z +:= y
>              ++x
>              ++i
>          od
>          printf("%d\n",x)
>          printf("%d\n",y)
>          printf("%d\n",z)
>          return 0
>      end
>
>      proc start =
>          stop main()
>      end
>
> The results match those of the non-gcc/non-clang C compilers (apart from
> speed which is poor).)

Who cares?  The C code is buggy, so the results don't matter.

[toc] | [prev] | [next] | [standalone]


#2249 — Re: Optimization techniques and undefined behavior

FromBart <bc@freeuk.com>
Date2019-05-02 18:59 +0100
SubjectRe: Optimization techniques and undefined behavior
Message-ID<19-05-013@comp.compilers>
In reply to#2245
On 02/05/2019 16:27, David Brown wrote:
> On 01/05/2019 13:40, Bart wrote:

>> (** This is the above program auto-translated from C to one of my
>> languages, which is sort of interesting, this being a compiler group.
>>
>
> Sure, your language is absolutely on-topic here.

Don't forget this a compiler newsgroup not a C one, so examples of my
language are just as relevant as C.

'Of interest' was a source-to-source translator that removes unnecessary
macros, typedefs, #ifs and #includes from C source code.

> But if you want to translate your language to C, you need to translate
> it to C - not to what you think C ought to be.

Read again - the translation goes the other way.

   Given that you want your
> own language to have wrapping semantics on integer overflow (hey, it's
> your language - you define it in a crazy way if you want),

You mean, like most processors, most languages and even most C
compilers? Then yeah it's crazy.

> What does not make sense, of course, is to run tests in C with undefined
> behaviour and expect consistent or testable results.  That is just daft.

That was sort of the point of my post - it's daft is for C to declare
such code undefined behaviour, when the general consensus (see above) is
that it can be perfectly well defined.


>> The results match those of the non-gcc/non-clang C compilers (apart from
>> speed which is poor).)
>
> Who cares?  The C code is buggy, so the results don't matter.

The C is only buggy because C says so. My version is exactly the same
program, and is not buggy because this language doesn't make unsigned
overflow undefined behaviour.

[toc] | [prev] | [next] | [standalone]


#2286 — Re: Optimization techniques and undefined behavior

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-07 16:16 +0200
SubjectRe: Optimization techniques and undefined behavior
Message-ID<19-05-051@comp.compilers>
In reply to#2249
On 02/05/2019 19:59, Bart wrote:
> On 02/05/2019 16:27, David Brown wrote:
>> On 01/05/2019 13:40, Bart wrote:

> 'Of interest' was a source-to-source translator that removes unnecessary
> macros, typedefs, #ifs and #includes from C source code.
>
>> But if you want to translate your language to C, you need to translate
>> it to C - not to what you think C ought to be.
>
> Read again - the translation goes the other way.
>

OK, I got that wrong.  When you are translating /from/ C, you can assume
any behaviour you like for things that are undefined behaviour.

I can't understand how you would see the removal of typedefs, macros,
etc., to be an advantage - it would make most C code harder to
understand.  But if you find your tool useful, great.

>   Given that you want your
>> own language to have wrapping semantics on integer overflow (hey, it's
>> your language - you define it in a crazy way if you want),
>
> You mean, like most processors, most languages and even most C
> compilers? Then yeah it's crazy.
>

Yes, crazy for a language.  I think it is hard to define "most
languages", but a great many either throw a run-time error or use
expanded range numbers - both of which are appropriate (though there is
a cost).  Several, not just C, assume that overflow doesn't happen as it
is undefined behaviour.  There are some languages that define it as
two's complement wrapping, and that's crazy.

I've explained before why /processors/ use two's complement wrapping -
it is not because it gives a useful answer.

>> What does not make sense, of course, is to run tests in C with undefined
>> behaviour and expect consistent or testable results.  That is just daft.
>
> That was sort of the point of my post - it's daft is for C to declare
> such code undefined behaviour, when the general consensus (see above) is
> that it can be perfectly well defined.
>

Proof by repeated assertion is not valid.

>
>>> The results match those of the non-gcc/non-clang C compilers (apart from
>>> speed which is poor).)
>>
>> Who cares?  The C code is buggy, so the results don't matter.
>
> The C is only buggy because C says so.

The C standards define the c language - if they say the C code is buggy,
the C code is buggy.  I can't believe you need this explained.

>  My version is exactly the same
> program, and is not buggy because this language doesn't make unsigned
> overflow undefined behaviour.

(I assume you meant "signed" here.)

You are welcome to have an extended C language that defines the
behaviour of signed overflow - and the code is not buggy in that
language.  Some C compilers specifically give that guarantee too (such
as gcc with "-fwrapv"), and the code is not buggy for them.  For other C
compilers that mostly do two's complement wrapping but don't document
it, and might behave differently when optimising some types of code,
it's a game of chance - that means the code is buggy.

[toc] | [prev] | [next] | [standalone]


#2243 — Re: Optimization techniques and undefined behavior

FromMartin Ward <martin@gkc.org.uk>
Date2019-05-02 14:54 +0100
SubjectRe: Optimization techniques and undefined behavior
Message-ID<19-05-007@comp.compilers>
In reply to#2233
On 01/05/19 12:40, Bart wrote:
>         for (int i=0; i<1000000000; ++i) {
>             y=x*3/3;
>             z+=y;
>             ++x;
>         }

Given that gcc can optimise y=x*3/3 to y=x
I would have expected the whole loop to be recognised
as computing the sum of an arithmetic sequence
and therefore optimised away completely.

This blog post:

https://xebia.com/blog/gcc-compiler-optimizations-dissection-of-a-benchmark/

discusses a program with the loop:

for( a = 0; a < arg1; a++ )
{
   sum += a;
}

and claims that gcc with -O3 will optimise
the loop to take only costant time.

But this is not the case on my machine with gcc 4.9.2

--
			Martin

Dr Martin Ward | Email: martin@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

[toc] | [prev] | [next] | [standalone]


#2230 — Re: Optimization techniques and runtime checks

FromHans-Peter Diettrich <DrDiettrich1@netscape.net>
Date2019-04-29 22:36 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-04-046@comp.compilers>
In reply to#2221
Am 28.04.2019 um 23:49 schrieb David Brown:
> On 26/04/2019 02:18, Kaz Kylheku wrote:

> Instead of the compiler generated faster code for valid use,
> and debug tools being able to spot your error, you have slower code with
> a guaranteed error and no way for tools to help you.

The Delphi compiler can do both, create fast or checked code, from the
same source. Automated checks for simple errors like overflows and range
errors can be turned on or off, and range errors can be checked for
strings and all array types. It's a matter of data types that allow for
such automated checks. Data types also allow to perform many checks at
compile time, which can be made in other languages only at runtime and
possibly only if the user adds the checker code.


> My preference is to write code in a clear and sensible manner, and have
> the compiler generate as efficient object code as it can from that
> source.

I construct my Delphi data types in a way that the compiler can create
the most possible checks, in case they are required later.


> Assembly is intended to be a low-level language.  Some assemblers do a
> little bit of optimisation, but mostly it is a language designed to give
> the programmer precise control - and full responsibility - over their
> code.  C is a high level language - it is not an assembler.  It is
> defined in terms of its semantics, not the implementation.

To me C looks like a macro assembler, with only a few weak types and rules.


>> In the C language, the translated units of a program that are to be
>> linked to form the program have undergone semantic analysis.
>>
>> So no further optimization is permissible; it constitutes semantic
>> analysis.  Translation unit boundaries should be regarded as inviolable
>> optimization barriers.
>>
>
> Nonsense.  There is no basis for that at all - excluding unwarranted
> assumptions by many programmers.

A better compiler could have a look at multiple (dependent) modules, so
that it can apply some more global optimizations during compilation already.


>> Well-defined conditions can be wrong in the given program.
>> E.g. "I'd like to trap when a value greater than 42 is written into the
>> variable x, even though it's well-defined to do so."

Again that's a matter of data types, where Delphi allows for integral
subrange types with defined upper and lower bounds. This allows for
compile time tests with known values, and for runtime tests otherwise.


> This is exactly the same as mathematics.  The "square root" function,
> for real numbers, is defined for non-negative numbers.  If someone asks
> you for "sqrt(-4)", then the question is meaningless.  Any answer given
> can be considered incorrect - equally, any answer given can be
> considered correct.

That's a cheap excuse for poor language design, aimed at sloppy compiler
implementation.


> Incorrect inputs to code are as wrong as incorrect code.  In C, the
> behaviour of signed integer addition is defined for all inputs that
> don't overflow - giving it inputs that /do/ overflow is as wrong as
> writing multiplication when you meant addition, or passing a negative
> number to a square root function.

For such cases many languages have assertions or can raise exceptions,
which can be handled by the user as appropriate (SEH).


To me the C and most C-ish languages are poorly designed and not a good
base for discussing valid optimization techniques. Bart said essentially
the same without a Delphi reference. That's why I did not contribute to
that C language specific optimization, but added my $0.02 on how
thoughtful language design can allow for better optimizations.

DoDi

[toc] | [prev] | [next] | [standalone]


#2287 — Re: Optimization techniques and runtime checks

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-07 16:29 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-052@comp.compilers>
In reply to#2230
On 29/04/2019 22:36, Hans-Peter Diettrich wrote:
> Am 28.04.2019 um 23:49 schrieb David Brown:
>> On 26/04/2019 02:18, Kaz Kylheku wrote:
>
>> Instead of the compiler generated faster code for valid use,
>> and debug tools being able to spot your error, you have slower code with
>> a guaranteed error and no way for tools to help you.
>
> The Delphi compiler can do both, create fast or checked code, from the
> same source. Automated checks for simple errors like overflows and range
> errors can be turned on or off, and range errors can be checked for
> strings and all array types. It's a matter of data types that allow for
> such automated checks. Data types also allow to perform many checks at
> compile time, which can be made in other languages only at runtime and
> possibly only if the user adds the checker code.
>

Yes.  I haven't used Delphi for a fair number of years, but that has
been a feature of Borland's compilers since I first used Turbo Pascal.

gcc and clang have similar features for C.  It is helpful to use such
run-time checks during testing and debugging, and to be able to omit
unnecessary inefficiencies when speed or size is more important.

>
>> My preference is to write code in a clear and sensible manner, and have
>> the compiler generate as efficient object code as it can from that
>> source.
>
> I construct my Delphi data types in a way that the compiler can create
> the most possible checks, in case they are required later.
>
>
>> Assembly is intended to be a low-level language.  Some assemblers do a
>> little bit of optimisation, but mostly it is a language designed to give
>> the programmer precise control - and full responsibility - over their
>> code.  C is a high level language - it is not an assembler.  It is
>> defined in terms of its semantics, not the implementation.
>
> To me C looks like a macro assembler, with only a few weak types and rules.
>

C does not have the range of type creation facilities that Pascal (and
Object Pascal) has.  C++ provides a good deal more, but still lacks
several of Pascal's capabilities here.

>
>>> In the C language, the translated units of a program that are to be
>>> linked to form the program have undergone semantic analysis.
>>>
>>> So no further optimization is permissible; it constitutes semantic
>>> analysis.  Translation unit boundaries should be regarded as inviolable
>>> optimization barriers.
>>>
>>
>> Nonsense.  There is no basis for that at all - excluding unwarranted
>> assumptions by many programmers.
>
> A better compiler could have a look at multiple (dependent) modules, so
> that it can apply some more global optimizations during compilation
> already.

Indeed.  And better compilers do that - it is known as "link time
optimisation", "whole program optimisation", "omniscient optimisation",
"inter-module optimisation", etc.

>
>
>>> Well-defined conditions can be wrong in the given program.
>>> E.g. "I'd like to trap when a value greater than 42 is written into the
>>> variable x, even though it's well-defined to do so."
>
> Again that's a matter of data types, where Delphi allows for integral
> subrange types with defined upper and lower bounds. This allows for
> compile time tests with known values, and for runtime tests otherwise.
>

Yes - Pascal has these, C does not.  (And if you want them in C++, you
need to make them.)

>
>> This is exactly the same as mathematics.  The "square root" function,
>> for real numbers, is defined for non-negative numbers.  If someone asks
>> you for "sqrt(-4)", then the question is meaningless.  Any answer given
>> can be considered incorrect - equally, any answer given can be
>> considered correct.
>
> That's a cheap excuse for poor language design, aimed at sloppy compiler
> implementation.
>

You are saying that mathematics is a poor language design?

>
>> Incorrect inputs to code are as wrong as incorrect code.  In C, the
>> behaviour of signed integer addition is defined for all inputs that
>> don't overflow - giving it inputs that /do/ overflow is as wrong as
>> writing multiplication when you meant addition, or passing a negative
>> number to a square root function.
>
> For such cases many languages have assertions or can raise exceptions,
> which can be handled by the user as appropriate (SEH).
>

Many languages have that, and it can be a very useful feature for many
kinds of program.  Other languages don't have it (as an automatic
feature), because such features are not free.

>
> To me the C and most C-ish languages are poorly designed and not a good
> base for discussing valid optimization techniques. Bart said essentially
> the same without a Delphi reference. That's why I did not contribute to
> that C language specific optimization, but added my $0.02 on how
> thoughtful language design can allow for better optimizations.
>

Different languages have different purposes and trade-offs.  This is
important to accept when discussing them.

C is useful to /me/, for the programming /I/ do.  It is useful for quite
a lot of other programming.  But it is only useful if you understand the
language - if people insist on pretending it is something that it is
not, they will have trouble.  If you want run-time checks on buffer
accesses, signed integer arithmetic, etc., that's fine - find a language
that suits your needs, or at least is the best compromise of your needs.

Every language has its oddities, and there are always things that some
people thing are terrible design.  For example, it is truly weird that
"const x : int = 10;" is one of Delphi's way of making "x" a variable
with an initialised value.  (In older Delphi, it was the only way!)

[toc] | [prev] | [next] | [standalone]


#2294 — Re: Optimization techniques and runtime checks

FromHans-Peter Diettrich <DrDiettrich1@netscape.net>
Date2019-05-08 02:27 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-059@comp.compilers>
In reply to#2287
Am 07.05.2019 um 16:29 schrieb David Brown:
> On 29/04/2019 22:36, Hans-Peter Diettrich wrote:
>> Am 28.04.2019 um 23:49 schrieb David Brown:

>>> This is exactly the same as mathematics.  The "square root" function,
>>> for real numbers, is defined for non-negative numbers.  If someone asks
>>> you for "sqrt(-4)", then the question is meaningless.  Any answer given
>>> can be considered incorrect - equally, any answer given can be
>>> considered correct.
>>
>> That's a cheap excuse for poor language design, aimed at sloppy compiler
>> implementation.
>>
>
> You are saying that mathematics is a poor language design?

No, I criticize the lack of restricted (non-negative...) types for such
purposes.


> Different languages have different purposes and trade-offs.  This is
> important to accept when discussing them.

No doubt :-)

So I wonder why the topic (optimization...) drifted into a discussion of
almost only the C language.

I also wonder where runtime efficiency nowadays is really so important,
that safety and security breaches are considered acceptable?


> Every language has its oddities, and there are always things that some
> people thing are terrible design.  For example, it is truly weird that
> "const x : int = 10;" is one of Delphi's way of making "x" a variable
> with an initialised value.  (In older Delphi, it was the only way!)

This makes the different evolution visible:
Pascal started as a tutorial language, C as a production language.
Evolution added features of great demand - in the case of Pascal many
features had to be added to Wirth's academic language(s) before it
became usable outside education.

I definitely prefer a safe language, which can be tuned later for
efficieny. Lowering requirements and checks in specific parts of a
program is doable and acceptable easier than introducing requirements
that were not anticipated in the basic language design.

DoDi
[These days I write just about everything in python.  Occasionally
I'll rewrite something in C if python can't do it but that's pretty
rare.  The python implementation everyone uses is written in C but it
does a pretty good job of managing storage and arithmetic. -John]

[toc] | [prev] | [next] | [standalone]


#2299 — Re: Optimization techniques and runtime checks

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-08 10:31 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-064@comp.compilers>
In reply to#2294
On 08/05/2019 02:27, Hans-Peter Diettrich wrote:
> Am 07.05.2019 um 16:29 schrieb David Brown:
>> On 29/04/2019 22:36, Hans-Peter Diettrich wrote:
>>> Am 28.04.2019 um 23:49 schrieb David Brown:
>
>>>> This is exactly the same as mathematics.  The "square root" function,
>>>> for real numbers, is defined for non-negative numbers.  If someone asks
>>>> you for "sqrt(-4)", then the question is meaningless.  Any answer given
>>>> can be considered incorrect - equally, any answer given can be
>>>> considered correct.
>>>
>>> That's a cheap excuse for poor language design, aimed at sloppy compiler
>>> implementation.
>>>
>>
>> You are saying that mathematics is a poor language design?
>
> No, I criticize the lack of restricted (non-negative...) types for such
> purposes.

OK.  Again, I'd note that different languages have different design
requirements and different uses.  (Although I agree that such specific
ranged types are nice, and are a feature of Pascal that I miss in C.)

>
>
>> Different languages have different purposes and trade-offs.  This is
>> important to accept when discussing them.
>
> No doubt :-)
>
> So I wonder why the topic (optimization...) drifted into a discussion of
> almost only the C language.
>

C is a language that many people know, one for which optimisations are
very important, and where certain kinds of optimisations can be
controversial - and therefore lead to discussions.  If everyone is happy
with a language's definitions, behaviours, and typical optimisations,
there'd be no arguments and few posts!

But by all means, add more about Pascal or any other language you like
to this thread - it could be interesting for many of us.

> I also wonder where runtime efficiency nowadays is really so important,
> that safety and security breaches are considered acceptable?
>

This depends on the use of the code.

I have many times noted that C is often a poor choice of language.
There was a time when it was the best, or only, language suitable for a
wide range of uses.  This is no longer the case.  I personally use it
for small-systems embedded programming, and the runtime efficiency /is/
important.  But for PC programming, I use mostly Python (I used to use
Delphi more).  Pick a language that makes sense for the job, and gives
the trade-offs that suit your needs.


And often there is no way to handle run-time errors sensibly anyway.
You don't want your car brakes to give you a message "Your braking
system has encountered an integer overflow.  Please report this error to
your car dealer".  You want the brake software developers to be
/absolutely/ sure that overflows can't happen - and then there is no
point in run-time checks.


C is not going away.  But it would be good to see less C code used, but
more care used with the C code that is written.  That includes testing
with tools that do run-time checks, but not having them included in the
final product.  It is the run-time efficiency of the underlying C code
in libraries, VMs, etc., that makes the speed of most checked languages
acceptable.

>
>> Every language has its oddities, and there are always things that some
>> people thing are terrible design.  For example, it is truly weird that
>> "const x : int = 10;" is one of Delphi's way of making "x" a variable
>> with an initialised value.  (In older Delphi, it was the only way!)
>
> This makes the different evolution visible:
> Pascal started as a tutorial language, C as a production language.
> Evolution added features of great demand - in the case of Pascal many
> features had to be added to Wirth's academic language(s) before it
> became usable outside education.
>

Yes.  I realise that this oddity is "for historical reasons".  The same
applies to a great many oddities in C.

> I definitely prefer a safe language, which can be tuned later for
> efficieny. Lowering requirements and checks in specific parts of a
> program is doable and acceptable easier than introducing requirements
> that were not anticipated in the basic language design.
>
> DoDi
> [These days I write just about everything in python.  Occasionally
> I'll rewrite something in C if python can't do it but that's pretty
> rare.  The python implementation everyone uses is written in C but it
> does a pretty good job of managing storage and arithmetic. -John]

That's the way to do it.

[toc] | [prev] | [next] | [standalone]


#2306 — Re: Optimization techniques and runtime checks

FromHans-Peter Diettrich <DrDiettrich1@netscape.net>
Date2019-05-08 22:50 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-071@comp.compilers>
In reply to#2299
Am 08.05.2019 um 10:31 schrieb David Brown:

> And often there is no way to handle run-time errors sensibly anyway.
> You don't want your car brakes to give you a message "Your braking
> system has encountered an integer overflow.  Please report this error to
> your car dealer".  You want the brake software developers to be
> /absolutely/ sure that overflows can't happen - and then there is no
> point in run-time checks.

Most probably no user will ever have a chance to report above error :-]

I found SEH quite nice for handling errors. In detail with the Java
feature that the compiler can complain about unhandled exceptions.

DoDi

[toc] | [prev] | [next] | [standalone]


#2314 — Re: Optimization techniques and runtime checks

From"Robin Vowels" <robin51@dodo.com.au>
Date2019-05-11 19:26 +1000
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-079@comp.compilers>
In reply to#2299
From: "David Brown" <david.brown@hesbynett.no>
Sent: Wednesday, May 08, 2019 6:31 PM


> I have many times noted that C is often a poor choice of language.
> There was a time when it was the best, or only, language suitable for a
> wide range of uses.

That not so. C was never the best (ior only) language.

PL/I and Ada are probably in that category.
PLI has been used for system programming as well as general
purpose programming.

>  This is no longer the case.  I personally use it
> for small-systems embedded programming, and the runtime efficiency /is/
> important.  But for PC programming, I use mostly Python (I used to use
> Delphi more).  Pick a language that makes sense for the job, and gives
> the trade-offs that suit your needs.
>
> And often there is no way to handle run-time errors sensibly anyway.
> You don't want your car brakes to give you a message "Your braking
> system has encountered an integer overflow.  Please report this error to
> your car dealer".  You want the brake software developers to be
> /absolutely/ sure that overflows can't happen - and then there is no
> point in run-time checks.

Wishful thinking.  While the code in a program can avoid a hardware overflow
(through appropriate programming), in the event that an overflow occurs,
the program still needs to tell the outside world that an overflow occurred.
[I think you will find that the PL/I that people have used for system programming
is pretty stripped down, like Intel's PL/M and IBM's PL/S. -John]

[toc] | [prev] | [next] | [standalone]


#2316 — Re: Optimization techniques and runtime checks

FromGene Wirchenko <genew@telus.net>
Date2019-05-11 22:43 -0700
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-081@comp.compilers>
In reply to#2299
On Wed, 8 May 2019 10:31:25 +0200, David Brown
<david.brown@hesbynett.no> wrote:

[snip]

>And often there is no way to handle run-time errors sensibly anyway.

     Rule 1 of handling run-time errors sensibily:

          DD  EEE TTT EEE  CC TTT    TTT H H EEE M M !
          D D E    T  E   C    T      T  H H E   MMM !
          D D EE   T  EE  C    T      T  HHH EE  M M !
          D D E    T  E   C    T      T  H H E   M M
          DD  EEE  T  EEE  CC  T      T  H H EEE M M !

>You don't want your car brakes to give you a message "Your braking
>system has encountered an integer overflow.  Please report this error to
>your car dealer".  You want the brake software developers to be
>/absolutely/ sure that overflows can't happen - and then there is no
>point in run-time checks.

     I have read too many stories about "This should never happen."
conditions happening.

>Most probably no user will ever have a chance to report above error :-]

     Maybe not.  Try a Web search -- I use duckduckgo.com myself --
for:
          honda brakes problem

I do not know what the issue was.  Ahem!  I do not know what the
issues were.  Apparently, there was a problem in 2009 or so and just
now.

[snip]

>Yes.  I realise that this oddity is "for historical reasons".  The same
>applies to a great many oddities in C.

     "Those who do not know history are condemned to repeat it." Those
who are stuck with unrevised standards are similarly condemned.

[snip]

Sincerely,

Gene Wirchenko
[This is defininitely far from compilers.  The recent error is that a
system that's supposed to brake to avoid collisions somtimes brakes at
random in heavy traffic.  Sounds like a bug but not one that could
have been optimized away. -John]

[toc] | [prev] | [next] | [standalone]


#2317 — Re: Optimization techniques and runtime checks

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-12 20:17 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-082@comp.compilers>
In reply to#2316
On 12/05/2019 07:43, Gene Wirchenko wrote:
> On Wed, 8 May 2019 10:31:25 +0200, David Brown
> <david.brown@hesbynett.no> wrote:
>
> [snip]
>
>> And often there is no way to handle run-time errors sensibly anyway.
>
>       Rule 1 of handling run-time errors sensibily:
>
>            DD  EEE TTT EEE  CC TTT    TTT H H EEE M M !
>            D D E    T  E   C    T      T  H H E   MMM !
>            D D EE   T  EE  C    T      T  HHH EE  M M !
>            D D E    T  E   C    T      T  H H E   M M
>            DD  EEE  T  EEE  CC  T      T  H H EEE M M !
>

There are several points for handling run-time errors sensibly, and
/all/ of them need to be in place - so there is no "rule number 1".  You
need to figure out what kinds of errors can occur, how you can detect
them, and what you can do when you detect them.  It is utterly pointless
to start thinking about how to detect errors before figuring out what
those errors might be!

>> You don't want your car brakes to give you a message "Your braking
>> system has encountered an integer overflow.  Please report this error to
>> your car dealer".  You want the brake software developers to be
>> /absolutely/ sure that overflows can't happen - and then there is no
>> point in run-time checks.
>
>       I have read too many stories about "This should never happen."
> conditions happening.

And I have seen too many examples of "just to be sure" run-time error
checking that is never tested properly (how can you test something when
the triggering circumstances can't happen?) and turns out to have bugs
that cause trouble later.

>
>> Most probably no user will ever have a chance to report above error :-]
>
>       Maybe not.  Try a Web search -- I use duckduckgo.com myself --
> for:
>            honda brakes problem
>
> I do not know what the issue was.  Ahem!  I do not know what the
> issues were.  Apparently, there was a problem in 2009 or so and just
> now.

I know brake software has had bugs.  (I find it hard to comprehend, but
I know it is true.)  Equally, I know that showing an error message on
the car's screen for a software error is a useless idea.  (Show messages
when hardware, sensors, drivers, etc., have failed.)

>
> [snip]
>
>> Yes.  I realise that this oddity is "for historical reasons".  The same
>> applies to a great many oddities in C.
>
>       "Those who do not know history are condemned to repeat it." Those
> who are stuck with unrevised standards are similarly condemned.
>

As are those who think detecting run-time errors is a substitute for
writing code that does not generate these errors in the first place.

[Back to compilers, please, unless we have some insight into how
a compiler might address these problems. -John]

[toc] | [prev] | [next] | [standalone]


#2303 — Re: Optimization techniques and runtime checks

FromBart <bc@freeuk.com>
Date2019-05-08 14:58 +0100
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-068@comp.compilers>
In reply to#2287
On 07/05/2019 15:29, David Brown wrote:
> On 29/04/2019 22:36, Hans-Peter Diettrich wrote:

>> A better compiler could have a look at multiple (dependent) modules, so
>> that it can apply some more global optimizations during compilation
>> already.
>
> Indeed.  And better compilers do that - it is known as "link time
> optimisation", "whole program optimisation", "omniscient optimisation",
> "inter-module optimisation", etc.

My recent compilers (not for C; the language doesn't really allow it)
have all been whole-program** compilers.

Although they don't really do optimisation (it's a separate project I
might get around to), they would be ideally placed as all the source
code is available for all functions at the same time.

There is one global, hierarchical symbol table linking all functions,
variables, types etc. across the project.

So anyway, thanks for confirming that mine might be one of the better
compilers! Usually you are not so kind.


(** The 'whole-program' is defined as all the source files that are
built into a single binary executable, or single shared library.

Functions in external libraries, which tend to be in another language
anyway, are not included in the source visible to the compiler. Such
libraries are also used only as shared libraries (eg. dll) so are never
statically linked.)

[toc] | [prev] | [next] | [standalone]


#2307 — Re: Optimization techniques and runtime checks

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-08 23:02 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-072@comp.compilers>
In reply to#2303
On 08/05/2019 15:58, Bart wrote:
> On 07/05/2019 15:29, David Brown wrote:
>> On 29/04/2019 22:36, Hans-Peter Diettrich wrote:
>
>>> A better compiler could have a look at multiple (dependent) modules, so
>>> that it can apply some more global optimizations during compilation
>>> already.
>>
>> Indeed.  And better compilers do that - it is known as "link time
>> optimisation", "whole program optimisation", "omniscient optimisation",
>> "inter-module optimisation", etc.
>
> My recent compilers (not for C; the language doesn't really allow it)
> have all been whole-program** compilers.
>
> Although they don't really do optimisation (it's a separate project I
> might get around to), they would be ideally placed as all the source
> code is available for all functions at the same time.
>
> There is one global, hierarchical symbol table linking all functions,
> variables, types etc. across the project.
>
> So anyway, thanks for confirming that mine might be one of the better
> compilers! Usually you are not so kind.
>

If your compiler can only be run as a whole-program compiler, then I
might not be so kind, and complain about the lack of scalability.  But I
appreciate that your compiler is designed with certain use-cases in
mind, and if it does the job you want it to do, then that's great.

Still, being whole-program gives it the potential to have more
optimisations and static error analysis than many other tools.

(I have always been impressed that you have made a compiler at all - it
is not an easy job.  I have been critical that you have placed so much
emphasis on things that I see as barely relevant, such as the size and
speed of the compiler, rather than on correctly supporting the C
language and its features.  I have also been critical of a number of
aspects of your own languages and their design, and highly critical of
your claims and comparisons to other languages and tools.  But again,
despite that, I am greatly impressed that you have made these in the
first place.  If you had said "look, I've made my own language and my
own tools that I find useful", I'd praise your work.  It is only because
you say "look, I've made my own language that is vastly superior to all
other languages, and my own tools that beat the pants off every other
tool ever written" that I criticise your work.)

[toc] | [prev] | [next] | [standalone]


#2310 — Re: Optimization techniques and runtime checks

FromBart <bc@freeuk.com>
Date2019-05-09 18:28 +0100
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-075@comp.compilers>
In reply to#2307
On 08/05/2019 22:02, David Brown wrote:
> On 08/05/2019 15:58, Bart wrote:

>> My recent compilers (not for C; the language doesn't really allow it)
>> have all been whole-program** compilers.

> If your compiler can only be run as a whole-program compiler, then I
> might not be so kind, and complain about the lack of scalability.

Scalability is something I do keep in mind. At the moment a project
totalling 100,000 lines of code might take half a second to build (to
..exe), using a particular version of my non-optimised, non-optimising
compiler on my specific machine, which is not fast.

Bigger programs can be organised into libraries, then each library is
independently compiled, and the main application is kept small.

I'm also developing a new embedded scripting language, which is only
compiled on demand (like JIT although JIT now means something a little
different), and which can also 'mop-up' a lot of code that would
otherwise be in the main application.

But my current project is about 30Kloc. It takes 0.2 seconds to build
with a previous compiler, and 0.25 seconds with itself (it's a bit more
elaborate, but it's its code generator that seriously needs a peep-hole
optimiser). Neither have any conventional optimisation.

(C is simpler to compile and with fewer passes is faster. Building
SQLITE3 from two files (shell.c and the amalgamated sqlite3.c, totalling
250Kloc), takes 0.45 seconds (optimised compiler this time otherwise
0.6), on the same machine.)

> (I have always been impressed that you have made a compiler at all - it
> is not an easy job.  I have been critical that you have placed so much
> emphasis on things that I see as barely relevant, such as the size and
> speed of the compiler, rather than on correctly supporting the C
> language and its features.

The speed is what makes a whole-project compiler viable.
[For embedded applications there are systems where the object code
that the compiler generates is really the intermediate code from the
first pass of the compiler, and the linker reads everything in
including the libraries and does the optimization and code generation
on the whole thing.  Dunno how fast it is but it is my impression that
all of the code in the chips in your car was built that way. -John]

[toc] | [prev] | [next] | [standalone]


#2312 — Re: Optimization techniques and runtime checks

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-09 22:07 +0200
SubjectRe: Optimization techniques and runtime checks
Message-ID<19-05-077@comp.compilers>
In reply to#2310
> [For embedded applications there are systems where the object code
> that the compiler generates is really the intermediate code from the
> first pass of the compiler, and the linker reads everything in
> including the libraries and does the optimization and code generation
> on the whole thing.  Dunno how fast it is but it is my impression that
> all of the code in the chips in your car was built that way. -John]

There is a kernel of truth in there, but you are mixing a few things
together.

For some small microcontrollers, especially those with "brain-dead"
processors like 8051 cpus, many toolchains do "whole program
optimisation".  For chips like these, you don't have a decent data stack
(there are not "sp + offset" addressing modes), and ram is often in
banks.  To get efficient code, the compiler needs to try to fit local
variables and parameters into static ram addresses, re-using them when
possible, and consider code flow patterns when deciding which banks to
use for different types of data.  This is not done by a compiler and
linker as you describe, but having a single program that takes all the C
code from all the files, and treats it as a single unit (roughly
speaking, you rename every file-static function and variable, adding the
filename to the identifier, then make almost everything "static").
There are not separate compile and link stages.


For larger processors (of which there are many in modern cars) using
modern compilers, you can have "link time optimisation".  Here each unit
is handled by the compiler to produce a intermediary code in internal
formats - but it is not just from the first pass.  This code is
analysed, warnings are generated when possible, and a large amount of
optimisation is carried out while still in the internal formats.  The
linker then combines these (from many files), and passes the combination
back to the compiler for further processing.  For scalability, this can
be done in repeated steps with different groups of files, rather than
all at once - when building Firefox or LibreOffice, you want to take
advantage of your 64-core build system.
[Sounds reasonable.  I was thinking of the ARM tools which I suppose do
more than a tyrpical first pass before handing stuff off to the linker. -John]

[toc] | [prev] | [next] | [standalone]


#2236

FromGene Wirchenko <genew@telus.net>
Date2019-04-30 18:24 -0700
Message-ID<19-04-052@comp.compilers>
In reply to#2221
On Sun, 28 Apr 2019 23:49:53 +0200, David Brown
<david.brown@hesbynett.no> wrote:

[snip]

>If you are writing your code in a "C with the extra feature of having
>defined behaviour on signed integer overflow", and only compile it with
>suitable compilers (or compiler flags), then that's okay.  But don't
>call it correct C code and blame compilers for your own mistakes or
>unwarranted assumptions.

     I would like to see it as part of the language.  I *know* that I
want to have an error be thrown at run-time if an error can be
detected.  (It is not an unwarranted assumption.)  It is not as if
detecting signed integer overflow is a difficult thing on, for
example, System/370, which also dates from 1970.

     I am fine with compiler options allowing each of us to have our
respective ways.  I am tired of the default being "Overflow happens;
too bad".  That is why I refuse to use C.  It is too dangerous for my
taste.

Sincerely,

Gene Wirchenko
[The overflow trap was in S/360 in 1963.  It ain't new. -John]

[toc] | [prev] | [next] | [standalone]


#2238

FromDavid Brown <david.brown@hesbynett.no>
Date2019-05-01 09:20 +0200
Message-ID<19-05-002@comp.compilers>
In reply to#2236
On 01/05/2019 03:24, Gene Wirchenko wrote:
> On Sun, 28 Apr 2019 23:49:53 +0200, David Brown
> <david.brown@hesbynett.no> wrote:
>
> [snip]
>
>> If you are writing your code in a "C with the extra feature of having
>> defined behaviour on signed integer overflow", and only compile it with
>> suitable compilers (or compiler flags), then that's okay.  But don't
>> call it correct C code and blame compilers for your own mistakes or
>> unwarranted assumptions.
>
>       I would like to see it as part of the language.  I *know* that I
> want to have an error be thrown at run-time if an error can be
> detected.  (It is not an unwarranted assumption.)  It is not as if
> detecting signed integer overflow is a difficult thing on, for
> example, System/370, which also dates from 1970.

Detecting signed overflow at run-time can be a significant cost.  It
ruins expression manipulation, optimisation, and simplification - much
more than making signed overflow be two's complement.  Even compared to
a dumb translation compilation of expressions, it nearly doubles the
size of the code on many processors as you have to follow each
arithmetic instruction with a "jump if overflow".  (Some cpus have
"trapping" arithmetic instructions, but certainly not all.)  On small
processors, this is all a heavy penalty on performance.  On big
processors, simple instructions are often "free" while the cpu is
waiting for memory reads, but these sequences thrash your branch
prediction buffers.

No, throwing an error on overflows is not hard - but it /is/ costly.  It
can be a marvellous tool during testing and debugging, and may be worth
leaving active in some programs, but it has a price.

However, if by "part of the language" you are thinking more of optional
or configurable possibilities - perhaps like the standard pragmas for
controlling some floating point details, then I like that idea.
Standard pragmas letting people choose signed overflow behaviour, from
the default of "undefined behaviour" to trapping/signalling, two's
complement wrapping, and perhaps saturation, would be nice.

>
>       I am fine with compiler options allowing each of us to have our
> respective ways.  I am tired of the default being "Overflow happens;
> too bad".  That is why I refuse to use C.  It is too dangerous for my
> taste.
>

Choice is a great thing.  I don't use C on PC's - not because it is
dangerous as such, but because it takes too much effort to use it safely
and well.  But I use it on small microcontrollers, where I am willing to
put in the effort to make the code correct and efficient.  On PC's, I
mostly use Python - then I don't have to concern myself about overflows
(or many other details that the language handles), and I am willing to
pay the efficiency price involved.

[toc] | [prev] | [next] | [standalone]


Page 4 of 5 — ← Prev page 1 2 3 [4] 5  Next page →

Back to top | Article view | comp.compilers


csiph-web