Re: Undefined Behavior Optimizations in C

Path csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From Spiros Bousbouras <spibou@gmail.com>
Newsgroups comp.compilers
Subject Re: Undefined Behavior Optimizations in C
Date Wed, 18 Jan 2023 13:14:35 -0000 (UTC)
Organization Aioe.org NNTP Server
Sender news@iecc.com
Approved comp.compilers@iecc.com
Message-ID <23-01-062@comp.compilers> (permalink)
References <23-01-027@comp.compilers> <sympa.1673343321.1624.383@lists.iecc.com> <23-01-031@comp.compilers> <23-01-041@comp.compilers>
MIME-Version 1.0
Content-Type text/plain; charset=UTF-8
Content-Transfer-Encoding 8bit
Injection-Info gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="85677"; mail-complaints-to="abuse@iecc.com"
Keywords C, optimize
Posted-Date 18 Jan 2023 11:35:37 EST
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:3330


On Wed, 11 Jan 2023 14:20:49 +0100
David Brown <david.brown@hesbynett.no> wrote:
> C was designed from day one to be a high-level language, not an
> assembler of any sort. Limitations of weaker earlier compilers does
> not mean the language was supposed to work that way.

For those who want an abstract or portable assembler, there exists
c9x.me/compile/. I've never used it but at least it aims to be that,
unlike C. I would be curious to know of other analogous projects. I
guess the "register transfer language" of GCC is somewhat analogous.

> I first used a C compiler that optimised on the assumption that UB
> didn't happen some 25 years ago.  (In particular, it assumed signed
> integer arithmetic never overflowed.)

I have several times encountered the claim that compilers assume that UB does
not happen and I don't understand it. Let's consider two examples:

    x + 1 > x

in C where  x  is a signed integer. Compilers will often treat this as
always true with the following reasoning:

- if  x  does not have the maximum value which fits in its type then the
  meaning of the C expressions is the same as their mathematical meaning
  so the expression evaluates to true.

- if  x  has the maximum value which fits in its type then  x + 1  is not
  defined so any translation (including treating the whole expression as
  true) is valid.

There's no assumption that UB (undefined behaviour) will not happen; both
possibilities are accounted for.
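
The two cases above can be made concrete with a small sketch. The function
names are invented for illustration; the second function shows one way to
write the test when the maximum-value case actually matters to the program.

```c
#include <limits.h>
#include <stdbool.h>

/* For every x < INT_MAX this is true by ordinary arithmetic.
   For x == INT_MAX the addition overflows, which is undefined,
   so a translation that returns true unconditionally is valid. */
bool succ_is_greater(int x)
{
    return x + 1 > x;
}

/* Overflow-free way to ask "does x have a successor?":
   no signed addition is performed, so no UB can occur. */
bool has_successor(int x)
{
    return x < INT_MAX;
}
```

Calling  succ_is_greater  with any value below  INT_MAX  yields true under
every conforming translation; only the call with  INT_MAX  is left open.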

Another example is

   ... *some_pointer_object ...
   [ some_pointer_object  does not get modified in this part of the code and
     has not been declared as  volatile ]
   if (some_pointer_object == NULL) ...

If  some_pointer_object  is not NULL then the test is always false and can be
omitted; if it is NULL then the earlier dereference is UB so any translation
is valid, including omitting the test.

Again, there's no assumption that UB will not happen.
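
A minimal sketch of the pointer case, with hypothetical function names. In
the first function the test comes after the dereference, so it may be
dropped by the reasoning above; in the second the test comes first, so the
NULL case stays well defined.

```c
#include <stddef.h>

/* p is dereferenced before it is tested. If p is non-null the test is
   false; if p is null the dereference was already undefined. Either
   way a compiler may drop the test and the "return -1" branch. */
int read_then_check(const int *p)
{
    int value = *p;
    if (p == NULL)
        return -1;
    return value;
}

/* Testing first keeps the null case well defined. */
int check_then_read(const int *p)
{
    if (p == NULL)
        return -1;
    return *p;
}
```

With a valid pointer both functions return the pointed-to value; only
check_then_read  may be passed NULL.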

So the request that C compilers should stop assuming that UB will not
happen seems to me completely misguided. I think what is really meant
is that, in reasoning about what a valid translation is, C compilers (or
the authors of the compilers) should not employ the notion of UB. But
then how should UB be translated? Again there exists the assumption
or claim that there is some intuitively obvious translation and
compilers should go for that. First, I'm not sure that there exists
such a common intuition even among humans and second, even if it does,
how does one go from an intuition to an algorithm C compilers can
use to do translation? Lots of things are intuitively obvious but
creating an algorithm to duplicate the human intuition is a hard
problem, one which has not been solved in many cases and perhaps even
one which is unsolvable in some cases.

I've seen the suggestion that compilers should describe their behaviour in
terms of the assembly generated (possibly some kind of abstract assembly) as
opposed to higher-level terms. I'm not sure if this is possible and, even if
it is, I would not find it useful. I tend to think of what I want my code to
do in higher-level terms and then bring it down to the level of the language
with successive refinements. If parts of C were described in assembly terms
then it would potentially force me to do at least one more refinement step
with no benefit.

A more productive avenue is for people to give definitions, as precise as
possible, to the kinds of UB which have caused them problems and then try to
convince compiler writers to implement such extensions if they don't do so
already. In this area even compiler documentation should perhaps improve. For
example, from the GCC man page:

   -fdelete-null-pointer-checks
       Use global dataflow analysis to identify and eliminate useless
       checks for null pointers.  The compiler assumes that
       dereferencing a null pointer would have halted the program. If
       a pointer is checked after it has already been dereferenced, it
       cannot be null.

       In some environments, this assumption is not true, and programs
       can safely dereference null pointers.  Use
       -fno-delete-null-pointer-checks to disable this optimization for
       programs which depend on that behavior.

The above still doesn't tell me what is supposed to happen when a NULL pointer
is dereferenced even with the  -fno-delete-null-pointer-checks  flag. I'm
guessing it's impossible to give a general definition. One can do so for
specific systems but not in general, so perhaps the above description does
the best possible.

Another example

   -fstrict-overflow
       Allow the compiler to assume strict signed overflow rules,
       depending on the language being compiled.  For C (and C++) this
       means that overflow when doing arithmetic with signed numbers is
       undefined, which means that the compiler may assume that it will
       not happen.

This is poor phrasing; in particular the part  "which means that the
compiler may assume that it will not happen"  is redundant. There is no
reason for the compiler to assume anything about which execution paths will
happen during runtime to conclude for example that  x + 1 > x  can be
translated as true. The above quote gives an unnecessarily circuitous
argument as to why the expression can be translated as true. The case
analysis I gave earlier is more direct.
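
Where a program needs a defined result in the overflow case, the addition
can be written so that signed overflow never occurs. The sketch below
assumes GCC or Clang, whose  __builtin_add_overflow  extension reports
overflow instead of invoking UB; it is not standard C, and the function
names are invented for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Requires GCC or Clang: __builtin_add_overflow is a compiler
   extension. Stores x + 1 in *out and returns true if the
   mathematical result fits in int; returns false on overflow. */
bool checked_increment(int x, int *out)
{
    return !__builtin_add_overflow(x, 1, out);
}

/* One possible defined policy for the overflow case: saturate. */
int saturating_increment(int x)
{
    int r;
    return checked_increment(x, &r) ? r : INT_MAX;
}
```

Saturation is only one choice of policy; the point is that the behaviour at
INT_MAX  is stated by the program rather than left to the translation.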

> It annoys /me/ intensely that people complain about this sort of thing,
> and yet apparently haven't bothered to read the compiler manuals to see
> how to get the effects they want.  Compile with "-fno-strict-aliasing",
> or (better, IMHO) add this to your code:
>
> 	#pragma GCC optimize ("-fno-strict-aliasing")
>
> Now, if you want to complain that the gcc documentation is not great,

Yeah, it would be good if there were a more precise specification as to what
additional guarantees beyond the C standard this gives. For translating other
languages into C, this seems to be important for achieving object allocation
and garbage collection since relying on the native  malloc()  and related
functions is generally not adequate, at least not if your garbage collector
is allowed to move objects.
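
For illustration, here is the kind of access the aliasing rules are about.
This is only a sketch with hypothetical function names: it assumes that
float  and  uint32_t  have the same size, and it makes no claim about
exactly what GCC guarantees for the first function under
-fno-strict-aliasing.

```c
#include <stdint.h>
#include <string.h>

/* Reads a float object through a uint32_t lvalue. Standard C leaves
   this undefined (the aliasing rule); it is the sort of access that
   code compiled with -fno-strict-aliasing typically relies on. */
uint32_t float_bits_cast(const float *f)
{
    return *(const uint32_t *)f;
}

/* Copies the bytes instead. This is defined by standard C whatever
   the flag, assuming sizeof(float) == sizeof(uint32_t). */
uint32_t float_bits_memcpy(const float *f)
{
    uint32_t bits;
    memcpy(&bits, f, sizeof bits);
    return bits;
}
```

A translator targeting C can often use the  memcpy  form and stay within the
standard; the cast form is where the extra guarantees would be needed.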
