Path: csiph.com!xmission!usenet.csail.mit.edu!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Subject: Re: Undefined Behavior Optimizations in C
Date: Fri, 06 Jan 2023 07:47:55 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <23-01-014@comp.compilers>
References: <23-01-009@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="29210"; mail-complaints-to="abuse@iecc.com"
Keywords: semantics, optimize
Posted-Date: 06 Jan 2023 11:57:45 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:3283

"Lucian Popescu" writes:
>I'm currently working on an academic project that analyzes the speedup gain of
>Undefined Behavior Optimizations in C.

Some earlier work:

@InProceedings{wang+12,
  author =    {Xi Wang and Haogang Chen and Alvin Cheung and Zhihao Jia
               and Nickolai Zeldovich and M. Frans Kaashoek},
  title =     {Undefined Behavior: What Happened to My Code?},
  booktitle = {Asia-Pacific Workshop on Systems (APSYS'12)},
  year =      {2012},
  url1 =      {http://homes.cs.washington.edu/~akcheung/getFile.php?file=apsys12.pdf},
  url2 =      {http://people.csail.mit.edu/nickolai/papers/wang-undef-2012-08-21.pdf}
}

The most interesting part wrt. the topic is section 3.3: They compiled
SPECint 2006 with gcc and clang, either with the default options or
with the behaviour defined by the options -fno-strict-overflow (or
-fwrapv), -fno-delete-null-pointer-checks, and -fno-strict-aliasing,
which disable most of what you call "UB Optimizations" in the compiler
versions they used (and probably still disable most "UB Optimizations";
new -fno-* flags have been added since, corresponding to some of the
newer "UB Optimizations", but I expect that the early "UB Optimizations"
have the most impact: you would not start with low-impact
"optimizations").

The results were that in 10 of the 12 programs there was no significant
performance difference.  In the remaining 2 programs, there were speed
differences by factors of 1.072, 1.09, 1.063, and 1.118 (depending on
the program and the compiler used).  However, they also found that
these speed differences were due to a single place in the source code
of each of the two programs, and that a small change to the source code
of each program could eliminate the speed difference.

@InProceedings{ertl15kps,
  author =   {M. Anton Ertl},
  title =    {What every compiler writer should know about programmers},
  crossref = {kps15},
  pages =    {112--133},
  url =      {http://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf},
  abstract = {In recent years C compiler writers have taken the attitude
    that tested and working production (i.e., \emph{conforming} according
    to the C standard) C programs are buggy if they contain undefined
    behavior, and they feel free to compile these programs (except
    benchmarks) in a way that they no longer work.  The justification
    for this attitude is that it allows C compilers to optimize better.
    But while these ``optimizations'' provide a factor 1.017 speedup
    with Clang-3.1 on SPECint 2006, for non-benchmarks it also has a
    cost: if we go by the wishes of the compiler maintainers, we have
    to ``fix'' our working, conforming C programs; this requires a
    substantial effort, and can result in bigger and slower code.  The
    effort could otherwise be invested in source-level optimizations by
    programmers, which can have a much bigger effect (e.g., a factor
    $>2.7$ for Jon Bentley's traveling salesman program).  Therefore,
    optimizations that assume that undefined behavior does not exist
    are a bad idea not just for security, but also for the performance
    of non-benchmark programs.}
}

@Proceedings{kps15,
  title =     {18. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'15)},
  booktitle = {18. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'15)},
  year =      {2015},
  key =       {KPS '15},
  editor =    {Jens Knoop and M. Anton Ertl},
  url =       {http://www.complang.tuwien.ac.at/kps2015/proceedings/},
  url-pdf =   {http://www.complang.tuwien.ac.at/kps2015/proceedings/kps15.pdf}
}

In section 4.2 I use the same way to determine the effectiveness of
"UB Optimizations" (called "optimizations" (with quotes) in the paper),
but with a longer list of flags for gcc-5.2.0:

  -fno-aggressive-loop-optimizations -fno-unsafe-loop-optimizations
  -fno-delete-null-pointer-checks -fno-strict-aliasing
  -fno-strict-overflow -fno-isolate-erroneous-paths-dereference -fwrapv

I used 8 different versions of Jon Bentley's Traveling Salesman program
translated to C (from Pascal).  I found that for clang-3.5 the flags
made no difference (same binary) for any of the programs, while for
gcc-5.2.0 they made no difference in the binary for three of the
programs and no significant speed difference for two more programs.
For the remaining three programs, the "speedups" from "UB
Optimizations" were by factors of 1.04, 0.95, and 1.02.
The paper contrasts this with the speedup seen from source-level
(manual) optimizations: the 8 programs were successive steps in manual
optimization, and the speedup seen there was a factor 2.7 when using
gcc-5.2.0 -O3, and more for other compilers.

>I argue that UB Optimizations introduce
>more risks than actual speedup gains.

It's hard to quantify the risks.  My argument has been that it is
cheaper and much more effective to perform manual optimization than to
ensure that a C program contains no undefined behaviour.

>Don't get me wrong, I'm not saying that the compiler should magically
>understand situations where the programmer makes UB mistakes and then
>repair them. What I'm saying is that the compiler should compile the code
>"as is" without making smart optimization transformations that break the
>intention of the programmer.

To which the advocates of "UB Optimizations" react by claiming that
they don't know the intention.  So I have written a followup paper that
discusses this aspect in more depth:

@InProceedings{ertl17kps,
  author =   {M. Anton Ertl},
  title =    {The Intended Meaning of \emph{Undefined Behaviour} in {C} Programs},
  crossref = {kps17},
  pages =    {20--28},
  url =      {http://www.complang.tuwien.ac.at/papers/ertl17kps.pdf},
  url2 =     {http://www.kps2017.uni-jena.de/proceedings/kps2017_submission_5.pdf},
  abstract = {All significant C programs contain undefined behaviour.
    There are conflicting positions about how to deal with that: One
    position is that all these programs are broken and may be compiled
    to arbitrary code.  Another position is that tested and working
    programs should continue to work as intended by the programmer with
    future versions of the same C compiler.  In that context the
    advocates of the first position often claim that they do not know
    the intended meaning of a program with undefined behaviour.  This
    paper explores this topic in greater depth.  The goal is to
    preserve the behaviour of existing, tested programs.
    It is achieved by letting the compiler define a consistent mapping
    of C operations to machine code; the compiler then has to stick to
    this behaviour during optimizations and in future releases.}
}

@Proceedings{kps17,
  title =     {19. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'17)},
  booktitle = {19. Kolloquium Programmiersprachen und Grundlagen der Programmierung (KPS'17)},
  year =      {2017},
  key =       {KPS '17},
  editor =    {Wolfram Amme and Thomas Heinze},
  url =       {http://www.kps2017.uni-jena.de/kps2017_proceedings.html},
  url-pdf =   {http://www.kps2017.uni-jena.de/proceedings/kps2017.pdf}
}

>My current progress is here [2]. I did not start the technical work, ATM I only
>have the research proposal. I reached you to see if you have any feedback
>on this proposal.
>[[2] https://tildegit.org/lucic71/dissertation/src/branch/master/TSW/tsw.pdf]

I have read Sections III and IV of this paper.

Section III: You cite my 2015 paper above.  About SPECint 2006, I cited
only Wang et al.'s paper (and computed the overall SPECint difference
based on their numbers).  For the traveling salesman program, the
speedup factor of 2.7 (or more, depending on compiler and optimization
level) is for source-level (i.e., manual) optimization, not "UB
Optimization".  The numbers for "UB Optimization" are, as mentioned
above, usually a factor of 1, and in the three cases where there is a
difference, the factors are 1.04, 0.95, and 1.02.

Section IV: You have the beginnings of a plan to evaluate the
performance; you leave the question of benchmarks open until you know
what changes in the resulting code are due to "UB Optimization".  It's
not clear what the point of such a benchmark selection would be.  It
certainly would not make the selection more representative, and it
would have the smell of cherry-picking.
Also, "UB Optimization" may cause changes in thousands of places in the
code of OpenBSD, but only a few of those places will be in hot code and
be relevant for performance.

What I miss is a plan to quantify the risks.  Wang et al. made a pretty
good case (as far as I am concerned, but the "UB Optimization" faithful
remain unconvinced) in one of their papers, where they analysed all the
Debian packages.

One question is: Given Wang et al.'s earlier work, what is the original
contribution of your intended research?  I discuss some alternative
research topics in Section 5.4 of my 2015 paper.

- anton
--
M. Anton Ertl  anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/