Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.compilers > #3284
| From | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
|---|---|
| Newsgroups | comp.compilers |
| Subject | Re: Undefined Behavior Optimizations in C |
| Date | 2023-01-06 08:41 +0000 |
| Organization | Institut fuer Computersprachen, Technische Universitaet Wien |
| Message-ID | <23-01-015@comp.compilers> (permalink) |
| References | <23-01-009@comp.compilers> <23-01-011@comp.compilers> <23-01-012@comp.compilers> |
gah4 <gah4@u.washington.edu> writes: >On Thursday, January 5, 2023 at 10:13:08 AM UTC-8, Spiros Bousbouras wrote: >> On 5 Jan 2023 10:05:49 +0000 >> "Lucian Popescu" <luc...@ctrl-c.club> wrote: > >> > I'm currently working on an academic project that analyzes the speedup gain of >> > Undefined Behavior Optimizations in C. >(snip) > >> > To test the theory that the UB Optimizations introduce more risks than >> > speedup gains, > >> Isn't this comparing apples and oranges ? > >Probably. You can compare apples and oranges: How many can you buy for, say, EUR 5? What's their nutritional value? etc. In a similar way, we can compare the advantages and costs of "UB Optimization". The advantages claimed by the faithful are speedups (usually unquantified, to let the the audience imagine high speedups). Wang et al. has done some quantification work, as have I; the results were that the speedups were small, or sometimes even slowdowns. Among the costs are that programmers have to change your program to eliminate standard-undefined behaviour. First you have to find that behaviour. The "UB Optimization" compilers have added "sanitizers" to let you find it; however, sanitizers work at run-time, so they won't find cases where, e.g., the "UB Optimization" "optimized" away a bounds-check preventing a buffer overflow unless there is a test case that tries to perform this buffer overflow. And once you have found the behaviour, you have to change it. This not only costs developers time, it can also cost performance. E.g., in Section 4.3 of <https://www.complang.tuwien.ac.at/kps2015/proceedings/KPS_2015_submission_29.pdf>, I estimate that converting Gforth to C without undefined behaviour would cause a slowdown by a factor >3; Gforth is currently compiled with as few "UB Optimizations" as possible (by using flags like -fwrapv and -fno-... flags). I expect that removing these flags will give a speedup by approximately a factor of 1. For those who believe differently, I have repeatedly posted a challenge <2017Aug18.152854@mips.complang.tuwien.ac.at> |Write a proof-of-concept Forth interpreter in the language you |advocate that runs at least one of bubble-sort, matrix-mult or sieve |from bench/forth in |<http://www.complang.tuwien.ac.at/forth/bench.zip> We can then check whether the language you advocate can compete with a low-level language by benchmarking your interpreter against Gforth. We can compare the advantages to the costs of "UB Optimization" in the currency of development time: How much development time does it cost to achieve as much speedup as the UB optimizations give? How much development time does it cost to add test cases for sanitizing, to run all the different sanitizers, to remove the undefined behaviour that you find, and to add manual optimizations that compensate any performance regressions that may turn up in that process? My expectation is that the latter development cost would often be far greater (and, in the case of the Forth interpreter, I expect that the performance regression compensation is not possible), but it would be nice if we had some empirical results for that. >Most important when debugging, is that you can trust the compiler to >do what you said. That they don't, has always been part of >optimization If a compiler produces something that behaves differently, it's not an optimization. One usually considers the I/O (but not timing) behaviour of the program when evaluating behaviour. When using a single-step debugger, you see execution at a more fine-grained level than the optimization was designed for. One approach is to debug the unoptimized program, and enable optimization afterwards. If it's a proper optimization, the I/O behaviour will be unchanged. >but these UB make it worse. "UB optimizations" assume that programs don't perform undefined behaviour, and they are transformations that are proper optimizations for such programs. The problem is that this assumption is usually wrong. >My thought would be all of SPEC SPEC CPU programs are not representative in many respects: First, they are chosen to be standard-compliant; it's not clear how SPEC tries to ensure this, but as the 464.h264ref function SATD shows, the process is not perfect. Still, a program that uses one of the flags for disabling "UB optimization" is going to be rejected by SPEC, so SPEC programs cannot be used for evaluating all the costs of "UB optimizations". Moreover, the "UB optimization" compilers are tuned for SPEC CPU: They compile the undefined behaviour in SPEC CPU programs (e.g., SATD) as intended, while they miscompile a non-SPEC CPU program with the same undefined behaviour (see gcc bug 66875). And given that the compiler maintainers (and others) evaluate the performance of the compilers using SPEC CPU, the compiler maintainers are motivated to add optimizations that speed up a SPEC CPU program; e.g., if an "UB optimization idea" breaks a test case one one of the tested programs (for gcc those reportedly nowadays include all automatically testable Debian programs, which is good), but benefits a SPEC CPU program, the compiler maintainer will be motivated to refine the criteria for applying the optimizations until the breakage disappears while still applying for the SPEC CPU program; if the "UB optimization" benefits only programs outside the relevant benchmarks, the compiler maintainer will be less likely to spend as much time on it. As a result, I expect to see more benefits from "UB Optimizations" for SPEC CPU than for, say, code that was written after the compiler was released (and was not tuned for the compiler). >I haven't thought about this so recently. Do compilers give warnings >when they remove such UB code? No. The faithful claim that it is impossible, but I don't think so: The compiler could internally compile the program twice: Once with the flags as given, and once with all the flags that the compiler has for defining undefined behaviour (-fwrapv and the various -fno-... options). If there is a difference in the resulting code, produce a warning. That warning could be for something that is a proper optimization for this particular program (i.e., a false positive), or it could be for miscompilation (a true positive); that's up to the developer to sort out. Once we have that framework in place, we can look at the false positives and refine it. E.g., there is a claim that many false negatives come from macro evaluations that are possible at compile time. In that case, one could add something to the source code that suppresses the warning (just like there is "__attribute__((unused))" for suppressing warnings about unused variables. - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/
Back to comp.compilers | Previous | Next — Previous in thread | Next in thread | Find similar
Undefined Behavior Optimizations in C "Lucian Popescu" <lucic71@ctrl-c.club> - 2023-01-05 10:05 +0000
RE: Undefined Behavior Optimizations in C "Nuno Lopes" <nuno.lopes@tecnico.ulisboa.pt> - 2023-01-05 10:24 +0000
Re: Undefined Behavior Optimizations in C Spiros Bousbouras <spibou@gmail.com> - 2023-01-05 18:06 +0000
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-05 16:22 -0800
Re: Undefined Behavior Optimizations in C anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2023-01-06 08:41 +0000
Re: Undefined Behavior Optimizations in C David Brown <david.brown@hesbynett.no> - 2023-01-06 16:12 +0100
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-06 10:33 -0800
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-06 11:39 -0800
Re: Undefined Behavior Optimizations in C Spiros Bousbouras <spibou@gmail.com> - 2023-01-07 12:10 +0000
Re: Undefined Behavior Optimizations in C antispam@math.uni.wroc.pl - 2023-01-13 20:46 +0000
Re: Undefined Behavior Optimizations in C Kaz Kylheku <864-117-4973@kylheku.com> - 2023-01-09 10:14 +0000
Re: Re: Undefined Behavior Optimizations in C Jon Chesterfield <jonathanchesterfield@gmail.com> - 2023-01-10 10:46 +0000
Re: Undefined Behavior Optimizations in C Thomas Koenig <tkoenig@netcologne.de> - 2023-01-11 09:34 +0000
Re: Undefined Behavior Optimizations in C Kaz Kylheku <864-117-4973@kylheku.com> - 2023-01-12 05:21 +0000
Re: Undefined Behavior Optimizations in C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-01-12 12:21 -0800
Re: Undefined Behavior Optimizations in C Thomas Koenig <tkoenig@netcologne.de> - 2023-01-12 21:50 +0000
Re: Undefined Behavior Optimizations in C Kaz Kylheku <864-117-4973@kylheku.com> - 2023-01-15 04:17 +0000
Re: Undefined Behavior Optimizations in C David Brown <david.brown@hesbynett.no> - 2023-01-11 14:20 +0100
Re: Undefined Behavior Optimizations in C Spiros Bousbouras <spibou@gmail.com> - 2023-01-18 13:14 +0000
Re: Undefined Behavior Optimizations in C David Brown <david.brown@hesbynett.no> - 2023-01-18 21:14 +0100
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-18 21:10 -0800
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-20 10:45 -0800
Re: Undefined Behavior Optimizations in C Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2023-01-20 13:54 -0800
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-23 18:50 -0800
Re: Undefined Behavior Optimizations in Fortran "Steven G. Kargl" <sgk@REMOVEtroutmask.apl.washington.edu> - 2023-01-26 21:12 +0000
Re: Undefined Behavior Optimizations in Fortran gah4 <gah4@u.washington.edu> - 2023-01-26 17:50 -0800
Re: Undefined Behavior Optimizations in C "Alexei A. Frounze" <alexfrunews@gmail.com> - 2023-01-19 21:18 -0800
Re: Undefined Behavior Optimizations in C Thomas Koenig <tkoenig@netcologne.de> - 2023-01-20 20:42 +0000
Re: Undefined Behavior Optimizations in C anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2023-01-21 11:54 +0000
Re: Undefined Behavior Optimizations in C anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2023-01-22 09:56 +0000
Re: Undefined Behavior Optimizations in C Kaz Kylheku <864-117-4973@kylheku.com> - 2023-01-22 07:04 +0000
Re: Undefined Behavior Optimizations in C Martin Ward <martin@gkc.org.uk> - 2023-01-23 17:12 +0000
Re: Undefined Behavior Optimizations in C David Brown <david.brown@hesbynett.no> - 2023-01-10 17:32 +0100
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-10 15:57 -0800
Re: Undefined Behavior Optimizations in C David Brown <david.brown@hesbynett.no> - 2023-01-11 14:40 +0100
Re: Undefined Behavior Optimizations in C gah4 <gah4@u.washington.edu> - 2023-01-11 16:09 -0800
Re: Undefined Behavior Optimizations in C dave_thompson_2@comcast.net - 2023-01-28 10:35 -0500
Re: Undefined Behavior Optimizations in C anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2023-01-06 07:47 +0000
Re: Undefined Behavior Optimizations in C Kaz Kylheku <864-117-4973@kylheku.com> - 2023-01-09 09:10 +0000
csiph-web