Path: csiph.com!xmission!weretis.net!feeder9.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: antispam@fricas.org
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour in C23
Date: Sat, 23 Aug 2025 15:45:03 -0000
Organization: Compilers Central
Sender: johnl%iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <25-08-013@comp.compilers>
References: <25-08-006@comp.compilers>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="22288"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 23 Aug 2025 15:03:13 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:3687

Martin Ward wrote:
> On 20/08/2025 14:06, John wrote:
>> When a language is 50 years old and there is a mountain of legacy code that
>> they really don't want to break, it accumulates a lot of cruft. If we were
>> starting now we'd get something more like Go.
>>
>> On the other hand, there's the python approach in which they deprecate and
>> remove little used and crufty features, but old python code doesn't work any
>> more unless you go back and update it every year or two. -John]
>
> Legacy behaviour and undefined behaviour are orthogonal concepts.
>
> Any language ought to have fully defined behaviour: even if the
> definition is simply "syntax error" or "the program exits with a
> suitable error message". A language can be extended with new
> constructs and new functions: then the behaviour changes from "syntax
> error" or "error message" to the new functionality.
>
> Whether or not old behaviour is preserved is a completely
> separate issue.

It is rather strange to see you as the source of such a requirement.
There are both theoretical obstacles and practical ones.
Each language effectively defines a theory of computation, and it is
well known that interesting theories are incomplete, that is, they do
not define everything.

On the practical side, error detection has its costs, and some
problems, such as aliasing, are so hard to detect that runtime
detection is impractical. Trying to define things is of little use if
the defined operations do "wrong" things (we learned this lesson from
PL/I). The problems are most acute for languages that aim at high
runtime efficiency, like C. For example, using gcc as a backend, a
Pascal 'for' loop which correctly handled both edge cases ran at half
the speed of a similar C loop which ignored the edge cases. AFAIK all
lower-level languages have some amount of deliberately undefined
behaviour. Pascal has a lot. Ada has less, but it is still there.

From my point of view the most promising approach to a "fully
defined" language is to require a proof that no problematic situation
can occur during execution (and to reject programs with faulty
proofs). But that is too demanding of programmers. Even if a
programmer can informally prove that a program is correct, providing
a machine-checkable proof is a lot of work. However, most compiler
writers consider proofs a separate matter, not part of the language
(an exception may be SPARK Ada, which IIUC now integrates some proof
checking into the compiler). For me, CompCert indicates that such
separation is possible and may be fruitful: people who believe that
they write flawless code get an optimising compiler which does not
bother them with proofs; people who want proofs may get them.

--
Waldek Hebisch