Path: csiph.com!xmission!weretis.net!feeder9.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: antispam@fricas.org
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour in C23
Date: Sat, 23 Aug 2025 15:45:03 -0000
Organization: Compilers Central
Sender: johnl%iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <25-08-013@comp.compilers>
References: <25-08-006@comp.compilers>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="22288"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 23 Aug 2025 15:03:13 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:3687

Martin Ward wrote:
> On 20/08/2025 14:06, John wrote:
>> When a language is 50 years old and there is a mountain of legacy code that
>> they really don't want to break, it accumulates a lot of cruft. If we were
>> starting now we'd get something more like Go.
>>
>> On the other hand, there's the python approach in which they deprecate and
>> remove little used and crufty features, but old python code doesn't work any
>> more unless you go back and update it every year or two. -John]
>
> Legacy behaviour and undefined behaviour are orthogonal concepts.
>
> Any language ought to have fully defined behaviour: even if the
> definition is simply "syntax error" or "the program exits with a
> suitable error message". A language can be extended with new
> constructs and new functions: then the behaviour changes from "syntax
> error" or "error message" to the new functionality.
>
> Whether or not old behaviour is preserved is a completely
> separate issue.

It is rather strange to see you as the source of such a requirement.
There are both theoretical obstacles and practical ones.
Each language effectively defines a theory of computation, and it is
well known that interesting theories are incomplete, that is, they do
not define everything.

On the practical side, error detection has its costs, and some
problems, such as aliasing, are so hard to detect that runtime
detection is impractical. Trying to define things is of little use if
the defined operations do "wrong" things (we learned this lesson from
PL/I). The problems are most acute for languages that aim at high
runtime efficiency, like C. For example, using gcc as a backend, a
Pascal 'for' loop which correctly handled both edge cases ran at half
the speed of a similar C loop which ignored the edge cases. AFAIK all
lower-level languages have some amount of deliberately undefined
behaviour. Pascal has a lot. Ada has less, but it is still there.

From my point of view the most promising approach to a "fully
defined" language is to require a proof that no problematic situation
can occur during execution (and to reject programs with faulty
proofs). But that is too demanding of programmers. Even if a
programmer can informally prove that a program is correct, providing
a machine-checkable proof is a lot of work. However, most compiler
writers consider proofs a separate matter, not part of the language
(an exception may be SPARK Ada, which IIUC now integrates some proof
checking into the compiler). For me, CompCert indicates that such
separation is possible and may be fruitful: people who believe that
they write flawless code get an optimising compiler which does not
bother them with proofs; people who want proofs may get them.

--
Waldek Hebisch