From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Tue, 12 Sep 2023 10:32:55 -0700
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <86pm2nmh4o.fsf@linuxsc.com>
References: <20230901114625.198@kylheku.com> <20230901135123.702@kylheku.com> <878r9p7b13.fsf@nosuchdomain.example.com> <20230901175635.91@kylheku.com> <87edjeujqw.fsf@bsb.me.uk> <87tts9sbo8.fsf@bsb.me.uk> <87pm2wq8se.fsf@bsb.me.uk>
Ben Bacarisse writes:
> This text:
>
> int main(void) { int a[] = {1}; return a[2]; }
>
> is meaningless (as C). It does not "do" anything because it
> means nothing.

I would like to offer a different perspective on this question.
I take the meaning of a C program to be a mapping from inputs
to outcomes. The term outcome is meant to be "the result" of
giving a particular input to a program, without being too
specific about what counts as part of a result. For example,
"outcome" includes the observable behavior of a program, but it
also includes exit status, which is not part of the observable
behavior. (I needed to check the definition of observable
behavior to verify that.)

The first point is that outcome is multi-valued: a result might
be a single outcome, or it might be a set of possible outcomes.
A program that relies on unspecified behavior can produce
different results depending on what choices are made in each case
where the program depends on unspecified behavior. (Side point:
I think we can treat implementation-defined behavior as simply
one more component of program input. In any case we will not
consider it further, as it shouldn't cause any serious problems.)
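To make the idea of a multi-valued outcome concrete, here is a minimal sketch in C. The names trace, f, g, and sum_with_trace are made up for illustration. The order in which the two operands of + are evaluated is unspecified, so the recorded trace may come out either way, and I would say the outcome set for this fragment has two elements.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical trace buffer recording the order in which f and g run. */
static char trace[3];
static size_t trace_len;

static int f(void) { trace[trace_len++] = 'f'; return 1; }
static int g(void) { trace[trace_len++] = 'g'; return 2; }

/* The two calls are indeterminately sequenced: an implementation may
   call f first or g first.  The sum is 3 either way, but afterwards
   trace holds either "fg" or "gf". */
int sum_with_trace(void)
{
    trace_len = 0;
    memset(trace, 0, sizeof trace);
    return f() + g();
}
```

Neither ordering is an error; a program that printed trace would simply have both "fg" and "gf" among its possible results.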
Viewed from this perspective, the meaning of the program above
is a set containing all possible outcomes (and incidentally that
is the result for all inputs). Of course it can be the case that
some programs have infinite outcome sets for some inputs and
finite outcome sets for other inputs. It may be surprising but
there is actually some positive information content in saying the
meaning of a program is a set of all possible outcomes. To see
that, consider this program:

    int main(void) { return 1; };

No doubt there are different points of view about what the outcome
set of this program should be, but we can be sure of one thing: if
there are any elements in this program's outcome set, every one
includes at least one diagnostic message, caused by the syntax
error. So even if the outcome set is infinite, it is a proper
subset of the outcome set of the previous program with undefined
behavior (and no syntax errors or constraint violations), because
some of those outcomes do not include any diagnostics.

My inclination here is to say the second program is "meaningless"
whereas the first program does have a "meaning", even if not an
especially useful one. That view seems consistent with what is
said in the ISO C standard, the very first sentence of which
reads

    This International Standard specifies the form and
    establishes the interpretation of programs expressed
    in the programming language C.

The "meaning" of a proposed program text is what interpretation is
specified for it. The second program doesn't satisfy the form of
programs written in C, and hence does not have an interpretation
specified for it: meaningless. The first program does satisfy the
form of programs written in C, and so does have an interpretation
specified for it, even though what is specified is unboundedly
liberal. Furthermore I think the distinction described corresponds
to how we normally think of the words used. The second program is
"meaningless" no matter what input it is given. But it's fairly
easy to construct programs that have undefined behavior only on
very large inputs (over, say, 2**128 characters). It seems wrong
to call a program "meaningless" if its behavior is well-defined for
all inputs it will ever be run on. It's much nicer to say that such
a program is meaningful, with the outcome sets for some inputs being
infinite.
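For illustration, here is one way such a program might be sketched. The names count128, count128_bump, and consume are hypothetical, and I am assuming an implementation that provides uint64_t. The counter wraps only after 2**128 characters have been read, and only then does the final array access go out of bounds.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 128-bit character counter built from two 64-bit words. */
struct count128 { uint64_t lo, hi; };

/* Add one to the counter.  Returns 1 if the counter wrapped around
   from 2**128 - 1 to 0, and 0 otherwise.  Unsigned arithmetic wraps,
   so the increments themselves are always well defined. */
int count128_bump(struct count128 *c)
{
    if (++c->lo != 0)
        return 0;
    return ++c->hi == 0;
}

/* Read a stream to the end.  For any input shorter than 2**128
   characters this returns table[0].  For longer inputs the index is 2,
   which is out of bounds: undefined behavior. */
int consume(FILE *in)
{
    static const int table[] = { 0 };
    struct count128 n = { 0, 0 };
    int wrapped = 0;

    while (getc(in) != EOF)
        wrapped |= count128_bump(&n);
    return table[wrapped * 2];
}
```

A main that simply returns consume(stdin) would then be a program of the kind described: well defined for every input anyone could actually supply.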
So for what it's worth, there is another take on the matter.