From: Tim Rentsch
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Tue, 12 Sep 2023 10:32:55 -0700
Message-ID: <86pm2nmh4o.fsf@linuxsc.com>

Ben Bacarisse writes:

> This text:
>
>   int main(void) { int a[] = {1}; return a[2]; }
>
> is meaningless (as C).  It does not "do" anything because it
> means nothing.

I would like to offer a different perspective on this question.

I take the meaning of a C program to be a mapping from inputs to
outcomes. The term outcome is meant to be "the result" of giving
a particular input to a program, without being too specific about
what counts as part of a result. For example, "outcome" includes
the observable behavior of a program, but it also includes exit
status, which is not part of the observable behavior. (I needed
to check the definition of observable behavior to verify that.)

The first point is that outcome is multi-valued: a result might
be a single outcome, or it might be a set of possible outcomes.
A program that relies on unspecified behavior can produce
different results depending on what choices are made in each case
where the program depends on unspecified behavior.
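
To make the multi-valued point concrete, here is a minimal sketch of
a computation whose outcome set has exactly two members. The names
note, trace, sum_with_trace, and show_order are invented for this
illustration. The order in which the two operands of + are evaluated
is unspecified, so the recorded trace may be either "ab" or "ba",
while the sum is 3 either way.

```c
#include <stdio.h>

/* Records the order in which the note() calls actually happen. */
char trace[3];
int trace_len;

/* Appends tag to the trace and passes value through unchanged. */
int note(char tag, int value)
{
    trace[trace_len++] = tag;
    trace[trace_len] = '\0';
    return value;
}

/* Always returns 3, but leaves either "ab" or "ba" in trace:
   the evaluation order of the operands of + is unspecified.
   The two calls are indeterminately sequenced, not unsequenced,
   so this is unspecified behavior and not undefined behavior. */
int sum_with_trace(void)
{
    trace_len = 0;
    return note('a', 1) + note('b', 2);
}

/* Prints whichever order this implementation happened to choose. */
void show_order(void)
{
    int sum = sum_with_trace();
    printf("sum = %d, order = %s\n", sum, trace);
}
```

Calling show_order() from main reports one of the two permitted
orders. Both are equally valid results, so the meaning of such a
program, for its (empty) input, is a set of two outcomes.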

(Side point: I think we can treat implementation-defined behavior
as simply one more component of program input. In any case we
will not consider it further, as it shouldn't cause any serious
problems.)

Viewed from this perspective, the meaning of the program above is
a set containing all possible outcomes (and incidentally that is
the result for all inputs). Of course it can be the case that
some programs have infinite outcome sets for some inputs and
finite outcome sets for other inputs.

It may be surprising but there is actually some positive
information content in saying the meaning of a program is a set
of all possible outcomes. To see that, consider this program:

   int main(void) { return 1; };

No doubt there are different points of view about what the
outcome set of this program should be, but we can be sure of one
thing: if there are any elements in this program's outcome set,
every one includes at least one diagnostic message, caused by the
syntax error. So even if the outcome set is infinite, it is a
proper subset of the outcome set of the previous program with
undefined behavior (and no syntax errors or constraint
violations), because some of those outcomes do not include any
diagnostics.

My inclination here is to say the second program is "meaningless"
whereas the first program does have a "meaning", even if not an
especially useful one. That view seems consistent with what is
said in the ISO C standard, the very first sentence of which reads

   This International Standard specifies the form and establishes
   the interpretation of programs expressed in the programming
   language C.

The "meaning" of a proposed program text is what interpretation
is specified for it. The second program doesn't satisfy the form
of programs written in C, and hence does not have an
interpretation specified for it: meaningless.
The first program does satisfy the form of programs written in C,
and so does have an interpretation specified for it, even though
what is specified is unboundedly liberal.

Furthermore I think the distinction described corresponds to how
we normally think of the words used. The second program is
"meaningless" no matter what input it is given. But it's fairly
easy to construct programs that have undefined behavior only on
very large inputs (over, say, 2**128 characters). It seems wrong
to call a program "meaningless" if its behavior is well-defined
for all inputs it will ever be run on. It's much nicer to say
that such a program is meaningful, with the outcome sets for some
inputs being infinite.

So for what it's worth, there is another take on the matter.
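
P.S. As an illustration of the point about undefined behavior
arising only on large inputs, here is a minimal sketch. It uses a
much smaller threshold than the 2**128 mentioned above, namely
INT_MAX characters, and the function name count_chars is invented
for the example. The function is well-defined for any input of at
most INT_MAX characters; only a longer input makes the increment
overflow a signed int, which is undefined behavior.

```c
#include <stdio.h>

/* Counts the characters remaining on the stream.
   Well-defined as long as the stream holds at most INT_MAX
   characters. On a longer stream the increment below overflows
   a signed int, and signed overflow is undefined behavior. */
int count_chars(FILE *in)
{
    int n = 0;

    while (getc(in) != EOF)
        n++;    /* undefined behavior once n == INT_MAX */

    return n;
}
```

For every input short enough, the outcome set of a program built
around count_chars(stdin) has a single element; only for the very
long inputs does it become the set of all possible outcomes.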