Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Spiros Bousbouras Newsgroups: comp.compilers Subject: Re: Software proofs, was Are there different programming languages that are compiled to the same intermediate language? Date: Sun, 5 Feb 2023 15:14:02 -0000 (UTC) Organization: A noiseless patient Spider Sender: johnl@iecc.com Approved: comp.compilers@iecc.com Message-ID: <23-02-019@comp.compilers> References: <23-01-092@comp.compilers> <23-02-003@comp.compilers> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="24260"; mail-complaints-to="abuse@iecc.com" Keywords: theory, comment Posted-Date: 05 Feb 2023 13:43:10 EST X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com X-Server-Commands: nowebcancel In-Reply-To: <23-02-003@comp.compilers> X-Organisation: Weyland-Yutani Xref: csiph.com comp.compilers:3365 On 01 Feb 2023 08:07:24 GMT arnold@freefriends.org (Aharon Robbins) wrote: > I've never understood this. Isn't there a chicken and egg problem? > How do we know that the theorem prover is correct and bug free? We prove it by hand ? I don't know if anyone has done this for the provers mentioned but it feels like a feasible task and when it's done , it's done i.e. a theorem prover is not the kind of software which continually gets updated. > [It's a perfectly reasonable question. Alan Perlis, who was my thesis > advisor, never saw any reason to believe that a thousand line proof > was any more likely to be bug-free than a thousand line program. > -John] It depends on what we mean by "bug" , in mathematics we wouldn't say "bug" but "error". First you have *typos* which is situations where one was meaning to , for example , write i and they wrote j instead. It may be that mathematics proofs have more of those than computer programmes because in computer programmes some will be caught by the compiler like for example if j has not been declared. But the essential difference is that a typo is harmless in mathematics , the reader of the proof will either notice and understand what was intended or even unconsciously replace what is written with what was intended. But in computer programmes this kind of typo bug can be catastrophic. Hence in mathematics we wouldn't say that a proof has errors just because it has a typo , we would say it has a typo. For other errors , logic errors , I would guess that computer programmes will have more errors/bugs than mathematical proofs. This assertion is based on personal experience and instinct to some extent , having written both many programmes and many proofs. But I will attempt to give some of what I think are reasons : 1: In mathematics you work with a formalism right from the start but in computer programmes often one has an intuitive understanding of what the program is meant to achieve and a working programme may be considered to have a bug when its behaviour turns out not to match closely enough that intuitive understanding. The difference in the level of formalism is not just about the programmes but about the programming languages too. In standards of programming languages I've read , the language is not described in precise mathematical detail. The grammar usually is but not other parts of the language. For example , from the C standard : The result of the binary + operator is the sum of the operands. Even just for integer arguments , this is nowhere a complete definition but to piece together a complete definition one must combine information scattered in faraway parts of the standard. This would never occur in a mathematics books or paper , the complete definition of a function or operator or whatever (including the precise set for which it is defined) would appear in one place. This sort of thing where one has to piece together information from many parts of the document to arrive at a complete definition , seems to be common in computer language standards and I have no idea why it happens , I don't think anything about specifying a programming language forces things to be done like this. But the result is that programmers will often work with a mix of precise specification and intuitive understanding of the language they're using. 2: In mathematics a correct proof *is* the goal but in computer programming some set of behaviours is the goal and the set of behaviours may not even be the ultimate goal but the desire to make a profit is. For example en.wikipedia.org/wiki/Pac-Man : Due to an integer overflow, the 256th level loads improperly, rendering it impossible to complete. But this didn't prevent the game from being highly successful. "Donkey Kong" also had a bug which prevented progress from some point onwards. What is the goal also influences the kind of people who specialise in computer programming vs those who write mathematical proofs. My impression is that there exist computer programmers who are of the "trial and error" type who write some code which seems to approximate what they have in mind , test it , write some more code and so forth. Eventually they arrive at something which seems to match their mental model and they consider themselves done. They wouldn't have any interest and possibly ability to write a proof of correctness. As a practical matter , they may be perfectly productive and write code which works for the desired cases. 3: There is a kind of bug/error for which there's no mathematical analogue : From "Expert C Programming" by Peter van der Linden : One release of an ANSI C compiler had an interesting bug. The symbol table was accessed by a hash function that computed a likely place from which to start a serial search. The computation was copiously commented, even describing the book the algorithm came from. Unfortunately, the programmer omitted to close the comment! The entire hash initial value calculation thus fell inside the continuing comment, [...] The entire calculation of the initial hash value was omitted, so the table was always searched serially from the zeroth element! As a result, symbol table lookup (a very frequent operation in a compiler) was much slower than it should have been. This was never found during testing because it only affected the speed of a lookup, not the result. This is why some compilers complain if they notice an opening comment in a comment string. The error was eventually found in the course of looking for a different bug. Inserting the closing comment resulted in an immediate compilation speedup of 15%! [The history of formal description of programming languages is not encouraging. Back in the 1970s the ANSI PL/I standard was defined in terms of formal operations on trees, but even so it had bugs/errors. Also, per your comment on addition, in a lot of cases the practical definition turns into whatever the underlying machine's instruction does. Until recently the C standard was deliberately unclear about whether arithmetic was ones- or twos-complement. -John]