| Started by | John R Levine <johnl@taugh.com> |
|---|---|
| First post | 2024-06-10 14:21 +0200 |
| Last post | 2024-06-12 11:27 +0200 |
| Articles | 8 — 5 participants |
Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages John R Levine <johnl@taugh.com> - 2024-06-10 14:21 +0200
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages Jon Chesterfield <jonathanchesterfield@gmail.com> - 2024-06-10 19:20 +0100
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages Derek <derek-nospam@shape-of-code.com> - 2024-06-11 00:28 +0100
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages anton@mips.complang.tuwien.ac.at - 2024-06-11 07:57 +0000
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages Derek <derek-nospam@shape-of-code.com> - 2024-06-11 22:45 +0100
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages anton@mips.complang.tuwien.ac.at - 2024-06-14 16:00 +0000
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages Derek <derek-nospam@shape-of-code.com> - 2024-06-10 20:30 +0100
Re: Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2024-06-12 11:27 +0200
| From | John R Levine <johnl@taugh.com> |
|---|---|
| Date | 2024-06-10 14:21 +0200 |
| Subject | Compilation Quotient (CQ): A Metric for the Compilation Hardness of Programming Languages |
| Message-ID | <24-06-003@comp.compilers> |
This preprint from TU Delft and ETH Zurich generates small programs from the grammars of several popular programs, and calculates CQ, which is roughly the percentage (0-100) that compile, intended as a proxy for how hard the languages are to write. C has a CQ of 48, Rust barely above zero.

In the discussion at the end they say "A programmer's task is to write programs that compile." which I think summarizes the basic problem with the paper.

Take a look.

https://arxiv.org/abs/2406.04778

Regards,
John Levine, johnl@taugh.com, Taughannock Networks, Trumansburg NY
Please consider the environment before reading this e-mail. https://jl.ly
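To make the metric concrete, here is a minimal sketch of the idea as summarized above, not the paper's actual tooling: sample strings from a context-free grammar, hand each one to a compiler, and report the percentage accepted. The toy grammar, the names `GRAMMAR`, `sample`, `compiles` and `compilation_quotient`, and the use of Python's built-in `compile()` as the stand-in compiler are all assumptions made for illustration.

```python
import random

# Toy context-free grammar (an assumption, not taken from the paper).
# Each nonterminal maps to a list of alternatives; each alternative is a
# list of symbols, which are either nonterminals or literal strings.
GRAMMAR = {
    "stmt": [["simple"], ["while x: ", "simple"], ["def f(): ", "simple"]],
    "simple": [["x = 1"], ["pass"], ["break"], ["return x"]],
}


def sample(symbol, rng):
    """Expand a symbol by picking alternatives uniformly at random."""
    if symbol not in GRAMMAR:
        return symbol  # literal text
    alternative = rng.choice(GRAMMAR[symbol])
    return "".join(sample(part, rng) for part in alternative)


def compiles(source):
    """Return True if Python's compiler accepts the source text."""
    try:
        compile(source, "<sample>", "exec")
        return True
    except SyntaxError:
        return False


def compilation_quotient(n, seed=0):
    """Percentage (0-100) of n sampled programs that compile."""
    rng = random.Random(seed)
    accepted = sum(compiles(sample("stmt", rng)) for _ in range(n))
    return 100.0 * accepted / n


if __name__ == "__main__":
    print(f"toy CQ: {compilation_quotient(1000):.1f}")
```

In this toy grammar 8 of the 12 possible derivations compile. The rejected ones are `break` outside a loop and `return` outside a function, which the grammar permits but the compiler does not.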
| From | Jon Chesterfield <jonathanchesterfield@gmail.com> |
|---|---|
| Date | 2024-06-10 19:20 +0100 |
| Message-ID | <24-06-005@comp.compilers> |
| In reply to | #3572 |
Curious paper, thank you.

The probability that a program generated by the grammar fails semantic analysis does seem an interesting value. Estimating it by sampling from a property-based tester seems reasonable too.

I don't think this says anything meaningful about the experience of programming in one of these, as grammar and sema errors are both reported early. It probably does indicate cases that a given language could detect earlier by changing its grammar.

Jon

[I had two other thoughts. One was that you can tell C was written when parsing was still hard enough that you didn't want to bulk the parsers up with semantic stuff. The other was that in the languages where it is hard to write a valid program, how much more likely is it that the program actually works once you get it to compile? -John]
| From | Derek <derek-nospam@shape-of-code.com> |
|---|---|
| Date | 2024-06-11 00:28 +0100 |
| Message-ID | <24-06-009@comp.compilers> |
| In reply to | #3573 |
John,

> [I had two other thoughts. One was that you can tell C was written when
> parsing was still hard enough that you didn't want to bulk the parsers
> up with semantic stuff. The other was that in the languages where it is
> hard to write a valid program, how much more likely is it that the program
> actually works once you get it to compile? -John]

C was created after Algol 68, whose 2-level grammar contained syntax+semantics.

Algol 68 programs automatically generated from the language grammar should compile just fine. I suspect that output would be rare, because generating the code needed to produce output would be uncommon, and the path to it would be the end result of a drunkard's walk.

C had a kind-of conventional grammar, whereas the Algol 68 grammar is certainly not conventional (it might even be unique).

[I never heard of any other language using VW-grammars. In C's defense, the early compilers -John]
| From | anton@mips.complang.tuwien.ac.at |
|---|---|
| Date | 2024-06-11 07:57 +0000 |
| Message-ID | <24-06-011@comp.compilers> |
| In reply to | #3573 |
John Levine:

>[I had two other thoughts. One was that you can tell C was written when
>parsing was still hard enough that you didn't want to bulk the parsers
>up with semantic stuff.

To me it looks the other way 'round: syntax specification formalisms such as BNF inspired programming language designers to put a lot of stuff in syntax, because that was formal. E.g., Algol 60 differentiates between booleans and other values on the syntax level. Algol 68 introduced Van Wijngaarden grammars to specify the type system and the syntax in one syntactic formalism.

Other, later languages have reduced the scope of syntax (often only slightly), and specify the type system as a separate entity. Interestingly, I am not aware of a widely successful formalism for type systems, even though many programming languages specify static type systems and their implementations have to perform static type checking (plus there is also dynamic type checking).

>The other was that in the languages where it is
>hard to write a valid program, how much more likely is it that the program
>actually works once you get it to compile? -John]

That is the promise of programming languages that make it hard to get a program to compile: get it to compile, and it is usually correct. I am not aware of any empirical evidence that supports this promise.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
| From | Derek <derek-nospam@shape-of-code.com> |
|---|---|
| Date | 2024-06-11 22:45 +0100 |
| Message-ID | <24-06-014@comp.compilers> |
| In reply to | #3576 |
John, Anton,

>> The other was that in the languages where it is
>> hard to write a valid program, how much more likely is it that the program
>> actually works once you get it to compile? -John]
>
> That is the promise of programming languages that make it hard to get
> a program to compile: get it to compile, and it is usually correct. I
> am not aware of any empirical evidence that supports this promise.

Requiring that variables are defined before use decreases incorrectness (which is not a marketable term).

There is a tiny amount of evidence that strong typing may be a benefit:
https://shape-of-code.com/2014/08/27/evidence-for-the-benefits-of-strong-typing-where-is-it/

Cost effectiveness of benefits is a question that researchers avoid (it smacks of grubby usefulness).

If you are interested in evidence, check out my book, Evidence-based Software Engineering, which discusses what is currently known about software engineering, based on an analysis of all the publicly available data. The pdf, code and all data are freely available here: http://knosof.co.uk/ESEUR/

If you know of any interesting software engineering data that I don't have, please tell me about it.
| From | anton@mips.complang.tuwien.ac.at |
|---|---|
| Date | 2024-06-14 16:00 +0000 |
| Message-ID | <24-06-019@comp.compilers> |
| In reply to | #3578 |
Derek <derek-nospam@shape-of-code.com> writes:

>> That is the promise of programming languages that make it hard to get
>> a program to compile: get it to compile, and it is usually correct. I
>> am not aware of any empirical evidence that supports this promise.
>
>Requiring that variables are defined before use
>decreases incorrectness (which is not a marketable term).

It's not hard to get a program to compile if the compiler requires definition before use.

The languages for which I have heard the claim the most are Haskell and Rust. I remember talking at a conference to someone who worked on the register allocator of IIRC SML/NJ (ML is an eager language on which the syntax and type system of Haskell are based AFAICT), and it did not sound like the promise had been achieved. I also wonder how all the correctness criteria of a register allocator could be modeled as Haskell or Rust types.

>If you are interested in evidence, check out
>My book, Evidence-based Software Engineering, which
>discusses what is currently known about software engineering,
>based on an analysis of all the publicly available data
>pdf+code+all data freely available here:
>http://knosof.co.uk/ESEUR/

Cool book. If only I had more time to read all the interesting books.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
| From | Derek <derek-nospam@shape-of-code.com> |
|---|---|
| Date | 2024-06-10 20:30 +0100 |
| Message-ID | <24-06-006@comp.compilers> |
| In reply to | #3572 |
John,

> This preprint from TU Delft and ETH Zurich generates small programs from
> the grammars of several popular programs, and calculates CQ, which is
> roughly the percentage (0-100) that compile, intended as a proxy for how
> hard the languages are to write. C has a CQ of 48, Rust barely above
> zero.

The paper Programming Languages vs. Fat Fingers
https://www2.dmst.aueb.gr/dds/blog/20121205/index.html
made small changes to existing code, in various languages, and then measured how many compiled, ran and produced the correct output.
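The kind of experiment described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the cited paper: the sample program `SOURCE`, the single mutation operator (swapping two adjacent characters) and the names `mutants` and `classify` are all hypothetical.

```python
from collections import Counter

# Hypothetical program under test and its known-good result.
SOURCE = "def area(w, h):\n    return w * h\nresult = area(3, 4)\n"
EXPECTED = 12


def mutants(source):
    """Yield every variant with two differing adjacent characters swapped."""
    for i in range(len(source) - 1):
        if source[i] != source[i + 1]:
            yield source[:i] + source[i + 1] + source[i] + source[i + 2:]


def classify(source):
    """Say how far a program gets: compile, run, produce the right answer."""
    try:
        code = compile(source, "<mutant>", "exec")
    except SyntaxError:
        return "rejected at compile time"
    env = {}
    try:
        exec(code, env)
    except Exception:
        return "failed at run time"
    if env.get("result") == EXPECTED:
        return "correct output"
    return "wrong output"


if __name__ == "__main__":
    tally = Counter(classify(m) for m in mutants(SOURCE))
    for outcome, count in tally.most_common():
        print(f"{count:3d}  {outcome}")
```

The "wrong output" bucket is the one that matters for the argument in this thread: it counts the slips that no compile-time check caught.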
| From | Hans-Peter Diettrich <DrDiettrich1@netscape.net> |
|---|---|
| Date | 2024-06-12 11:27 +0200 |
| Message-ID | <24-06-016@comp.compilers> |
| In reply to | #3572 |
On 6/10/24 2:21 PM, John R Levine wrote:

> generates small programs from
> the grammars of several popular programs,

I think that the *syntactic grammar* of programming *languages* is meant:

>> The key idea is to measure the compilation success rates of programs sampled from context-free grammars. <<

Then I wonder how valid random programs can ever be generated for languages that require a declaration before use of an identifier, clearly a *semantic* issue. A CQ of 40 for C indicates to me that certain semantic rules have been built into the program generator.

Or what did I not understand right?

DoDi

[The paper describes the grammars they use. C grammar requires declarations precede other statements so that's easy to get right. -John]
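One way to see how declaration-before-use can be satisfied by chance is a small simulation. This is a sketch under assumptions of my own, not the generator used in the paper: blocks place all declarations before all uses (the ordering the moderator's note describes), and both draw names from a tiny hypothetical pool `NAMES`. The functions `random_block`, `declared_before_use` and `pass_rate` are likewise illustrative.

```python
import random

NAMES = ["a", "b", "c"]  # hypothetical identifier pool


def random_block(rng, n_decls=2, n_uses=2):
    """Build a block whose declarations all precede its uses."""
    decls = [("decl", rng.choice(NAMES)) for _ in range(n_decls)]
    uses = [("use", rng.choice(NAMES)) for _ in range(n_uses)]
    return decls + uses


def declared_before_use(block):
    """Semantic check: every used name must already have been declared."""
    declared = set()
    for kind, name in block:
        if kind == "decl":
            declared.add(name)
        elif name not in declared:
            return False
    return True


def pass_rate(n, seed=0):
    """Fraction of n random blocks that pass the semantic check."""
    rng = random.Random(seed)
    return sum(declared_before_use(random_block(rng)) for _ in range(n)) / n


if __name__ == "__main__":
    print(f"fraction passing: {pass_rate(10_000):.2f}")
```

With a small name pool and declarations grouped first, a sizeable share of random blocks pass without the generator knowing anything about scoping. How closely this resembles the paper's setup is not something this thread establishes.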