Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.compilers > #2849 > unrolled thread
| Started by | Christopher F Clark <christopher.f.clark@compiler-resources.com> |
|---|---|
| First post | 2022-01-20 21:11 +0200 |
| Last post | 2022-01-22 09:34 +0000 |
| Articles | 4 — 3 participants |
Back to article view | Back to comp.compilers
Re: Are compiler developers light-years ahead of other software development? Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-01-20 21:11 +0200
Re: Are compiler developers light-years ahead of other software development? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2022-01-21 17:38 +0000
Re: Are compiler developers light-years ahead of other software development? gah4 <gah4@u.washington.edu> - 2022-01-21 17:40 -0800
Re: Are compiler developers light-years ahead of other software development? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2022-01-22 09:34 +0000
| From | Christopher F Clark <christopher.f.clark@compiler-resources.com> |
|---|---|
| Date | 2022-01-20 21:11 +0200 |
| Subject | Re: Are compiler developers light-years ahead of other software development? |
| Message-ID | <22-01-077@comp.compilers> |
Sadly, I would like to offer an alternate perspective on how far compilers are than other areas of software development. Yes, LL and LR parser generators have been around for decades. My compiler class in the late 1970s had me implement both of them. However, shortly after that parser technology was considered basically a settled problem, while in actuality many parser generators are hard to use and people don't understand it when they output state machines and conflict messages freak compiler writers out, etc. As a result, despite all the technology we could be bringing to bare on compilers, as far as I can tell, more than half the compilers in actual use are done via hand-written recursive descent with little hacks here and there to get them to work. The prevalence of "ad hack" compilers should be embarrassing to anyone who wants to talk about software "engineering". There is good theory we could put to use building better compilers and more reliable languages. However, most of it is languishing and going to waste. At least that's what this compiler developer sees when he looks at the landscape. But maybe I am biased in that perspective.... ****************************************************************************** Chris Clark email: christopher.f.clark@compiler-resources.com Compiler Resources, Inc. Web Site: http://world.std.com/~compres 23 Bailey Rd voice: (508) 435-5016 Berlin, MA 01503 USA twitter: @intel_chris ------------------------------------------------------------------------------ [There's a definite "too busy chopping down trees to sharpen our axes" vibe here. -John]
[toc] | [next] | [standalone]
| From | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
|---|---|
| Date | 2022-01-21 17:38 +0000 |
| Message-ID | <22-01-079@comp.compilers> |
| In reply to | #2849 |
Christopher F Clark <christopher.f.clark@compiler-resources.com> writes: >Yes, LL and LR parser generators have been around for decades. My >compiler class in the late 1970s had me implement both of them. >However, shortly after that parser technology was considered basically >a settled problem, while in actuality many parser generators are hard >to use and people don't understand it when they output state machines >and conflict messages freak compiler writers out, etc. This points to one reason why some people don't use these generators. You need to know quite a bit about the implementation technique behind it to understand what (for LR-based parser generators) a shift-reduce or reduce-reduce conflict means and how to fix that problem. >As a result, despite all the technology we could be bringing to bare >on compilers, as far as I can tell, more than half the compilers in >actual use are done via hand-written recursive descent with little >hacks here and there to get them to work. The "little hacks" point to another reason: How to deal with cases where LL or LR technology, or (probably more often) context-free grammars don't quite fit, e.g., when you want to do a compatible extension of an existing language, like C++ (which was intended to be compatible with C). My impression is that Terrence Parr attacked this problem head-on by formalizing these hacks (to some extent) with syntactic and semantic predicates in PCCTS/ANTLR. There is also the problem of interfacing between the parser and the semantic analysis and back end of the compiler. The yacc interface is not that great, but for some reason improvements such as ox have not taken over the world (maybe because ox is based on attribute grammars which is quite a leap for imperative programmers). Finally, a reason is that some people don't want to depend on software written by others. E.g., Wirth apparently considered it simpler to write a recursive-descent parser by hand than to write a parser generator and use that for implementing a parser. And he even wrote a compiler text book that teaches this approach. - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/
[toc] | [prev] | [next] | [standalone]
| From | gah4 <gah4@u.washington.edu> |
|---|---|
| Date | 2022-01-21 17:40 -0800 |
| Message-ID | <22-01-082@comp.compilers> |
| In reply to | #2850 |
On Friday, January 21, 2022 at 2:30:42 PM UTC-8, Anton Ertl wrote: (snip) > This points to one reason why some people don't use these generators. > You need to know quite a bit about the implementation technique behind > it to understand what (for LR-based parser generators) a shift-reduce > or reduce-reduce conflict means and how to fix that problem. Interesting reason, as I know exactly how I learned about that. (And not from compiler class, where I might have.) In compiling a program, not actually changing it, the instructions warn that it will get a shift-reduce conflict warning, and to ignore it. There is a whole separate thread on ambiguous languages, which is where these come from. In a language with an if-then-optional else construct, and which can be nested, there is an ambiguity in which if an else belongs to. In most such languages it goes to the nearest (deepest nesting) if, and this is the default from the warning. reduce-reduce results from an actual mistake in the language, and does need to be fixed. I believe the program was Kermit, but I am not so sure now the details. I believe that it has an implementation language that translates to C, using a lex/yacc parser. The Makefile does it all for you, but you have to know to ignore the message.
[toc] | [prev] | [next] | [standalone]
| From | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
|---|---|
| Date | 2022-01-22 09:34 +0000 |
| Message-ID | <22-01-086@comp.compilers> |
| In reply to | #2852 |
gah4 <gah4@u.washington.edu> writes: >On Friday, January 21, 2022 at 2:30:42 PM UTC-8, Anton Ertl wrote: >> You need to know quite a bit about the implementation technique behind >> it to understand what (for LR-based parser generators) a shift-reduce >> or reduce-reduce conflict means and how to fix that problem. > >Interesting reason, as I know exactly how I learned about that. >(And not from compiler class, where I might have.) > >In compiling a program, not actually changing it, the instructions warn >that it will get a shift-reduce conflict warning, and to ignore it. That means that the program's authors have (hopefully) determined that the default resolution (shift) for the shift/reduce conflicts in that program was what they intended. It does not tell you that shift resulution is the intended behaviour for all cases (otherwise there would be no point in reporting the conflict). >There is a whole separate thread on ambiguous languages, which is >where these come from. In a language with an if-then-optional else >construct, and which can be nested, there is an ambiguity in which >if an else belongs to. In most such languages it goes to the nearest >(deepest nesting) if, and this is the default from the warning. That's not the only occurence of shift/reduce conflicts, but it's a relatively prominent one, because many compiler writers leave it to the parser generator to resolve it. However, when developing a grammar, I have seen cases (even for relatively small languages) with many conflicts (both shift/reduce and reduce/reduce). I always change the grammar to eliminate all the conflicts. Often the syntax is changed by these changes, which is outside the scope of pure compiler writers, but inside the scope of language designers. Concerning the Algol 60 case, the syntax is unambiguous (from <https://www.masswerk.at/algol60/report.htm>): |<if clause> ::= if <Boolean expression> then |<unconditional statement> ::= <basic statement> | | <compound statement> | <block> |<if statement> ::= <if clause> <unconditional statement> |<conditional statement> ::= <if statement> | | <if statement> else <statement> | | <if clause> <for statement> | | <label>: <conditional statement> Statements such as if B1 then if B2 then S1 else S1 and even if B1 then if B2 then S1 are syntax errors. You cannot put an if-statement or a conditional statement in a then-branch without wrapping it in a begin...end. Languages like Pascal and C (and its descendants) changed this to allow either, making the grammar ambiguous, and using prose to resolve the ambiguity. E.g., for Pascal: |IfStatement = "if" BooleanExpression "then" Statement [ "else" Statement ]. | |Note: The syntactic ambiguity arising from the construct | |if e1 then if e2 then sl else s2 | |is resolved by interpreting the construct as equivalent to | |if e1 then begin if e2 then sl else s2 end I consider either grammar to be flawed; Algol fixed the flaw in Algol 68, Wirth fixed it in Modula, but the C-syntax descendants have unfortunately never fixed it. <https://en.wikipedia.org/wiki/Dangling_else#Avoiding_the_conflict_in_LR_parsers> describes how to express the intent in the grammar rather than in prose, but looking at the example grammars makes it clear why compiler writers prefer to rely on the parser-generator resolution of the shift/reduce conflict rather than expressing the intent in the grammar. Yacc has precedence and associativity declarations to provide additional information to the parser generator on how to resolve ambiguity in the BNF (and for suppressing the warnings about conflicts that are resolved by these declarations). Something like this might be a good solution for dealing with the dangling else if you cannot avoid it by fixing the syntax. >reduce-reduce results from an actual mistake in the language, >and does need to be fixed. There can be cases where the default behaviour is the intended behaviour, too, but IIRC the default depends on the order of rules, so could be destroyed by a seemingly innocuous change to the parser generator input. And there are (probably many) cases where either reduction is intended in some cases, so the generator default is wrong for some cases for all rule orders. - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/ [The original sin of dangling else may have been COBOL. A 1964 IBM manual for 7080 COBOL says ELSE matches the closest IF. PL/I inherited it shortly aftrwards. -John]
[toc] | [prev] | [standalone]
Back to top | Article view | comp.compilers
csiph-web