Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah4
Newsgroups: comp.compilers
Subject: Re: STEP compiler generator
Date: Tue, 21 Jun 2022 18:53:40 -0700 (PDT)
Organization: Compilers Central
Lines: 127
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-06-069@comp.compilers>
References: <22-06-045@comp.compilers> <22-06-062@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="28870"; mail-complaints-to="abuse@iecc.com"
Keywords: tools, macros, comment
Posted-Date: 21 Jun 2022 22:04:33 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-06-062@comp.compilers>
Xref: csiph.com comp.compilers:3090

On Tuesday, June 21, 2022 at 4:24:10 PM UTC-7, Christopher F Clark wrote:

(snip, I wrote)
> > In one program it has a parser generator, interpreter for the
> > generated parser, replacement procedure (what Bison calls action)
> > compiler, and interpreter for compiled replacement procedures.

> This sounds very much like the model used in Racket, where one
> incrementally defines a new language which gets interpreted down to scheme
> and executed as scheme (and Racket itself is written in scheme, in that way
> if I understand correctly). This is all done by what the lisp people call
> "hygienic macros" which are ways of manipulating an AST that has been
> represented as S-expressions.

I would have to look at it in detail, but it does sound similar.

> With a C interpreter (and there are such things) and a lexer and parser
> generator written in C, one could essentially do the same thing, the same
> way.

I think I have heard rumors about C interpreters, but never saw one.

> However, as our routinely wise moderator points out, the result is an
> idiosyncratic language that is one of a kind and no one but the author
> really understands. This is truly how to build a tower of Babel.

Well, you can say that about C with some uses of the preprocessor.
There is an old story, maybe untrue, about a Pascal user doing:

#define BEGIN {
#define END }

I believe the idea is to (mostly) not do that, at least not more
than one does with the C preprocessor.

I was remembering, in another group, how Pascal has numeric
statement labels, and that one of the features of Knuth's WEB, used to
write TeX, is macros to give them nice names. You could do that
with other languages, and with a very simple processor, maybe
even sed. Hopefully it makes the code more readable, but maybe not.

In any case, STEP was designed around the time of Mortran, and one
suggested use is an improved Mortran processor. Fully parsing
input allows for more appropriate error messages, for one.

> More importantly, if you are trying to solve the problem of writing more of
> a compiler as something one can generate, you haven't actually solved any
> interesting problem. Your code will still be imperative. You haven't
> introduced any new model that actually makes some part of the compilation
> process easier to reason about.

Mostly I believe it is useful as a source to source compiler, and
especially one that can be written fast, maybe to be used once.

(snip)

> Attribute grammars are another way of reducing cognitive load, if you make
> the attribute expressions separate and independent. That means in each
> attribute expression you are only thinking about one issue. Only when the
> attribute expressions intersect or are dependent upon each other does the
> reasoning become more complex.

> Now, a different interpretation of what you are trying to achieve is some
> kind of portability.
> Starting with either a lisp/scheme or C interpreter,
> you would have something portable and relatively easy to convert into some
> other programming language. Because you would still be interpreting it, it
> would be a complete bootstrapping effort, where the output of the
> compilation was a translation of the original language to a different
> language.

Yes, I believe you could take much of bison, and much of a C interpreter,
and put them together into one program.

> However, the reason translations from one programming language to another
> is difficult is not about the ordinary code. That part is easy. It's
> about the semantic edge cases and things like I/O where the semantics are
> buried in some runtime library.

When I was in high school, I was interested in having a BASIC to
Fortran converter. I had some BASIC programs that were interesting.
Then I read the BNF description of HP-2000 BASIC, and, without knowing
anything else about compilers, thought about how to write a recursive
descent compiler.

When I was in college, the one class I really wanted to take was the
compiler class. Putting EE classes and the compiler class on my
suggested schedule got me into applied physics.

In any case, with such a processor one can fully parse the input,
and do some translations that would otherwise be difficult.
But yes, I suspect that many languages would not be so easy.

> If two numbers added together overflow,
> what happens? What happens if you index off the edge of an array? What
> data types convert into each other and what is the result of the
> conversion? There are myriads of questions at that level, which determine
> what programs actually do and which ones are legal. That's where UNCOL
> projects die.

Reminds me, a few times I have complained on comp.lang.fortran about
the removal of REAL variables in DO loops. They were added, and then
removed. But for translation from BASIC, especially by hand, it is
nice to have them.
(Many BASIC systems only have floating point variables.)

The current STEP only has the output processor that formats for
fixed-form Fortran. The manual says that there is another one, but
I don't see it. Might not be hard to write, though.

Otherwise, some of the difference between STEP and Bison is the
difference in input representation. For one, STEP generates a
recursive descent compiler, so there are specific ways to write the
macros to avoid head recursion. STEP was written close to the time
of yacc, so there might not have been a lot of cross fertilization.

[For something much worse than BEGIN and END, the Bourne shell was
written with macros that made C look remarkably like Algol 68. See
https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh
For a C interpreter see http://www.softintegration.com/products/
The eel extension language in emacs clone Epsilon is very much like
C also. -John]
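
On the point about writing rules to avoid head recursion in a recursive descent parser, here is a minimal sketch of the usual workaround, assuming "head recursion" means what is more often called left recursion. This is not STEP's macro syntax, which the post does not show. It is a hand-written Python parser for a toy arithmetic grammar, and all names in it (tokenize, Parser, evaluate) are made up for illustration. A rule such as expr : expr '-' term | term would make a naive recursive descent routine call itself forever before consuming any input, so the rule is rewritten as term followed by a loop.

```python
import re

# One token per match: an unsigned integer or a single operator/parenthesis.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")


def tokenize(text):
    """Split an arithmetic expression into number and operator tokens."""
    text = text.rstrip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    """Recursive descent evaluator for a toy grammar (illustrative only)."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # Left-recursive form:  expr : expr ('+'|'-') term | term
        # Loop form used here:  expr : term (('+'|'-') term)*
        # The loop keeps the operators left-associative.
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # Same rewrite one level down; '/' is integer division here.
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value // rhs
        return value

    def factor(self):
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return int(tok)


def evaluate(text):
    """Parse and evaluate text, rejecting any trailing tokens."""
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.peek()!r}")
    return value
```

With this rewrite, evaluate("10 - 3 - 2") groups as (10 - 3) - 2, which is what the left-recursive rule would have meant. A tool like Bison accepts the left-recursive form directly, which is one place where the input representation for the two kinds of tool would differ.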