Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: anton@mips.complang.tuwien.ac.at (Anton Ertl) Newsgroups: comp.compilers Subject: Re: Some questions about recursive descent? Date: Mon, 28 Feb 2022 07:43:39 GMT Organization: Institut fuer Computersprachen, Technische Universitaet Wien Lines: 79 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <22-02-024@comp.compilers> References: <22-02-021@comp.compilers> Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="739"; mail-complaints-to="abuse@iecc.com" Keywords: parse, LL(1) Posted-Date: 28 Feb 2022 10:56:12 EST X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:2902 Johann 'Myrkraverk' Oskarsson writes: >My first question is: how are production recursive descent parsers con- >structed? From these two books, and other things I have read in the >past, recursive descent is so simple it's mostly just brushed off with >an example or two, but there seems to be a whole lot more to it. For >instance, [Appel] mentions a parse table but mostly does not explain how >to construct one. I don't know what such a table would be good for. My parser generator Gray , which generates recursive-descent parsers, certainly does not need one. The way Gray works is: It computes first and follow sets for all nodes in the grammar; follow sets are only used for computing first sets and for reporting conflicts, first sets are used for deciding how to parse. It generates code as follows: grammar code terminal if sym in first(terminal) then sym=scan(); else syntax_error(); end a b code(a) code(b) a|b if sym in first(a) then code(a); else code(b); end a? if sym in first(a) do code(a); end a* while sym in first(a) do code(a); end a+ do code(a) while sym in first(a) nonterm call(function(nonterm)) action action a?, a* and a+ are regular right part grammar (RRPG) syntax; they correspond to [a], {a} and a{a} respectively in EBNF. >Second question: why are recursive descent parsers I've come across >always using globals and construct code and/or the parse tree as side >effects, rather than, say, return the parse tree to the caller? I have no explanation for the case of constructing the parse tree; that's trivial to do by returning it. Concerning code generation, constructing it as a string is inefficient with the conventional string representation (array of chars), because concatenation takes time proportional to the length of the string (likewise, if you generate machine code or virtual machine code as an array of bytes). You can now add an unconventional string representation (e.g., a rope), but that still consumes memory; or you just use side effects for outputting the code. The disadvantage of the latter is that once you have output the code, you cannot take it back (or at least not easily), while with strings you can arrange the code for the nodes in a different than processing order, have the code for a node several times etc. without introducing additional complications in your compiler. > Is this something to do with /synthesized attributes/ that [Holub] >talks about? Where the recursive descent parser routines return values >to the caller Synthesized attributes are trivial to pass along in recursive-descent parsers by returning them. You can also trivially do L-attributed grammars by passing attributes inherited from left of a nonterminal as parameters to a rule. > Then I ran into trouble when I tried to make the parser return >the parse tree, rather than construct it with side effects. > Granted, some of my problems are from using a programming language, >ML, that I'm not familiar with I would expect that returning the parse tree is easier in ML than using side effects (especially if you are unfamiliar with ML, because literature about ML probably does not explain side effects early on). If you found the literature you have read up to now to be not satisfactory, maybe Wirth's compiler book is more to your taste. IIRC he explains how to write a recursive-descent parser for PL/0 (a simple Pascal-like language). - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/