Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Alain Ketterlin Newsgroups: comp.compilers Subject: Re: Some questions about recursive descent? Date: Mon, 28 Feb 2022 21:52:54 +0100 Organization: =?utf-8?Q?Universit=C3=A9?= de Strasbourg Lines: 60 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <22-02-026@comp.compilers> References: <22-02-021@comp.compilers> <22-02-024@comp.compilers> Mime-Version: 1.0 Content-Type: text/plain Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="15798"; mail-complaints-to="abuse@iecc.com" Keywords: parse, LL(1), comment Posted-Date: 28 Feb 2022 16:19:54 EST X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:2904 anton@mips.complang.tuwien.ac.at (Anton Ertl) writes: > Johann 'Myrkraverk' Oskarsson writes: >>My first question is: how are production recursive descent parsers con- >>structed? From these two books, and other things I have read in the >>past, recursive descent is so simple it's mostly just brushed off with >>an example or two, but there seems to be a whole lot more to it. For >>instance, [Appel] mentions a parse table but mostly does not explain how >>to construct one. > > I don't know what such a table would be good for. My parser generator > Gray , which > generates recursive-descent parsers, certainly does not need one. A parse table for recursive descent/LL(1) simply tells which rule to expand given the non-terminal currently parsed and the incoming token type. The building of this table is described in, e.g., the dragon book (for sure in the 2nd edition). > The way Gray works is: It computes first and follow sets for all nodes > in the grammar; follow sets are only used for computing first You probably mean the reverse (first is used to compute follow). > sets and for reporting conflicts, first sets are used for deciding how > to parse. It generates code as follows: > > grammar code > terminal if sym in first(terminal) then sym=scan(); else syntax_error(); end > a b code(a) code(b) > a|b if sym in first(a) then code(a); else code(b); end > a? if sym in first(a) do code(a); end > a* while sym in first(a) do code(a); end > a+ do code(a) while sym in first(a) > nonterm call(function(nonterm)) > action action The parsing table embodies all that logic. Parsing works by maintaining a stack of symbols, and the table indicates what actions need to be taken (either replace one non-terminal with a rhs on the stack, or match one input symbol). The algorithm works with any table, and the whole thing (algorithm+table) is certainly much shorter than generating code for each and every rule. EBNF constructions like ?/*/+/... have to be translated first into "regular" rules, but that's trivial pre-processing (along with left factoring/recursivity elimination). Building a parse tree in this context is less obvious, because non-terminal symbols seem to "disappear" during the parsing. The solution (at least the one I came up with) consists in keeping special entries on the stack signaling the end of a rhs. Then, special actions (i.e., "reduce") can be taken when finding these special entries during parsing, much in the same way as it is done in bottom-up (e.g., LR) parsing: the "semantic values" of the various symbols can be pushed onto a separate stack, and you end up with something really similar to LR-parsing. -- Alain. [I think the point is that if you have a table, you have a table driven LL(1) parser rather than recursive descent in which the grammar is embedded in the code. -John]