Path: csiph.com!xmission!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Subject: Re: A minimal LL(1) parser generator ?
Date: Thu, 26 Dec 2019 16:21:29 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 86
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <19-12-030@comp.compilers>
References: <19-12-016@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="96917"; mail-complaints-to="abuse@iecc.com"
Keywords: LL(1), tools
Posted-Date: 26 Dec 2019 12:12:38 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:2411

Andy <borucki.andrzej@gmail.com> writes:
>ANTLR has even LL(*) but is too complicated. I am searching maximal
>simple and elegant generator which generates function call like
>written by hand.

Sounds like you want a generator that generates a recursive-descent
parser.  A while ago I compared a number of parser generators
[ertl99ef], and among those that generate recursive-descent parsers
the shortest one by far was Gray
<http://www.complang.tuwien.ac.at/forth/Gray5.zip>.  Whether you
consider it simple and elegant is for you to decide.

@InProceedings{ertl99ef,
  author = 	 {M. Anton Ertl},
  title = 	 {Is {Forth} Code Compact? {A} Case Study},
  booktitle = 	 {EuroForth'99 Conference Proceedings},
  year =	 {1999},
  address =	 {St. Petersburg, Russia},
  url = 	 {http://www.complang.tuwien.ac.at/papers/ertl99ef.ps.gz},
  abstract = 	 {Forth advocates often claim that Forth code is
                  smaller, faster, and requires less development time
                  than equivalent programs in other languages.  This
                  paper investigates this claim by comparing a number
                  of parser generators written in various languages
                  with respect to source code size.  The smallest
                  parser generator (14 lines) in this comparison is
                  written in Forth, and the other Forth program is
                  smaller than the others in its class by a factor of
                  8 or more; however, the Forth programs do not have
                  all the features of their counterparts.  I took a
                  closer look at Gray (in Forth) and Coco/R (in
                  Modula-2) and found that several Forth features
                  missing from Modula-2 give Gray more than a factor
                  of three advantage over Coco/R (even if the other
                  size differences were solely due to differences in
                  functionality): run-time code generation; access to
                  the parser and a simple, flexible syntax; and
                  Forth's dictionary.}
}

Concerning the other question that came up, about dealing with left
recursion, and building the abstract syntax tree (AST) appropriately,
that part works nicely in Gray:

In BNF you would write

expr->expr '-' number
expr->number

(and the parse tree would have the same structure as the AST).

Gray uses a variant of regular right part grammars, and you would
write that as a pure parsing rule as follows:

(( number (( '-' number )) ** )) <- expr

For buiding the AST, let's assume that "number" leaves a tree node on
the stack, and we have a word

new-bin-node ( node1 node2 -- node3 )

(i.e., it pops node1 and node2 from the stack, and pushes node3).
Then we can extend the above as follows:

(( number (( '-' number {{ new-bin-node }} )) ** )) <- expr ( -- node )

(The "( -- node )" comment documents the stack effect of parsing "expr").

Passing the nodes on Forth's stack rather than using pseudo-variables
like yacc's $1 etc. works nicely when going beyond BNF.  Now I wonder
how Algol-based EBNF parser generators do it, but am too lazy to
actually research it.

Getting a right-recursive tree is actually slightly harder: Now we
actually have to perform recursion (non-recursive variants exist, but
are not simpler).  Let's have a similar example, but with the
right-associative operator '^'.

nonterminal expr ( -- node ) \ declare before use
(( number (( '^' expr {{ new-bin-node }} )) ?? )) expr rule

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/