Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.glorb.com!news-out.readnews.com!news-xxxfer.readnews.com!news.misty.com!news.iecc.com!.POSTED!nerds-end From: Keshav Pingali Newsgroups: comp.compilers Subject: a new approach to context-free language parsing Date: Sun, 14 Oct 2012 13:34:41 -0500 Organization: Compilers Central Lines: 35 Sender: johnl@iecc.com Approved: comp.compilers@iecc.com Message-ID: <12-10-009@comp.compilers> NNTP-Posting-Host: news.iecc.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Trace: leila.iecc.com 1350250566 16582 64.57.183.58 (14 Oct 2012 21:36:06 GMT) X-Complaints-To: abuse@iecc.com NNTP-Posting-Date: Sun, 14 Oct 2012 21:36:06 +0000 (UTC) Keywords: parse Posted-Date: 14 Oct 2012 17:36:06 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:767 Gianfranco Bilardi and I have developed a new approach to parsing context-free languages that we call "Parsing with pictures". It provides an alternative (and, we believe, easier to understand) approach to context-free language parsing than the standard presentations using derivations or pushdown automata. It also unifies Earley, SLL, LL, SLR, and LR parsers among others. Parsing problems are formulated as path problems in a graph called the grammar flow graph (GFG) that is easily constructed from a given grammar. Intuitively, the GFG is to context-free grammars what NFAs are to regular languages. Among other things, the paper has : (i) an elementary derivation of Earley's algorithm for parsing general context-free grammars, showing that it is an easy generalization of the well-known reachability-based NFA simulation algorithm, (ii) a presentation of look-ahead that is independent of particular parsing strategies, and is based on a simple inter-procedural dataflow analysis, (iii) GFG structural characterizations of LL and LR grammars that are simpler to understand than the standard definitions, and bring out a symmetry between these grammar classes, (iv) derivations of recursive-descent and shift-reduce parsers for LL and LR grammars by optimizing the Earley parser to exploit this structure, and (v) a connection between GFGs and NFAs for regular grammars based on the continuation-passing style (CPS) optimization. You can find the tech report at this URL: http://apps.cs.utexas.edu/tech_reports/reports/tr/TR-2102.pdf We welcome feedback from the community - please email me or Bilardi.