From: Chris F Clark
Newsgroups: comp.compilers
Subject: Re: YACCaty YACC - Parsing YACC with YACC
Date: Mon, 25 Apr 2011 19:15:56 -0400
Organization: The World Public Access UNIX, Brookline, MA
Message-ID: <11-04-037@comp.compilers>
References: <11-04-034@comp.compilers>
Keywords: yacc, parse

There is a grammar for Yacc++ (which includes its lexer, written in a yacc-like rather than lex-like notation) included with Yacc++. It is the same grammar we use to build the tool itself, stripped of the semantic actions. Being able to compile Yacc++ with itself was an explicit requirement we had from the beginning.

-----------------------------------------------------------------------

Personal diatribe warning:

Without a hack, one cannot write a plain yacc grammar in yacc. Because the semicolon at the end of a rule is optional, one needs to look for "identifier :" to find the start of a new rule (and thus the end of the current one). This means that the grammar for yacc is LALR(2) and not LALR(1). Thus, there are yacc grammars in yacc, but they require this lexical trick to work.

The notation for lex is also hard to parse, but mostly because of issues at the lexical level.
This is because of the tendency, prevalent in the Unix/C community of the time, to favor 1-character tokens with \ escapes. Most of the shell and make tools are similarly hard to deal with neatly at the lexical level. The general regular expression syntax in Emacs or Perl is likewise obfuscated, with slight variations that can make them subtly incompatible.

However, it is worth noting that the difficulty of the lexical issues is balanced by the terseness of the notation and by the fact that one can often read these notations with a C program that looks at only 1 character at a time. It is just a personal foible that I prefer a language that is highly consistent to one that is terse. The latter tends to lead to ad hoc processing fraught with special cases that are never fully generalized.

-----------------------------------------------------------------------

Final note: if you want a copy of Yacc++ for personal [non-commercial] use, send me an email explaining what you want to do with it. I "give away" to individuals a free version with the limitation that any software built with it must be GPLed.

Hope this helps,
-Chris

******************************************************************************
Chris Clark                  email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc.     Web Site: http://world.std.com/~compres
23 Bailey Rd                 voice: (508) 435-5016
Berlin, MA 01503 USA         twitter: @intel_chris
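
A sketch of the lexical trick described in the diatribe above, offered as an illustration under assumptions and not as code from yacc or Yacc++: after each identifier the lexer peeks past whitespace, and if a colon follows it returns a distinct token, so the parser can recognize the start of a new rule with a single token of lookahead. The token name C_IDENTIFIER follows the grammar in the POSIX yacc description; tokenize, TOKEN_RE, and the other token names are hypothetical, and comments, actions, and the declarations section are deliberately not handled.

```python
import re

# Hypothetical token set for the rules section of a yacc-like grammar.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)
  | (?P<ident>[A-Za-z_.][A-Za-z0-9_.]*)
  | (?P<literal>'(?:\\.|[^\\'])+')
  | (?P<punct>[:|;])
""", re.VERBOSE)


def tokenize(text):
    """Yield (kind, lexeme) pairs, marking rule-starting identifiers."""
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "ws":
            continue
        if kind == "ident":
            # The trick: peek past whitespace for ':' so the second token
            # of lookahead is folded into the lexer instead of the parser.
            look = pos
            while look < len(text) and text[look].isspace():
                look += 1
            if look < len(text) and text[look] == ":":
                kind = "C_IDENTIFIER"
            else:
                kind = "IDENTIFIER"
        elif kind == "literal":
            kind = "LITERAL"
        else:
            # Punctuation is its own token kind.
            kind = m.group()
        yield kind, m.group()
```

With this in place, an input such as "a : b c : d" (no semicolons) yields C_IDENTIFIER for a and c and IDENTIFIER for b and d, so a parser can end the first rule as soon as it sees the C_IDENTIFIER token for c.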