Path: csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.glorb.com!border3.nntp.dca.giganews.com!border1.nntp.dca.giganews.com!border4.nntp.dca.giganews.com!border2.nntp.dca.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED!nerds-end
From: hsad005@gmail.com
Newsgroups: comp.compilers
Subject: Bison deterministic LALR(1) parser for Java/C++ (kind of complex language) without 'lexer hack' support
Date: Fri, 17 Aug 2012 11:22:38 -0700 (PDT)
Organization: Compilers Central
Lines: 20
Sender: johnl@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <12-08-005@comp.compilers>
NNTP-Posting-Host: news.iecc.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Trace: leila.iecc.com 1345253482 44686 64.57.183.58 (18 Aug 2012 01:31:22 GMT)
X-Complaints-To: abuse@iecc.com
NNTP-Posting-Date: Sat, 18 Aug 2012 01:31:22 +0000 (UTC)
Keywords: bison, question
Posted-Date: 17 Aug 2012 21:31:22 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:722

I need to write a parser for a programming language that is as
complex as C++ or Java. To complicate matters further, the language
has constructs that prevent me from using the type/identifier
disambiguating lexer hack. In other words, the lexer will have to
return a single token (say IDENTIFIER) for both type references and
non-type variable references.
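
To make the constraint concrete, here is a minimal sketch of such a
lexer in Python. The token names and the tiny token set are made up
for illustration and are not taken from the actual language. The point
is only that every name comes back as IDENTIFIER, with no symbol-table
lookup to tell types from variables.

```python
import re

# Hypothetical token set, for illustration only.
TOKEN_SPEC = [
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("STAR", r"\*"),
    ("SEMI", r";"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (kind, lexeme) pairs.

    Type names and variable names both become IDENTIFIER, because the
    lexer never consults a symbol table (no lexer hack).
    """
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "SKIP":  # drop whitespace
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

With this lexer, tokenize("Foo * x;") yields the same token kinds
whether Foo is a type or a variable, so the parser has to cope with
that on its own.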

Given these restrictions, would it be a good idea to pick yacc/bison
for my parser? Or should I consider a hand-written recursive descent
parser instead?

Regards.
[Get it working in bison, then in the unlikely event that's not fast
enough, profile your compiler to see where it's spending its time and
fix what needs to be fixed.  Although in theory GLR can be very slow,
in practice the ambiguities are generally resolved within a few tokens
and the performance is fine.  Compilers always spend way more time in
the lexer than the parser anyway. Writing RD parsers by hand can be
fun, but you never know what language it actually parses. -John]
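
One common way to live with a single IDENTIFIER token, whichever
parser technology is chosen, is to parse an ambiguous construct into a
neutral node and decide what it means after declarations are known.
The Python sketch below illustrates that idea for the classic
"a * b;" case. It is not bison's GLR machinery and not the poster's
grammar; the node classes and the type_names set are assumptions made
up for the example.

```python
from dataclasses import dataclass


@dataclass
class Ambiguous:
    """`left * right;` parsed without knowing whether `left` is a type."""
    left: str
    right: str


@dataclass
class PointerDecl:
    type_name: str
    var_name: str


@dataclass
class Multiply:
    lhs: str
    rhs: str


def parse_statement(tokens):
    """Parse IDENTIFIER '*' IDENTIFIER ';' into a neutral node.

    `tokens` is a list of (kind, lexeme) pairs in which every name has
    the kind IDENTIFIER.
    """
    kinds = [kind for kind, _ in tokens]
    if kinds != ["IDENTIFIER", "STAR", "IDENTIFIER", "SEMI"]:
        raise SyntaxError(f"unsupported statement: {tokens!r}")
    return Ambiguous(tokens[0][1], tokens[2][1])


def resolve(node, type_names):
    """Choose an interpretation once the set of declared types is known."""
    if node.left in type_names:
        return PointerDecl(node.left, node.right)
    return Multiply(node.left, node.right)


tokens = [("IDENTIFIER", "Foo"), ("STAR", "*"), ("IDENTIFIER", "x"), ("SEMI", ";")]
tree = parse_statement(tokens)
as_decl = resolve(tree, {"Foo"})   # Foo is a declared type
as_expr = resolve(tree, set())     # Foo is an ordinary variable
```

The grammar stays free of context here: the parser accepts the
statement either way, and a later pass with access to declarations
picks PointerDecl or Multiply.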