From: Hans-Peter Diettrich
Newsgroups: comp.compilers
Subject: Re: Bison deterministic LALR(1) parser for Java/C++ (kind of complex langauge) without 'lexar hack' support
Date: Sat, 18 Aug 2012 10:13:46 +0100
Organization: Compilers Central
Message-ID: <12-08-006@comp.compilers>
References: <12-08-005@comp.compilers>
Keywords: bison, design, comment

hsad005@gmail.com wrote:
> I need to write a parser for a programming language which is as
> complex as C++/Java, and to complicate matters further, there are
> constructs in this language that don't allow me to use the
> type/identifier disambiguating lexer hack.

Why don't you fix your language and remove such ambiguities? Look at
Pascal or other Wirthian languages...

> In other words, I will
> have to return just one lexical token (say IDENTIFIER) from the lexer
> for both type references as well as non-type variable references.

This shouldn't be a big problem, as long as the parser does not rely on
such a distinction. Once a symbol has been defined, its definition can
carry some indication of its nature (type or variable).
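
A minimal sketch of that idea, not taken from the thread: the lexer returns
the same IDENTIFIER token for every name, and only a later step asks the
symbol table what the name denotes. The names `tokenize`, `SymbolTable` and
`resolve_star`, and the `X * Y;` example itself, are assumptions chosen for
illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical token pattern: names, integers, or single punctuation characters.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<IDENTIFIER>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<PUNCT>\S))"
)


def tokenize(text):
    """Return (kind, lexeme) pairs. Every name is IDENTIFIER; no symbol lookup here."""
    tokens = []
    text = text.rstrip()  # guarantees a non-space character remains at each step
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens


@dataclass
class SymbolTable:
    """Maps a defined name to its nature: "type" or "variable"."""
    kinds: dict = field(default_factory=dict)

    def define(self, name, kind):
        self.kinds[name] = kind

    def classify(self, name):
        return self.kinds.get(name, "undefined")


def resolve_star(tokens, symbols):
    """Interpret `X * Y ;` after parsing, using the symbol table, not the lexer."""
    (k1, left), (_, star), (k2, right) = tokens[:3]
    if not (k1 == k2 == "IDENTIFIER" and star == "*"):
        raise ValueError("expected IDENTIFIER * IDENTIFIER")
    if symbols.classify(left) == "type":
        return ("declare_pointer", left, right)
    return ("multiply", left, right)


symbols = SymbolTable()
symbols.define("T", "type")
symbols.define("x", "variable")
symbols.define("y", "variable")

decl = resolve_star(tokenize("T * p;"), symbols)
prod = resolve_star(tokenize("x * y;"), symbols)
```

Both statements produce the same token kinds, so a grammar that accepts the
shared shape stays free of lexer feedback; the distinction is made once the
definitions are known.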

> Given these restrictions, I was wondering if it would be a good idea
> to pick yacc/bison for my parser...? Or, should I consider a
> hand-written recursive descent parser.

I don't see how this decision is related to the above problem.

> Regards.
> [Get it working in bison, then in the unlikely event that's not fast
> enough, profile your compiler to see where it's spending its time and
> fix what needs to be fixed. Although in theory GLR can be very slow,
> in practice the ambiguities are generally resolved within a few tokens
> and the performance is fine. Compilers always spend way more time in
> the lexer than the parser anyway. Writing RD parsers by hand can be
> fun, but you never know what language they actually parse. -John]

There exist parser generators for several parsing models.

I also doubt that - except in misdesigned C-ish languages - a compiler
spends significant time in the lexer. This may be true for dummy
parsers, which do nothing but syntax checks, but not for compilers with
code generation, optimization and more.

DoDi
[Compilers spend a lot of time in the lexer, because that's the only
phase that has to look at the input one character at a time. -John]
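
To make the unit-of-work argument in the moderator's note concrete, here is
a toy sketch with a hypothetical `scan` function. It counts the characters a
scanner has to inspect against the tokens it hands to the parser. It is not
a measurement of any real compiler and settles nothing about where time is
actually spent; it only shows the difference in granularity.

```python
def scan(text):
    """Toy scanner: returns (tokens, chars_examined), counting each input character once."""
    tokens = []
    chars_examined = 0
    i = 0
    n = len(text)
    while i < n:
        c = text[i]
        chars_examined += 1
        if c.isspace():
            i += 1  # skip whitespace, but it still had to be looked at
        elif c.isalnum() or c == "_":
            start = i
            i += 1
            # Consume the rest of the word or number, one character at a time.
            while i < n and (text[i].isalnum() or text[i] == "_"):
                chars_examined += 1
                i += 1
            tokens.append(text[start:i])
        else:
            tokens.append(c)  # single-character punctuation or operator
            i += 1
    return tokens, chars_examined


source = "total = total + count * 2;"
tokens, chars_examined = scan(source)
```

For this input the scanner touches every character of the line, while the
parser only ever sees the much shorter token list.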