From: "Armel"
Newsgroups: comp.compilers
Subject: Re: coupling LALR with a scanner?
Date: Fri, 16 Sep 2011 10:47:24 +0200
Organization: les newsgroups par Orange
Message-ID: <11-09-017@comp.compilers>
References: <11-07-013@comp.compilers> <11-07-015@comp.compilers> <11-07-018@comp.compilers> <11-08-004@comp.compilers> <11-09-016@comp.compilers>
Keywords: parse

>I don't understand why you want to have a different scanner for each
>state. The parser can easily make the decision, whether a token is
>valid. In fact, an LALR parser already has this information in the
>parser tables, so why make a simple situation complicated?

IELR was made for exactly that reason, as a first step towards PSLR.
Some languages have no clean split between 'tokens' and 'grammar rules';
they just have a 'grammar' in which mutually exclusive tokens are
present. For example, you cannot write a single lexer for Javascript,
because there are states where / (slash) means 'start of a regular
expression' (and of course the content of the regular expression follows
totally different lexing rules from the rest of the text), whereas in
other states it means the 'division' operator. If your parser cannot
tell the scanner which of the two sets of lexing rules to use, you are
stuck.

> this could be confusing to the user of your language.
People like Perl and Javascript, so we have to make parsers for those
languages :)

By the way, I'd be open to using IELR, but I have to read the paper
again, because it is far from easy to understand and implement...

Armel

[This seems like an awfully complicated way to reinvent scanner start states. -John]
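
To make the slash example above concrete, here is a minimal sketch in Python of a scanner whose rule set is picked by the parser. Everything in it is hypothetical: the token names, the two modes "operand" and "operator", and the tokenize() function, which only stands in for a real LALR/IELR parser by tracking whether an operand or an operator is expected next. It is not how IELR or PSLR work internally; it only shows why the choice of lexing rules has to come from the parser side.

```python
import re

# Hypothetical token patterns shared by both scanner modes.
COMMON = [
    ("NUM", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[=+\-*();]"),
]

# One rule set per mode (assumed names, not taken from any real tool).
MODES = {
    # An operand is expected: '/' starts a regular expression literal.
    "operand": [("REGEX", r"/(?:\\.|[^/\\\n])+/[a-z]*")] + COMMON,
    # An operator is expected: '/' is the division operator.
    "operator": [("DIV", r"/")] + COMMON,
}


def next_token(text, pos, mode):
    """Scan one token at pos with the rule set selected by the caller."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        return None, pos
    for kind, pattern in MODES[mode]:
        m = re.compile(pattern).match(text, pos)
        if m:
            return (kind, m.group()), m.end()
    raise SyntaxError(f"unexpected {text[pos]!r} in mode {mode!r}")


def tokenize(text):
    """Stand-in for the parser: decides which scanner mode applies next."""
    tokens, pos, mode = [], 0, "operand"
    while True:
        tok, pos = next_token(text, pos, mode)
        if tok is None:
            return tokens
        tokens.append(tok)
        kind, lexeme = tok
        # After an operand or ')' an operator must follow; otherwise an operand.
        if kind in ("NUM", "IDENT", "REGEX") or lexeme == ")":
            mode = "operator"
        else:
            mode = "operand"
```

With these rules, tokenize("a / b / c") yields two DIV tokens, while tokenize("x = /ab+c/g") yields a single REGEX token for /ab+c/g. A single mode-free lexer that tried the REGEX rule first would wrongly read "/ b /" in the first input as a regular expression.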