From: "Armel"
Newsgroups: comp.compilers
Subject: Re: coupling LALR with a scanner?
Date: Sun, 2 Oct 2011 16:41:08 +0200
Organization: les newsgroups par Orange
Message-ID: <11-10-005@comp.compilers>
References: <11-07-013@comp.compilers> <11-07-015@comp.compilers> <11-07-018@comp.compilers> <11-08-004@comp.compilers> <11-09-016@comp.compilers> <11-09-017@comp.compilers> <11-09-022@comp.compilers> <11-09-023@comp.compilers> <11-10-003@comp.compilers>
Keywords: parse

> there is an argument for introducing such a gap to "segment"
> tokens into smaller chunks for both performance and expressibility
> reasons.

Could you elaborate on this segmentation mechanism?

In my lexer generator, the developer can introduce start states himself
and 'cut' complex expressions into smaller expressions that stay within
what a DFA can recognize, introducing dynamic regular expressions only
where absolutely necessary. One such case is languages that allow dynamic
string delimiters, for example here-documents: << END_OF_STR_MARKER, then
some lines, then a line containing only END_OF_STR_MARKER. Languages such
as Ruby are quite amusing from that point of view, if I remember correctly.

Regards,
Armel
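
To illustrate the start-state idea described above, here is a minimal sketch in Python. It is not taken from the lexer generator mentioned in the post; the names INITIAL_RULES and tokenize, the token kinds, and the tiny token set are all hypothetical. It assumes a simplified here-document syntax in which the marker is the last thing on its line. The default state uses only fixed regular expressions, and the scanner switches to a here-document state whose terminator pattern is built at run time from the marker it has just read.

```python
import re

# Fixed rules for the default ("INITIAL") start state. All names here are
# hypothetical and chosen only for this illustration.
INITIAL_RULES = re.compile(r"""
    (?P<HEREDOC_START><<[ \t]*[A-Za-z_][A-Za-z0-9_]*[ \t]*\n)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<OP>[=+\-*/;])
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, value) pairs for a toy language."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = INITIAL_RULES.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = m.lastgroup
        pos = m.end()
        if kind == "WS":
            continue
        if kind == "HEREDOC_START":
            # Switch to the here-document state: the terminator is not known
            # when the lexer is written, so its pattern is built dynamically.
            marker = m.group().lstrip("<").strip()
            end_re = re.compile(r"^" + re.escape(marker) + r"[ \t]*$", re.MULTILINE)
            end = end_re.search(text, pos)
            if end is None:
                raise SyntaxError(f"unterminated here-document {marker!r}")
            tokens.append(("HEREDOC", text[pos:end.start()]))
            pos = end.end()  # back to the INITIAL state
            continue
        tokens.append((kind, m.group()))
    return tokens


if __name__ == "__main__":
    source = "x = << END\nhello\nEND of it\nEND\ny;"
    for token in tokenize(source):
        print(token)
```

The static part stays an ordinary set of regular expressions, and only the search for the closing marker depends on input seen at run time. A line such as "END of it" does not close the string, because the terminator pattern requires the marker to stand alone on its line.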