From: "Ev. Drikos"
Newsgroups: comp.compilers
Subject: Scannerless parsing was: Why does the lexer convert text integer lexemes ...?
Date: Thu, 21 Jul 2022 13:41:25 +0300
Organization: Aioe.org NNTP Server
Message-ID: <22-07-042@comp.compilers>
References: <22-07-011@comp.compilers> <22-07-030@comp.compilers> <22-07-036@comp.compilers>
Keywords: parse, design, comment

On 18/07/2022 06:39, gah4 wrote:
> [In my experience separating the lexer from the parser makes it a lot easier
> to deal with common lexical situations like skipping white space and comments.
> You could certainly do that in a combined scheme but I'm not sure it would end
> up any simpler. -John]

Maybe not simpler, but it won't necessarily be more complex. I've just
transferred an SQL example from my old desktop. The FSA/GLR parser that was
built can parse, e.g., this command unambiguously in the simulator, without
a scanner:

SELECT ALL FIRST,LAST FROM USERS;

Below, there are four BNF rules. The first has, on its right-hand side, the
required lookahead operator '-:', as the grammar allows consecutive IDs. The
second rule may be seen as a meta-rule that will be expanded.
That is, the Builder will do the 'dirty' work for the programmer and attach
the separator to each element listed in the 4th BNF rule below, which in turn
must have only one symbol in each alternative (on the right-hand side).

IMHO, this grammar doesn't look very complex, but others may see it that way.

Regards,
Ev. Drikos

------------------------------------------------------------------------
::= [ ... ] -: () ::= ( ) | ( ) ::= { | }... ::= | .. ... | |

[I don't see how this grammar will allow a comment before the statement. -John]
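
For readers who want to see the idea of "attaching the separator to each
element" in running form, here is a minimal hand-written sketch in Python. It
is not the FSA/GLR Builder described above and does not use its grammar
notation; the class ScannerlessSelect, its method names, and the tiny
SELECT subset it accepts are all hypothetical. A plain "next character does
not continue a word" check stands in for the '-:' lookahead operator, whose
exact semantics are not given in the post.

```python
import string

# Assumption for this sketch: words are made of letters, digits and '_'.
IDENT_CHARS = string.ascii_letters + string.digits + "_"


class ParseError(Exception):
    pass


class ScannerlessSelect:
    """Hypothetical character-level parser for: SELECT [ALL] id {, id} FROM id ;"""

    def __init__(self, text):
        self.text = text
        self.pos = 0

    def separator(self):
        # Optional run of white space, consumed after every token-like element.
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1

    def at_word_end(self):
        # Lookahead only: the next character must not continue a word.
        return self.pos >= len(self.text) or self.text[self.pos] not in IDENT_CHARS

    def keyword(self, word):
        # Match the keyword, check the lookahead, then attach the separator.
        end = self.pos + len(word)
        if self.text[self.pos:end].upper() != word:
            return False
        saved, self.pos = self.pos, end
        if not self.at_word_end():
            self.pos = saved  # e.g. "ALLOWED" is not the keyword ALL
            return False
        self.separator()
        return True

    def identifier(self):
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos] in IDENT_CHARS:
            self.pos += 1
        if self.pos == start:
            raise ParseError(f"identifier expected at offset {start}")
        name = self.text[start:self.pos]
        self.separator()
        return name

    def punct(self, ch):
        if self.text[self.pos:self.pos + 1] != ch:
            return False
        self.pos += 1
        self.separator()
        return True

    def parse(self):
        self.separator()
        if not self.keyword("SELECT"):
            raise ParseError("SELECT expected")
        quantifier = "ALL" if self.keyword("ALL") else None
        columns = [self.identifier()]
        while self.punct(","):
            columns.append(self.identifier())
        if not self.keyword("FROM"):
            raise ParseError("FROM expected")
        table = self.identifier()
        if not self.punct(";"):
            raise ParseError("';' expected")
        if self.pos != len(self.text):
            raise ParseError("trailing input")
        return quantifier, columns, table
```

Calling ScannerlessSelect("SELECT ALL FIRST,LAST FROM USERS;").parse()
works directly on the characters, with no separate token stream. Supporting
comments in this sketch would mean extending separator() to skip them as
well, and parse() already calls it once before the first keyword.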