Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: anton@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Subject: Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them?
Date: Sun, 12 Jun 2022 14:10:43 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 39
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-06-042@comp.compilers>
References: <22-06-023@comp.compilers> <22-06-040@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="31532"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 12 Jun 2022 14:25:47 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:3072

George Neuner writes:
>Note that Yacc and Bison also recognize textual constants in the parser
>grammar and generate a token id for the (separately specified) lexer
>to return.

If you have a rule in yacc/bison:

E: T '+' T

the "token id" for '+' is the ASCII code of '+'. Bison generates
token ids only for tokens defined with %token. So if you instead
write

E: T PLUS T

you have to define

%token PLUS

and the value of PLUS is communicated to the scanner through the
.tab.h file.

Also note that for the last version of yacc that I have seen
documentation for, if you have a rule

S: L ":=" E

there is no token for ":="; instead, what you get is equivalent to

S: L ':' '=' E

Bison is more capable; you can, e.g., define

%token BECOMES ":="

- anton
-- 
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/