From: Roger L Costello
Newsgroups: comp.compilers
Subject: Why does the lexer convert text integer lexemes to binary integers? I thought that lexers should be simple?
Date: Thu, 14 Jul 2022 10:25:24 +0000
Organization: Compilers Central
Message-ID: <22-07-011@comp.compilers>
Keywords: lex, design, comment

Hi Folks,

A common example in books on Lex/Flex and Yacc/Bison is evaluating arithmetic expressions. When the lexer encounters an integer lexeme, it converts the lexeme to a binary integer and returns the value to the parser. The lexer contains a rule that looks something like this:

    {INTEGER}  { yylval.intval = atoi(yytext); return NUMBER; }

But, but, but, ...

Countless times on this list I have been told: Keep the lexer simple!

By converting the lexeme to an integer, the lexer has assumed that the parser needs/wants a binary integer, not a text number. How does the lexer know what the parser needs/wants? That seems like knowledge the lexer shouldn't have if the lexer is to be simple.

Further, even if one parser needs/wants a binary integer value, that parser might be swapped out at a later date and replaced with a different parser that wants the text number.

It seems to me that the lexer should return to the parser the text number and it is the responsibility of the parser to convert the value to an integer data type if it desires.

What do you think?

/Roger

[I think the lexer should provide the tokens that the parser needs. If integers are always handled as numbers, convert them, if not, don't. If the parser does one and later changes to do the other, you can change the lexer, too. -John]
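
For concreteness, here is a rough sketch in plain C of the alternative raised in the question: the lexer hands over the lexeme text untouched, and whoever consumes the token converts it only if it wants a number. The names struct token, make_number_token, and token_to_int are hypothetical and do not come from Flex or Bison, and the fixed 32-byte buffer is an arbitrary assumption. The conversion uses strtol rather than atoi because strtol can report overflow and trailing junk, whereas atoi has undefined behavior when the value is out of range.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token type: carries the raw lexeme, not a converted value. */
enum token_kind { TOK_NUMBER };

struct token {
    enum token_kind kind;
    char text[32];            /* copy of the lexeme (e.g. of yytext) */
};

/* Lexer side: store the text and make no assumption about its use.
 * Returns 0 on success, -1 if the lexeme does not fit in the buffer. */
int make_number_token(struct token *tok, const char *lexeme)
{
    size_t len;

    assert(tok != NULL && lexeme != NULL);
    len = strlen(lexeme);
    if (len >= sizeof tok->text)
        return -1;
    memcpy(tok->text, lexeme, len + 1);   /* include the terminating NUL */
    tok->kind = TOK_NUMBER;
    return 0;
}

/* Parser side: convert on demand.
 * Returns 0 and stores the value in *out, or -1 if the text is not a
 * complete decimal integer or does not fit in an int. */
int token_to_int(const struct token *tok, int *out)
{
    char *end;
    long value;

    assert(tok != NULL && out != NULL);
    errno = 0;
    value = strtol(tok->text, &end, 10);
    if (end == tok->text || *end != '\0')
        return -1;                        /* empty or trailing junk */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX)
        return -1;                        /* out of range for int */
    *out = (int)value;
    return 0;
}
```

In a Flex rule, the equivalent move would be to copy yytext into the semantic value and return NUMBER, leaving the call to something like token_to_int to the grammar action. Which side does the conversion is then the design choice the moderator describes: if every consumer treats integers as numbers, converting in the lexer saves the copy; if some need the original spelling, passing the text keeps that option open.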