Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: "matt.ti...@gmail.com"
Newsgroups: comp.compilers
Subject: Re: Why does the lexer convert text integer lexemes to binary integers? I thought that lexers should be simple?
Date: Sat, 16 Jul 2022 05:32:58 -0700 (PDT)
Organization: Compilers Central
Lines: 26
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-07-028@comp.compilers>
References: <22-07-011@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="14795"; mail-complaints-to="abuse@iecc.com"
Keywords: lex, parse, design
Posted-Date: 16 Jul 2022 13:09:10 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-07-011@comp.compilers>
Xref: csiph.com comp.compilers:3125

On Thursday, 14 July 2022 at 11:10:56 UTC-4, Roger L Costello wrote:
> [...]
> It seems to me that the lexer should return to the parser the text number and
> it is the responsibility of the parser to convert the value to an integer data
> type if it desires.
>
> What do you think?

The division of the job into lexing and parsing is *not* an important separation of concerns. The two are, I expect, almost always written at the same time by the same person, and the requirements of the parser feed into the lexer in many detailed ways, so their specifications are highly coupled.

Instead, this division -- a great big line that divides one part of a context-free grammar from the other -- is an annoying practical concession that we make to improve performance in time and size (smaller tables, and optimization for text), and to get around the limitations in our tools (LR(1) is a lot less useful when the (1) is a character). This is specifically why all the answers here give you less-than-compelling practical justifications for parsing numbers in the lexer, and nobody seems to mind. There really is no "should" or "should not" w.r.t. the division between lexing and parsing except what is practical.
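
To make the point concrete, here is a minimal sketch in Python of a toy grammar, INT (PLUS INT)*, where the integer conversion can sit on either side of the line. Everything in it (the names `lex` and `parse_sum`, the `convert` flag, the grammar itself) is made up for illustration and is not taken from any tool discussed in this thread.

```python
import re

# One token per match: optional leading whitespace, then digits or '+'.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\+))")

def lex(text, convert=True):
    """Yield (kind, value) tokens.

    `convert` is a hypothetical switch: when True the lexer turns the
    lexeme into an int, when False it hands the raw text to the parser.
    """
    text = text.rstrip()
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input near offset {pos}")
        pos = m.end()
        if m.group(1) is not None:
            lexeme = m.group(1)
            yield ("INT", int(lexeme) if convert else lexeme)
        else:
            yield ("PLUS", "+")

def parse_sum(tokens):
    """Parse INT (PLUS INT)* and return the sum."""
    total = 0
    expect_int = True
    for kind, value in tokens:
        if expect_int:
            if kind != "INT":
                raise SyntaxError("expected an integer")
            # int() accepts both str and int, so the parser does the
            # conversion only if the lexer has not already done it.
            total += int(value)
        elif kind != "PLUS":
            raise SyntaxError("expected '+'")
        expect_int = not expect_int
    if expect_int:
        raise SyntaxError("expected an integer")
    return total

print(parse_sum(lex("1 + 22 + 3", convert=True)))
print(parse_sum(lex("1 + 22 + 3", convert=False)))
```

Both calls produce the same sum. Moving the `int()` call across the line changes which function does the work, not what the front end as a whole accepts or computes, so in this sketch the choice comes down to convenience.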