From: Jan Ziak <0xe2.0x9a.0x9b@gmail.com>
Newsgroups: comp.compilers
Subject: Re: Why does the lexer convert text integer lexemes to binary integers? I thought that lexers should be simple?
Date: Fri, 15 Jul 2022 03:02:07 -0700 (PDT)
Organization: Compilers Central
Message-ID: <22-07-020@comp.compilers>
References: <22-07-011@comp.compilers> <22-07-015@comp.compilers>
Keywords: lex, performance, comment

On Friday, July 15, 2022 at 4:13:42 AM UTC+2, George Neuner wrote:
> In many (actually most) cases, the binary representation of an integer
> can be stored in less space than the text representation.

The output of a command such as

(cd /usr/src/linux; grep --only-matching --recursive "\b[0-9][0-9]*\b")

shows the above claim to be false: the typical match is only a few
characters long, which is shorter than a fixed-width binary integer.

A fixed-width binary representation of integers is statistically more
space-efficient than an implicit-width textual representation only if
the textual representation carries extra explicit data, for example:
a plain 32/64-bit pointer to the start of the text; a plain 16/32/64-bit
relative or absolute offset to the start of the number in a character
array; or the length of the number's textual form stored explicitly.
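As a rough illustration of the comparison above, here is a minimal Python sketch that counts the bytes needed for the decimal integer lexemes in a piece of source text, once as text and once as fixed-width binary. The names INT_LITERAL and literal_sizes, the 4-byte binary width, the per-lexeme overhead parameter, and the sample string are all assumptions made for illustration; none of them come from the thread.

```python
import re

# Same pattern idea as the grep command: runs of decimal digits bounded
# by word boundaries. (Hypothetical name, chosen for this sketch.)
INT_LITERAL = re.compile(r"\b[0-9]+\b")


def literal_sizes(source, binary_width=4, overhead=0):
    """Return (text_bytes, binary_bytes) for the integer lexemes in source.

    text_bytes: one byte per digit, plus `overhead` bytes per lexeme
                (standing in for an explicit pointer, offset, or length).
    binary_bytes: `binary_width` bytes per lexeme.
    """
    lexemes = INT_LITERAL.findall(source)
    text_bytes = sum(len(lexeme) + overhead for lexeme in lexemes)
    binary_bytes = binary_width * len(lexemes)
    return text_bytes, binary_bytes


# Assumed sample input; the lexemes found are 0, 16, 255 and 1000000.
sample = "for (i = 0; i < 16; i++) buf[i] = 255; x = 1000000;"

print(literal_sizes(sample))              # implicit length: (13, 16)
print(literal_sizes(sample, overhead=4))  # plus a 32-bit offset: (29, 16)
```

With an implicit length the text form is smaller here (13 bytes against 16); once each lexeme also carries a 4-byte offset, the fixed-width binary form wins (29 bytes against 16). The sample is tiny and hand-picked, so it only shows the shape of the argument, not real statistics.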

A fixed-width binary representation is more likely to be the more
space-efficient one, compared with an implicit-length textual
representation, when the integers do not originate from a human typing
digits on a keyboard: for example, when they come from a high-precision
measurement device or describe something with a physical counterpart
(such as the GPS coordinates of objects spread across the Earth's
surface).

-atom
[It's still hard to imagine that the size difference would matter in a
compiler. If you're logging a million values a second and saving them to
an archive, well, that's different. -John]