Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: antispam@math.uni.wroc.pl Newsgroups: comp.compilers Subject: Re: Learning only one lexer made me blind to its hidden assumptions Date: Fri, 15 Jul 2022 20:14:10 -0000 (UTC) Organization: Aioe.org NNTP Server Lines: 46 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <22-07-027@comp.compilers> References: <22-07-006@comp.compilers> <22-07-010@comp.compilers> <22-07-017@comp.compilers> Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="93272"; mail-complaints-to="abuse@iecc.com" Keywords: lex, performance Posted-Date: 15 Jul 2022 16:36:32 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:3124 George Neuner wrote: > On Wed, 13 Jul 2022 19:52:45 -0000 (UTC), antispam@math.uni.wroc.pl > wrote: > > >Hmm, from flex manual: > > > >: -Ce, --ecs > >: construct equivalence classes > >: > >: -Cm, --meta-ecs > >: construct meta-equivalence classes > > > >If you want smaller tables use options above and flex DFA will > >work on character classes. > > But note that Flex /may/ run considerably slower if you make heavy use > of equivalence classes. IIRC, that results in (moral equivalent of) > NFA rather than DFA. IIUC equivalence classes are reasonably cheap: single extra array access per input character. Meta-equivalence are more expensive. If I wanted ultimate speed I probably would use hand-written scanner, they tend to be significantly faster than anything flex can generate. But in most cases flex scanner speed is adequate. > [On modern computers it's hard to imagine a scanner so big that the space > savings from those two options are worth it. 64K PDP-11 and all that. -John] Well, L1 cache on many processors is just 16K. L2 caches are bigger, but it is not hard to imagince scanner which wastes a lot of time due to L2 misses. If smaller tables can fit in L2, than there may be net speed gain. Even if there is speed loss on some machines smaller tables are likely to lead to more predictable/uniform performance. Also, if scanner if fast enough with smaller tables, then using bigger tables is just waste. Of course it does not make sense to spent a lot of effort eliminating small waste, but with flex effort is just an extra command line switch to flex. -- Waldek Hebisch