From: gah4
Newsgroups: comp.compilers
Subject: Re: State-of-the-art algorithms for lexical analysis?
Date: Mon, 6 Jun 2022 12:25:56 -0700 (PDT)
Organization: Compilers Central
Message-ID: <22-06-014@comp.compilers>
References: <22-06-009@comp.compilers>
Keywords: lex, performance, comment

On Monday, June 6, 2022 at 8:06:28 AM UTC-7, Roger L Costello wrote:

(snip, I wrote)
> > I suspect that if regexes hadn't previously
> > been defined, we might come up with
> > something different today.

> Wow! That is a remarkable statement.

Well, mostly, regexes were defined based on what was reasonable
to do on computers at the time. It seems reasonable, then, with the
more powerful computers of today, to expect that more features would
have been added.

Some of that was done in the later ERE, Extended Regular Expressions.

But there is a strong tendency not to break backward compatibility,
and so not to add new features later.

[See my note about DFAs a few messages back. EREs are just syntactic sugar
on regular REs, so sure. PCREs are swell but they are a lot slower since
backreferences mean you need to be able to back up. -John]
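
To make the moderator's two points concrete, here is a minimal sketch in Python. It assumes Python's re module as a stand-in for an ERE/PCRE-style engine (it is a backtracking matcher, not a DFA), and the pattern names and sample strings are made up for illustration. The first part rewrites the ERE conveniences + and ? using only concatenation, alternation, and star, and checks on a few samples that both forms accept the same strings. The second part uses a backreference to match a doubled string ww, a language that is not regular, which is why such patterns cannot be compiled to a plain DFA.

```python
import re

# ERE conveniences (+ and ?) next to an equivalent pattern that uses
# only concatenation, alternation, and star.
SUGARED = r"ab+c?"
DESUGARED = r"abb*(c|)"

# Hypothetical sample strings, chosen by hand for this illustration.
SAMPLES = ["", "a", "ab", "abc", "abbb", "abbbc", "ac", "abcc", "b"]


def same_language(p, q, samples):
    """Return True if p and q agree (as full matches) on every sample."""
    return all(
        bool(re.fullmatch(p, s)) == bool(re.fullmatch(q, s)) for s in samples
    )


# A backreference: \1 must repeat exactly what group 1 matched, so this
# accepts strings of the form ww over {a, b}. The matcher may have to try
# several lengths for group 1 before one works, i.e. it has to back up.
DOUBLED = r"([ab]+)\1"

if __name__ == "__main__":
    print(same_language(SUGARED, DESUGARED, SAMPLES))
    print(bool(re.fullmatch(DOUBLED, "abab")))
    print(bool(re.fullmatch(DOUBLED, "abba")))
```

Agreement on a handful of samples is only a spot check, not a proof that the two patterns are equivalent; the identities r+ = rr* and r? = (r|empty) are what justify the rewrite.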