From: Ben Hanson
Newsgroups: comp.compilers
Subject: Re: Regular expression string searching & matching
Date: Wed, 7 Mar 2018 12:18:23 -0800 (PST)
Organization: Compilers Central
Message-ID: <18-03-033@comp.compilers>
References: <18-03-016@comp.compilers> <18-03-032@comp.compilers>
Keywords: DFA, lex

I missed your question about non-greedy repeats. Yes, it is possible.
See build_dfa() in generator.hpp from lexertl. Basically, non-greedy
transitions are snipped when building the DFA.

I build a regex syntax tree as suggested in the Dragon Book and keep
track of greedy flags in the tree. Those flags are passed down to
partition/equivset.hpp and from there to the generator.

The thing you have to be careful about is respecting that the left side
takes priority (i.e. the regex or sub-regex that came first).

Regards,

Ben
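
To make the "snipping" idea concrete, here is a minimal sketch. It is not lexertl's code: State, Dfa, snip_non_greedy(), match_length() and make_dfa() are hypothetical names invented for illustration, and the DFA for a.*b over the alphabet {a, b, x} is built by hand rather than from a syntax tree. The only assumption is that each accepting state carries a flag saying it was reached through a non-greedy repeat. Removing the outgoing transitions of such a state stops a longest-match scanner at the first accept.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified DFA state; not lexertl's real data structure.
struct State {
    bool accepting = false;
    bool non_greedy = false;           // accept reached via a non-greedy repeat
    std::map<char, std::size_t> next;  // outgoing transitions
};

using Dfa = std::vector<State>;

// "Snip" the outgoing transitions of accepting states marked non-greedy,
// so a scan cannot continue past the first accept.
void snip_non_greedy(Dfa& dfa) {
    for (State& s : dfa) {
        if (s.accepting && s.non_greedy) s.next.clear();
    }
}

// Longest-match scan from the start of input.
// Returns the matched length, or 0 if nothing matched.
std::size_t match_length(const Dfa& dfa, const std::string& input) {
    assert(!dfa.empty());
    std::size_t state = 0;
    std::size_t best = 0;
    for (std::size_t i = 0; i < input.size(); ++i) {
        auto it = dfa[state].next.find(input[i]);
        if (it == dfa[state].next.end()) break;  // no transition: stop
        state = it->second;
        if (dfa[state].accepting) best = i + 1;  // remember the latest accept
    }
    return best;
}

// Hand-built DFA for a.*b over {a, b, x}.
// State 0: start. State 1: inside ".*". State 2: just saw 'b' (accepting).
Dfa make_dfa(bool non_greedy) {
    Dfa dfa(3);
    dfa[0].next = {{'a', 1}};
    dfa[1].next = {{'a', 1}, {'x', 1}, {'b', 2}};
    dfa[2].next = {{'a', 1}, {'x', 1}, {'b', 2}};
    dfa[2].accepting = true;
    dfa[2].non_greedy = non_greedy;
    return dfa;
}
```

On the input "axbxb", the unsnipped (greedy) DFA matches all five characters, while the snipped one stops after "axb". A real generator would make this decision while constructing the states from the flagged syntax tree instead of patching a finished table.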