From: Clint O
Newsgroups: comp.compilers
Subject: Re: Regular expression string searching & matching
Date: Mon, 12 Mar 2018 14:00:48 -0700 (PDT)
Organization: Compilers Central
Message-ID: <18-03-045@comp.compilers>
References: <18-03-016@comp.compilers> <18-03-032@comp.compilers> <18-03-034@comp.compilers> <18-03-035@comp.compilers> <18-03-041@comp.compilers>
Keywords: lex

On Monday, March 12, 2018 at 1:19:29 PM UTC-7, Ben Hanson wrote:
> > /This/ actually worked for me (one character change):
> >
> > [/][*]([^*]|[*]+[^/])*[*]+[/]
>
> Your modified regex produces the following state machine:
> [snip]
>
> Which will match
>
> /***/a*/
>
> in its entirety, when it should only match
>
> /***/
>
> Regards,
>
> Ben
> [Doesn't that depend on whether you interpret the END STATE in state 6 to stop even
> if there's more input? -John]

Interesting. I'm not seeing this behavior with the sample input you've provided. Again, I'm willing to concede that I have a bug :)

What I'm doing is simulating the DFA until I get to an error state or I hit EOF. So, this guarantees I'll record the longest match I've found.

I could post the states that I come up with, but my state dumper also prints out the RE it's currently processing (the actual expression).
The successive computation of derivatives can sometimes produce rather abhorrent output, and it's not always obvious (to me) what's going on. I'll work on a cleaner presentation and try to post it.

It also looks like you are running a DFA minimizer (like Hopcroft's algorithm) on your result, since I am not producing a minimal DFA. Minimizing mine may also help me figure out whether I'm producing the right automaton, because the two minimal DFAs should then match (up to state renaming).

Thanks,

-Clint
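
For anyone following along, here is a minimal sketch of the "simulate until an error state or EOF, remembering the last accepting position" loop described above, driven directly by Brzozowski derivatives rather than a prebuilt DFA. It is not my actual implementation: every name in it (`Chars`, `deriv`, `longest_match`, `COMMENT`, and so on) is made up for illustration, and the simplification rules are only the bare minimum needed to recognize the empty-set (error) state.

```python
from dataclasses import dataclass

# Regex AST nodes. All names here are hypothetical, for illustration only.
@dataclass(frozen=True)
class Empty:            # matches nothing: plays the role of the error state
    pass

@dataclass(frozen=True)
class Eps:              # matches only the empty string
    pass

@dataclass(frozen=True)
class Chars:            # character class; negated=True means [^...]
    chars: frozenset
    negated: bool = False

@dataclass(frozen=True)
class Cat:
    left: object
    right: object

@dataclass(frozen=True)
class Alt:
    left: object
    right: object

@dataclass(frozen=True)
class Star:
    inner: object

# Smart constructors: just enough simplification that a dead derivative
# always collapses to Empty().
def cat(a, b):
    if isinstance(a, Empty) or isinstance(b, Empty):
        return Empty()
    if isinstance(a, Eps):
        return b
    if isinstance(b, Eps):
        return a
    return Cat(a, b)

def alt(a, b):
    if isinstance(a, Empty):
        return b
    if isinstance(b, Empty) or a == b:
        return a
    return Alt(a, b)

def star(a):
    if isinstance(a, (Empty, Eps)):
        return Eps()
    return a if isinstance(a, Star) else Star(a)

def nullable(r):
    """True if r accepts the empty string (i.e. the state is accepting)."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, Cat):
        return nullable(r.left) and nullable(r.right)
    if isinstance(r, Alt):
        return nullable(r.left) or nullable(r.right)
    return False        # Empty, Chars

def deriv(r, c):
    """Brzozowski derivative of r with respect to the character c."""
    if isinstance(r, Chars):
        return Eps() if (c in r.chars) != r.negated else Empty()
    if isinstance(r, Cat):
        d = cat(deriv(r.left, c), r.right)
        return alt(d, deriv(r.right, c)) if nullable(r.left) else d
    if isinstance(r, Alt):
        return alt(deriv(r.left, c), deriv(r.right, c))
    if isinstance(r, Star):
        return cat(deriv(r.inner, c), r)
    return Empty()      # Empty, Eps

def longest_match(r, text):
    """Length of the longest prefix of text matched by r, or None."""
    best = 0 if nullable(r) else None
    for i, c in enumerate(text):
        r = deriv(r, c)
        if isinstance(r, Empty):    # error state: no longer match is possible
            break
        if nullable(r):             # accepting state: record it, keep going
            best = i + 1
    return best

def lit(s):
    return Chars(frozenset(s))

def neg(s):
    return Chars(frozenset(s), True)

def plus(a):
    return cat(a, star(a))

# [/][*]([^*]|[*]+[^/])*[*]+[/]
COMMENT = cat(lit("/"), cat(lit("*"), cat(
    star(alt(neg("*"), cat(plus(lit("*")), neg("/")))),
    cat(plus(lit("*")), lit("/")))))

print(longest_match(COMMENT, "/***/a*/"))   # 8
print(longest_match(COMMENT, "/***/"))      # 5
```

With this loop the quoted expression consumes all eight characters of /***/a*/, because the `**` after the opening `/*` can be read as `[*]+[^/]`. Stopping at the first accepting state instead of running on to the error state or EOF would report 5, which is the distinction John's note is pointing at.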