From: Clint O
Newsgroups: comp.compilers
Subject: Re: Regular expression string searching & matching
Date: Thu, 8 Mar 2018 22:53:37 -0800 (PST)
Organization: Compilers Central
Message-ID: <18-03-034@comp.compilers>
References: <18-03-016@comp.compilers> <18-03-032@comp.compilers>
Keywords: lex, DFA, comment
Posted-Date: 09 Mar 2018 09:47:07 EST

Hi Ben:

Thanks for your post. I did try your regular expression (and a few small
variations on it), but it exhibits the same behavior as the others I have
tried. The difference with the complement version is that the accepting
state I end up with has all transitions going to the error state (which
guarantees termination after a match), whereas these still seem to accept
characters even after matching the closing '*/'.

It's possible I have a bug in my implementation, so I'm still looking at it.

Thanks,

-Clint

On Wednesday, March 7, 2018 at 11:59:10 AM UTC-8, Ben Hanson wrote:
> [/][*]([^*]|[*]+[^*/])*[*]+[/]
>
> is what you are looking for. I ran into this when developing my lexer
> generator library lexertl in C++. Having a debug::dump() function
> really helped me grok what was going on.
>
> The trick of course is realising that you have to exclude the
> characters that follow (i.e. the [^*/] part). That is the bit that
> clobbers the greedy behaviour. I've had to remind myself of that on
> more than one occasion recently!

[This should work, it's a standard example in compiler texts. -John]
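
For anyone checking their own generator against this, here is a minimal
sketch of a DFA written out by hand for the quoted expression, with Python's
re module as a cross-check. The state names, step() and longest_match() are
made up for illustration; they are not taken from lexertl or from Clint's
implementation. In this hand-built machine the accepting state has no
outgoing transitions, so scanning stops at the first closing '*/'.

```python
import re

# The expression from Ben Hanson's post, copied unchanged.
COMMENT_RE = re.compile(r"[/][*]([^*]|[*]+[^*/])*[*]+[/]")

# Hypothetical state names for a hand-built DFA of the same language.
START, SLASH, BODY, STARS, ACCEPT = range(5)


def step(state, ch):
    """Return the next state, or None for the error state."""
    if state == START:
        return SLASH if ch == "/" else None
    if state == SLASH:
        return BODY if ch == "*" else None
    if state == BODY:
        # Inside the comment: a '*' might start the closing delimiter.
        return STARS if ch == "*" else BODY
    if state == STARS:
        if ch == "*":
            return STARS   # still in a run of '*'
        if ch == "/":
            return ACCEPT  # saw '*/'
        return BODY        # the [^*/] case: back to the comment body
    # ACCEPT: every transition goes to the error state.
    return None


def longest_match(text):
    """Length of the longest prefix of text the DFA accepts, or None."""
    state, last_accept = START, None
    for i, ch in enumerate(text):
        state = step(state, ch)
        if state is None:
            break
        if state == ACCEPT:
            last_accept = i + 1
    return last_accept


if __name__ == "__main__":
    sample = "/* a */ b */"
    print(longest_match(sample), COMMENT_RE.match(sample).end())
```

If a generated DFA for the same expression still has live transitions out of
its accepting state, comparing its dump against a table like this one may
help narrow down where the construction diverges.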