Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #2000

Re: Regular expression string searching & matching

From Ben Hanson <jamin.hanson@googlemail.com>
Newsgroups comp.compilers
Subject Re: Regular expression string searching & matching
Date 2018-03-13 11:30 -0700
Organization Compilers Central
Message-ID <18-03-054@comp.compilers> (permalink)
References (1 earlier) <18-03-032@comp.compilers> <18-03-034@comp.compilers> <18-03-035@comp.compilers> <18-03-041@comp.compilers> <18-03-045@comp.compilers>

Show all headers | View raw


> It also looks like you are running a DFA minimizer (like Hopcroft) on your
> result since I am not producing a minimal DFA. That also may help me figure
> out if I'm producing the right automaton because they'd match...

Actually, although I have a minimising routine I'm not applying it here.

I intersect every character class in the regex to produce non-overlapping sets
and then these are grouped in equivalence sets for each state in the DFA. This
is the part that I didn't find in the Dragon Book (or any other compiler book
I looked in) and it was a message on this group from Vern Paxon that tipped me
off: https://compilers.iecc.com/comparch/article/94-10-091

(I only paid attention to the equivalence classes bit by the way!)

I intersect the equivalence sets as I build the DFA and I have found with this
technique you kind of have to go out of your way to get a non minimal DFA by
default. I've imagined an even more rigorous approach where character classes
are intersected as they are added to the syntax tree up front, but I've not
got around to trying to implement it.

I'd be interested to see your DFA output once you've cleaned it up a bit.
Using derivatives is interesting too - I've seen it discussed but never tried
to get into that.

Regards,

Ben

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-04 01:37 -0800
  Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-07 11:53 -0800
    Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-07 12:18 -0800
    Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-08 22:53 -0800
      Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-10 00:57 -0800
        Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-11 13:52 -0700
          Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-12 14:00 -0700
            Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-13 11:30 -0700
              Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-17 16:52 -0700
              Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-18 19:23 +0000
              Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-20 17:23 +0000
                Re: Regular expression string searching & matching Clint O <clint.olsen@gmail.com> - 2018-03-22 17:46 +0000
          Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-12 15:46 -0700
          Re: Regular expression string searching & matching Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2018-03-13 10:53 +0100
            Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-13 14:23 -0700
      Re: Regular expression string searching & matching Ben Hanson <jamin.hanson@googlemail.com> - 2018-03-10 03:17 -0800

csiph-web