Re: Regular expression string searching & matching

Path csiph.com!xmission!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From Clint O <clint.olsen@gmail.com>
Newsgroups comp.compilers
Subject Re: Regular expression string searching & matching
Date Thu, 22 Mar 2018 17:46:03 GMT
Organization Newshosting.com - Highest quality at a great price! www.newshosting.com
Lines 49
Sender news@iecc.com
Approved comp.compilers@iecc.com
Message-ID <18-03-089@comp.compilers> (permalink)
References <18-03-016@comp.compilers> <18-03-032@comp.compilers> <18-03-034@comp.compilers> <18-03-035@comp.compilers> <18-03-041@comp.compilers> <18-03-045@comp.compilers> <18-03-054@comp.compilers> <18-03-087@comp.compilers>
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
Content-Transfer-Encoding 8bit
Injection-Info gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="95950"; mail-complaints-to="abuse@iecc.com"
Keywords lex, DFA
Posted-Date 22 Mar 2018 20:43:53 EDT
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:2021


On 2018-03-20, Clint O <clint.olsen@gmail.com> wrote:
> [ reposted to try to make the special characters look right ]
>
> q0: /·[*]·([^*] | [*]+·[^/])*·[*]+·/
>     [/] q2
>     ['\x00'-.0-ÿ] q1
> q1: ∅
>     ['\x00'-ÿ] q1
> q2: [*]·([^*] | [*]+·[^/])*·[*]+·/
>     [*] q3
>     ['\x00'-)+-ÿ] q1
> q3: ([^*] | [*]+·[^/])*·[*]+·/
>     [*] q4
>     ['\x00'-)+-ÿ] q3
> q4: ([*]*·[^/]·([^*] | [*]+·[^/])*·[*]+ | [*]*)·/
>     [*] q6
>     ['\x00'-)+-.0-ÿ] q3
>     [/] q5
> q5: ε
>     ['\x00'-ÿ] q1
> q6: (([*]*·[^/] | ε)·([^*] | [*]+·[^/])*·[*]+ | [*]*)·/
>     [*] q8
>     ['\x00'-)+-.0-ÿ] q3
>     [/] q7
> q7: ([^*] | [*]+·[^/])*·[*]+·/ | ε
>     [*] q4
>     ['\x00'-)+-ÿ] q3
> q8: ((([*]*·[^/] | ε)·([^*] | [*]+·[^/])* | [*]*·[^/]·([^*] | [*]+·[^/])*)·[*]+ | [*]*)·/
>     [*] q8
>     ['\x00'-)+-.0-ÿ] q3
>     [/] q7

Thanks John for reposting this. It looks much better now.

In summary:

q5 and q7 are accepting states, since their expressions are nullable (they
match ε). q1, whose expression is ∅, is the error state.

The key to success with this algorithm is recognizing derivatives that have
already been calculated. Once taking derivatives of the known states produces
no new expressions, the DFA construction terminates. As you can see, the
expressions can get hairy pretty quickly, and I don't know how much you can
glean from the successive expressions generated. It's akin to the method of
walking the parse tree, but it bypasses the NFA construction entirely.
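
For anyone who wants to experiment, here is a minimal Python sketch of the
construction described above. It is not the code that produced the listing.
The tuple encoding, the helper names (chars, cat, alt, star, nullable, deriv,
build_dfa, matches) and the three-symbol alphabet "/*a" are all assumptions
made for illustration, and it takes one derivative per symbol rather than per
character class, so its state numbering will not line up with q0..q8.

```python
# Regexes are nested tuples (an assumed encoding, chosen for hashability).
EMPTY = ("empty",)   # the empty language (error state)
EPS = ("eps",)       # matches only the empty string

def chars(s):
    return ("set", frozenset(s))

def cat(r, s):
    if r == EMPTY or s == EMPTY:
        return EMPTY
    if r == EPS:
        return s
    return r if s == EPS else ("cat", r, s)

def alt(r, s):
    # A frozenset makes alternation associative, commutative and idempotent,
    # which is what lets equal derivatives be recognized as equal.
    parts = set()
    for x in (r, s):
        if x[0] == "alt":
            parts |= x[1]
        elif x != EMPTY:
            parts.add(x)
    if not parts:
        return EMPTY
    return next(iter(parts)) if len(parts) == 1 else ("alt", frozenset(parts))

def star(r):
    if r in (EMPTY, EPS):
        return EPS
    return r if r[0] == "star" else ("star", r)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    if tag == "alt":
        return any(nullable(x) for x in r[1])
    return False  # "empty" and "set" never match the empty string

def deriv(r, c):
    tag = r[0]
    if tag == "set":
        return EPS if c in r[1] else EMPTY
    if tag == "cat":
        d = cat(deriv(r[1], c), r[2])
        return alt(d, deriv(r[2], c)) if nullable(r[1]) else d
    if tag == "alt":
        out = EMPTY
        for x in r[1]:
            out = alt(out, deriv(x, c))
        return out
    if tag == "star":
        return cat(deriv(r[1], c), r)
    return EMPTY  # derivative of "empty" or "eps"

def build_dfa(regex, alphabet):
    states = {regex: 0}          # expression -> state number
    trans = {}                   # (state, symbol) -> state
    work = [regex]
    while work:                  # stops when no new derivative shows up
        r = work.pop()
        for c in alphabet:
            d = deriv(r, c)
            if d not in states:
                states[d] = len(states)
                work.append(d)
            trans[states[r], c] = states[d]
    accepting = {i for r, i in states.items() if nullable(r)}
    return trans, accepting, states.get(EMPTY)

def matches(dfa, text):
    trans, accepting, _ = dfa
    q = 0
    for c in text:               # symbols outside the alphabet raise KeyError
        q = trans[q, c]
    return q in accepting

# The comment regex from the listing, over the toy alphabet "/*a".
ALPHABET = "/*a"
SLASH, STAR = chars("/"), chars("*")
NOT_STAR, NOT_SLASH = chars("/a"), chars("*a")
stars = cat(STAR, star(STAR))                        # [*]+
body = star(alt(NOT_STAR, cat(stars, NOT_SLASH)))    # ([^*] | [*]+.[^/])*
COMMENT = cat(SLASH, cat(STAR, cat(body, cat(stars, SLASH))))
dfa = build_dfa(COMMENT, ALPHABET)
```

The states dictionary is where previously calculated derivatives are
recognized: an expression becomes a new state only if it is not already a
key. Accepting states are exactly the nullable expressions, and the state
whose expression is EMPTY plays the role of the error state.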

Thanks,

-Clint
