Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.programming > #897

Re: Deterministic Finite State Machine for regexs.

Path csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder3.hal-mli.net!news.panservice.it!feeder.news-service.com!border3.nntp.ams.giganews.com!Xl.tags.giganews.com!border1.nntp.ams.giganews.com!nntp.giganews.com!local2.nntp.ams.giganews.com!nntp.bt.com!news.bt.com.POSTED!not-for-mail
NNTP-Posting-Date Sun, 02 Oct 2011 06:21:39 -0500
From "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org>
Newsgroups comp.programming
References <9I4hq.4$yO1.1@newsfe21.iad>
Subject Re: Deterministic Finite State Machine for regexs.
Date Sun, 2 Oct 2011 12:13:19 +0100
X-Priority 3
X-MSMail-Priority Normal
X-Newsreader Microsoft Outlook Express 6.00.2900.5512
X-MimeOLE Produced By Microsoft MimeOLE V6.00.2900.5512
X-RFC2646 Format=Flowed; Response
Message-ID <ytWdnZP-mL_e1BXTnZ2dnUVZ7vSdnZ2d@bt.com> (permalink)
Lines 34
X-Usenet-Provider http://www.giganews.com
X-AuthenticatedUsername NoAuthUser
X-Trace sv3-xpO17Jn4hTENTxWJxSAHS87F/A+6EEW6llIj6FWmhTB+c2hEU/1f1L/txY+IGHcTHFm+WcduY/iKDoc!Wpn8c/XKcuTVb8xelBh0rcE6qNrmoUcTdOcT/VoLtt36Xx23qo7S++fbn+bhu7Fi8pOGIUgikhg=
X-Complaints-To abuse@btinternet.com
X-DMCA-Complaints-To abuse@btinternet.com
X-Abuse-and-DMCA-Info Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info Otherwise we will be unable to process your complaint properly
X-Postfilter 1.3.40
X-Original-Bytes 2834
Xref x330-a1.tempe.blueboxinc.net comp.programming:897

Show key headers only | View raw


Daniel Pitts wrote:

[on the subject of irregular extensions to regular language parsing]
> I was wondering if anyone has any thoughts on how to implement that kind
> of thing? I'd really like to the internal representation of a word
> boundary to be equivalent to (last != \w && current == \w || last == \w
> && current != \w)
>
> On an related note, I'd like to know if anyone has thoughts about
> supporting back-references and/or group-capturing with such a system.
> Its not as important for my current use-cases, but if I can build such a
> generic tool, it will almost certainly be useful in the future.

If you haven't already found Ross Cox's pages on regular expressions, then you 
may find it helpful.  A fair bit of it is going over the NFA/DFA stuff which 
you already know, but there are historical notes, and thoughts on 
implementations of non-regular features, plus descriptions of how his own 
library (RE2) implements such things.  Start with:

    http://swtch.com/~rsc/regexp/regexp3.html

BTW, if have my own little library too, and have faced (without actually 
bothering to solve -- yet) similar problems.  The approach I want to take, but 
haven't yet planned in enough detail to see if it'll even work let alone be 
acceptable fast, is to allow nodes in the NFA to be annotated with more-or-less 
arbitrary code (to be executed by the automaton) and which would (in some 
magical way) be preserved during translation to the DFA form (presumably it 
would normally result in larger DFA, since reachable state-sets couldn't be 
joined unless the code segments were equivalent along all paths leading to 
those states).

    -- chris 

Back to comp.programming | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Deterministic Finite State Machine for regexs. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-09-29 13:39 -0700
  Re: Deterministic Finite State Machine for regexs. "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2011-10-02 12:13 +0100
    Re: Deterministic Finite State Machine for regexs. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-10-02 16:13 -0700
  Re: Deterministic Finite State Machine for regexs. Rui Maciel <rui.maciel@gmail.com> - 2011-10-03 11:55 +0100

csiph-web