Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.programming > #897
| From | "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> |
|---|---|
| Newsgroups | comp.programming |
| References | <9I4hq.4$yO1.1@newsfe21.iad> |
| Subject | Re: Deterministic Finite State Machine for regexs. |
| Date | 2011-10-02 12:13 +0100 |
| Message-ID | <ytWdnZP-mL_e1BXTnZ2dnUVZ7vSdnZ2d@bt.com> (permalink) |
Daniel Pitts wrote:
[on the subject of irregular extensions to regular language parsing]
> I was wondering if anyone has any thoughts on how to implement that kind
> of thing? I'd really like to the internal representation of a word
> boundary to be equivalent to (last != \w && current == \w || last == \w
> && current != \w)
>
> On an related note, I'd like to know if anyone has thoughts about
> supporting back-references and/or group-capturing with such a system.
> Its not as important for my current use-cases, but if I can build such a
> generic tool, it will almost certainly be useful in the future.
If you haven't already found Ross Cox's pages on regular expressions, then you
may find it helpful. A fair bit of it is going over the NFA/DFA stuff which
you already know, but there are historical notes, and thoughts on
implementations of non-regular features, plus descriptions of how his own
library (RE2) implements such things. Start with:
http://swtch.com/~rsc/regexp/regexp3.html
BTW, if have my own little library too, and have faced (without actually
bothering to solve -- yet) similar problems. The approach I want to take, but
haven't yet planned in enough detail to see if it'll even work let alone be
acceptable fast, is to allow nodes in the NFA to be annotated with more-or-less
arbitrary code (to be executed by the automaton) and which would (in some
magical way) be preserved during translation to the DFA form (presumably it
would normally result in larger DFA, since reachable state-sets couldn't be
joined unless the code segments were equivalent along all paths leading to
those states).
-- chris
Back to comp.programming | Previous | Next — Previous in thread | Next in thread | Find similar
Deterministic Finite State Machine for regexs. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-09-29 13:39 -0700
Re: Deterministic Finite State Machine for regexs. "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2011-10-02 12:13 +0100
Re: Deterministic Finite State Machine for regexs. Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2011-10-02 16:13 -0700
Re: Deterministic Finite State Machine for regexs. Rui Maciel <rui.maciel@gmail.com> - 2011-10-03 11:55 +0100
csiph-web