Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #3546

Re: Interesting paper on regex NFA matching

Path csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From cross@spitfire.i.gajendra.net
Newsgroups comp.compilers
Subject Re: Interesting paper on regex NFA matching
Date Fri, 26 Jan 2024 04:15:30 -0000
Organization Compilers Central
Sender johnl%iecc.com
Approved comp.compilers@iecc.com
Message-ID <24-01-004@comp.compilers> (permalink)
References <24-01-003@comp.compilers>
MIME-Version 1.0
Content-Type text/plain; charset="UTF-8"
Injection-Info gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="99463"; mail-complaints-to="abuse@iecc.com"
Keywords lex, NFA, performance
Posted-Date 26 Jan 2024 10:17:58 EST
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:3546

Show key headers only | View raw


In article <24-01-003@comp.compilers>, John R Levine  <johnl@taugh.com> wrote:
>The easiest way to implement regular expressions is to turn them into
>NFAs, but that has well known problems that if you feed hostile input
>to the NFAs, they can take exponential time and it's a way to DDoS the
>server. If you translate the NFA to a DFA it runs in linear time, but
>if the regex is ambiguous, as many are, the DFA may match something
>different from the NFA.

NFAs won't take exponential time if implemented carefully,
though they may be superlinear.  Still, we're talking quadratic,
not exponential.

Backtracking implementations can take exponential time, but
those aren't all NFAs.

Converting an NFA to a DFA could be exponential, however.

https://swtch.com/~rsc/regexp/regexp1.html

>This paper on the Arxiv preprint server proposes some rather complex
>tweaks to NFA regex matching to fix many performance issues while giving
>the same answers as before.
>
>https://arxiv.org/abs/2401.12639

If I'm reading the abstract right, they modify a backtracking
implementation so that they can efficiently match some extended
regex's.  Note however, that such so-called "extended" regexs
are not really regular expressions in the formal sense; that is,
they do not describe members of the regular languages, but
rather languages beyond those expressible as a DFA.  Regardless
this work is interesting, since AFAIK the speedup for this
class of regex to linear time is novel.  Thanks for passing it
on.

	- Dan C.

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Interesting paper on regex NFA matching John R Levine <johnl@taugh.com> - 2024-01-25 22:10 -0500
  Re: Interesting paper on regex NFA matching cross@spitfire.i.gajendra.net - 2024-01-26 04:15 +0000
  Re: Interesting paper on regex NFA matching Kaz Kylheku <433-929-6894@kylheku.com> - 2024-01-26 04:16 +0000

csiph-web