
Re: Utility to locate errors in regular expressions

Started by	Devin Jeanpierre <jeanpierreda@gmail.com>
First post	2013-05-24 09:13 -0400
Last post	2013-05-24 09:40 -0400
Articles	2 — 2 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.

  Re: Utility to locate errors in regular expressions Devin Jeanpierre <jeanpierreda@gmail.com> - 2013-05-24 09:13 -0400
    Re: Utility to locate errors in regular expressions Roy Smith <roy@panix.com> - 2013-05-24 09:40 -0400

#45885 — Re: Utility to locate errors in regular expressions

From	Devin Jeanpierre <jeanpierreda@gmail.com>
Date	2013-05-24 09:13 -0400
Subject	Re: Utility to locate errors in regular expressions
Message-ID	<mailman.2065.1369401265.3114.python-list@python.org>

On Fri, May 24, 2013 at 8:58 AM, Malte Forkel <malte.forkel@berlin.de> wrote:
> As a first step, I am looking for a parser for Python regular
> expressions, or a Python regex grammar to create a parser from.

The sre_parse module is undocumented but very usable.
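
For illustration, here is a minimal sketch of what the parser gives you. It assumes CPython: on 3.11 and later the parser lives at re._parser (sre_parse is kept only as a deprecated alias), and the pos attribute on re.error assumes 3.5 or later. The helper names describe and syntax_error_position are made up for this example, and since the module is undocumented the details may change between versions.

```python
import re

try:
    # Python 3.11+: the parser moved here; sre_parse is a deprecated alias.
    import re._parser as sre_parse
except ImportError:
    # Older Pythons, where re is a plain module.
    import sre_parse


def describe(pattern):
    """Return the top-level parsed items as (opcode name, argument) pairs."""
    parsed = sre_parse.parse(pattern)
    return [(op.name, arg) for op, arg in parsed]


def syntax_error_position(pattern):
    """Return the offset the parser reports for a syntax error, or None."""
    try:
        sre_parse.parse(pattern)
    except re.error as exc:
        return exc.pos
    return None


if __name__ == "__main__":
    print(describe("ab"))
    print(syntax_error_position("a(b"))
```

Note that this only locates syntax errors in the pattern itself. It says nothing about why a syntactically valid pattern fails to match a given string.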

> But maybe my idea is flawed? Or does a similar (or better) tool
> already exist? Any advice will be highly appreciated!

I think your task is made problematic by the possibility that no
single part of the regexp causes a match failure. What causes failure
depends on what branches are chosen with the |, *, +, ?, etc.
operators -- it might be a different character/subexpression for each
branch. And then there are exponentially many possible branches.
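
A toy sketch of the point about branches, restricted to an alternation of plain literals (real patterns are far messier). The helper branch_failure_points is hypothetical; it just records how many leading characters of each alternative agree with the text before the first mismatch.

```python
import re


def branch_failure_points(branches, text):
    """For each literal alternative, count the leading characters that
    agree with text before the first mismatch."""
    points = {}
    for branch in branches:
        i = 0
        while i < len(branch) and i < len(text) and branch[i] == text[i]:
            i += 1
        points[branch] = i
    return points


if __name__ == "__main__":
    branches = ["cat", "cow", "dog"]
    text = "cob"
    # The combined pattern cat|cow|dog does not match at all...
    print(re.match("|".join(branches), text))
    # ...but each alternative gives up at a different offset.
    print(branch_failure_points(branches, text))
```

Here every alternative fails at a different place, so there is no single offset in the pattern to blame.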

-- Devin


#45888

From	Roy Smith <roy@panix.com>
Date	2013-05-24 09:40 -0400
Message-ID	<roy-FF225E.09401224052013@news.panix.com>
In reply to	#45885

In article <mailman.2065.1369401265.3114.python-list@python.org>,
 Devin Jeanpierre <jeanpierreda@gmail.com> wrote:

> On Fri, May 24, 2013 at 8:58 AM, Malte Forkel <malte.forkel@berlin.de> wrote:
> > As a first step, I am looking for a parser for Python regular
> > expressions, or a Python regex grammar to create a parser from.
> 
> The sre_parse module is undocumented but very usable.
> 
> > But maybe my idea is flawed? Or does a similar (or better) tool
> > already exist? Any advice will be highly appreciated!
> 
> I think your task is made problematic by the possibility that no
> single part of the regexp causes a match failure. What causes failure
> depends on what branches are chosen with the |, *, +, ?, etc.
> operators -- it might be a different character/subexpression for each
> branch. And then there are exponentially many possible branches.

That's certainly true.  The full power of regex makes stuff like this 
very hard to do in the general case.  That being said, people tend to 
write regexen which match hunks of text from left to right.

So, in theory, it's probably an intractable problem.  But, in practice, 
such a tool would actually be useful in a large set of real-life cases.
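
One way to exploit the left-to-right habit is a rough heuristic like the sketch below: try every prefix of the pattern that compiles on its own and remember the longest one that still matches at the start of the text. The function name longest_matching_prefix is made up for this example, and the approach is only an approximation. Prefixes that cut a group or character class in half are skipped, and a prefix ending in | matches trivially, so the result can mislead on patterns with alternation.

```python
import re


def longest_matching_prefix(pattern, text):
    """Return the length of the longest prefix of pattern that is a
    valid regex by itself and matches at the start of text."""
    best = 0
    for end in range(1, len(pattern) + 1):
        try:
            compiled = re.compile(pattern[:end])
        except re.error:
            continue  # this prefix is not a complete regex; skip it
        if compiled.match(text):
            best = end
    return best


if __name__ == "__main__":
    pattern = r"\d+-\d+-x"
    text = "12-34-y"
    n = longest_matching_prefix(pattern, text)
    print("matches up to:", pattern[:n])
    print("suspect part: ", pattern[n:])
```

For simple patterns of this kind, the remainder after the longest matching prefix points at the part worth inspecting first.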

