MIME-Version: 1.0
In-Reply-To: <knno5h$t9f$1@ger.gmane.org>
References: <knno5h$t9f$1@ger.gmane.org>
From: Devin Jeanpierre <jeanpierreda@gmail.com>
Date: Fri, 24 May 2013 09:13:41 -0400
Subject: Re: Utility to locate errors in regular expressions
To: Malte Forkel <malte.forkel@berlin.de>
Content-Type: text/plain; charset=UTF-8
Cc: python-list@python.org
Precedence: list
Newsgroups: comp.lang.python
Message-ID: <mailman.2065.1369401265.3114.python-list@python.org>

On Fri, May 24, 2013 at 8:58 AM, Malte Forkel <malte.forkel@berlin.de> wrote:
> As a first step, I am looking for a parser for Python regular
> expressions, or a Python regex grammar to create a parser from.

The sre_parse module is undocumented, but very usable.
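
As a rough sketch of what that looks like (the walk() helper is my own
name, not part of the module, and since sre_parse is undocumented the
node layout may differ between Python versions; newer releases keep the
parser in the private re._parser module, hence the fallback import):

```python
try:
    # Python 3.11+ keeps the parser in a private submodule of re.
    import re._parser as sre_parse
except ImportError:
    # Older versions expose it as a top-level module.
    import sre_parse


def walk(node, depth=0):
    """Yield (depth, opcode name) for each node in a parsed pattern.

    Hypothetical helper: it only descends into nested SubPattern
    objects, so members of a character class are not listed.
    """
    if isinstance(node, sre_parse.SubPattern):
        for op, av in node:
            yield depth, repr(op)
            yield from walk(av, depth + 1)
    elif isinstance(node, (tuple, list)):
        # Arguments such as (min, max, subpattern) may hold nested subpatterns.
        for child in node:
            yield from walk(child, depth)


for depth, name in walk(sre_parse.parse(r"ab+")):
    print("  " * depth + name)
```

For r"ab+" this should print LITERAL, MAX_REPEAT, and an indented
LITERAL for the repeated "b".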

> But maybe my idea is flawed? Or does a similar (or better) tool
> already exist? Any advice will be highly appreciated!

I think your task is complicated by the possibility that no single
part of the regexp causes a match failure. What causes the failure
depends on which branches are chosen at the |, *, +, ?, etc.
operators: it might be a different character or subexpression for each
branch. And there can be exponentially many combinations of branches.
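
A toy illustration of the point, not a real regex engine: each group
below is modelled as a list of literal alternatives, roughly standing
in for (?:ab|cd)(?:x|yy)(?:1|2). The names first_mismatch, groups, and
report are made up for this sketch.

```python
from itertools import product


def first_mismatch(literal, text):
    """Return the index where literal stops matching text, or None."""
    for i, ch in enumerate(literal):
        if i >= len(text) or text[i] != ch:
            return i
    return None


# Three two-way choices give 2 * 2 * 2 = 8 distinct paths.
groups = [["ab", "cd"], ["x", "yy"], ["1", "2"]]
paths = ["".join(choice) for choice in product(*groups)]

# Each path can fail at a different position for the same input.
report = {path: first_mismatch(path, "abx3") for path in paths}
for path, pos in report.items():
    print(path, "fails at index", pos)
```

Here the paths fail at index 0, 2, or 3 depending on which alternatives
were taken, so there is no single "broken" spot to report, and every
additional two-way choice doubles the number of paths to consider.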

-- Devin