From: Devin Jeanpierre
Date: Fri, 24 May 2013 09:13:41 -0400
Subject: Re: Utility to locate errors in regular expressions
To: Malte Forkel
Cc: python-list@python.org
Newsgroups: comp.lang.python

On Fri, May 24, 2013 at 8:58 AM, Malte Forkel wrote:
> As a first step, I am looking for a parser for Python regular
> expressions, or a Python regex grammar to create a parser from.

The sre_parse module is undocumented, but very usable.

> But may be my idea is flawed? Or a similar (or better) tools already
> exists? Any advice will be highly appreciated!

I think your task is made problematic by the possibility that no
single part of the regexp causes a match failure. What causes failure
depends on which branches are chosen with the |, *, +, ?, etc.
operators -- it might be a different character/subexpression for each
branch. And then there are exponentially many possible branches.

-- Devin
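
To make both points concrete, here is a rough sketch (not part of the original message) that parses a pattern with sre_parse and multiplies out the alternatives it finds. The helper name count_paths and the sample pattern are made up for illustration. Since sre_parse is undocumented, the tuple layouts used here are an assumption based on how CPython currently builds the parse tree, and newer Pythons expose the module as re._parser. The count is only a lower bound: a repeat is counted as a single pass through its body, and lookarounds and conditionals are ignored.

```python
try:
    from re import _parser as sre_parse  # Python 3.11+ (private module)
except ImportError:
    import sre_parse  # older Pythons


def count_paths(subpattern):
    """Lower bound on the number of distinct ways through a parsed regex.

    Hypothetical helper: relies on the undocumented (op, argument) pairs
    that sre_parse produces, so the layout may differ between versions.
    """
    total = 1
    for op, av in subpattern:
        if op == sre_parse.BRANCH:
            # av is (None, [alternative, ...]); alternatives add up.
            total *= sum(count_paths(alt) for alt in av[1])
        elif op == sre_parse.SUBPATTERN:
            # The nested pattern is the last element of av.
            total *= count_paths(av[-1])
        elif op in (sre_parse.MAX_REPEAT, sre_parse.MIN_REPEAT):
            # av is (min, max, body); count one pass through the body only.
            total *= count_paths(av[2])
    return total


# Three groups with two alternatives each: the choices multiply.
tree = sre_parse.parse(r"(ab|cd)(ef|gh)(ij|kl)")
paths = count_paths(tree)
print(paths)
```

Each additional two-way group doubles the count, which is the exponential growth mentioned above. Note that the parser may rewrite simple alternations such as a|b into a character set, so single-character alternatives would not show up as branches in this count.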