Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #9752

Re: a little parsing challenge ☺

Message-ID <3038846.NWcSDcC0EJ@PointedEars.de> (permalink)
From Thomas 'PointedEars' Lahn <PointedEars@web.de>
Organization PointedEars Software (PES)
Date 2011-07-17 23:43 +0200
Subject Re: a little parsing challenge ☺
Newsgroups comp.lang.python
References <36037253-086b-4467-a1db-9492d3772e78@r5g2000prf.googlegroups.com> <mailman.1166.1310902484.1164.python-list@python.org> <2396442.Iyju66GlRV@PointedEars.de>
Followup-To comp.lang.python

Followups directed to: comp.lang.python

Show all headers | View raw


Thomas 'PointedEars' Lahn wrote:

> It is possible [to parse the parentheses language], with Perl-compatible
> Regular Expressions (PCRE), provided that you have enough memory, to use
> such an extended Regular Expression (not to be confused with EREs³)⁴:
> 
>   \((([^()]*|(?R))*)\)
> 
> However, even Python 3.2 does not support those expressions (although it
> supports some other PCRE patterns, like named subexpressions)⁵, neither do
> standard and forked versions of sed(1) (BREs, EREs, using an NFA) nor awk
> (EREs, using a DFA or NFA).  [That is not to say it would not be possible
> with Python, or sed or awk (both of which are off-topic here), but that
> more than a Regular Expression would be required.]

Supplemental: Further research shows that the Python `re' module is not 
going to implement (PCRE) recursive Regular Expressions.  The maintainer's, 
Matthew Barnett's, argument (of 2009-03-24) is that such things are better 
left to modules such as `pyparsing' [1][2].

However, FWIW, here is the Python port of the start of a language parser 
originally written in (and for) ECMAScript:

import re

def matchingBraces(s):
    level = 0

    for match in re.finditer(r'[{}]', s):
        paren = match.group(0)

        if paren == "{":
            level += 1
        else:
            if level == 0: return False
            level -= 1

    return level == 0

As you can see, the theoretically necessary PDA¹ implementation can be 
simplified to a braces counter with range checks by iterative use of a 
Regular Expression.  Extensions to meet the "challenge" are left as an 
exercise to the reader.

It has also occurred to me that because parentheses (`(', `)') and brackets 
(`[', `]') have special meaning in Regular Expressions (grouping and 
character classes, respectively), you could escape all other special 
characters in a text and use the RE evaluator itself to find out whether 
they are properly nested, having it evaluate one large RE.  But I have not 
tested this idea yet.  (Obviously it cannot be used to satisfy the 
"challenge"'s condition that braces – `{', `}' – and other parenthesis-like 
characters are to be considered as well, and that the parenthesis-like 
characters are to be printed.)

______
¹ Pushdown automaton

References:
[1] <http://bugs.python.org/issue694374>
[2] <http://pyparsing.wikispaces.com/>
-- 
PointedEars

Bitte keine Kopien per E-Mail. / Please do not Cc: me.

Back to comp.lang.python | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-17 00:47 -0700
  Re: a little parsing challenge ☺ Raymond Hettinger <python@rcn.com> - 2011-07-17 02:48 -0700
    Re: a little parsing challenge ☺ Robert Klemme <shortcutter@googlemail.com> - 2011-07-17 15:20 +0200
      Re: a little parsing challenge ☺ mhenn <michihenn@hotmail.com> - 2011-07-17 15:55 +0200
        Re: a little parsing challenge ☺ Robert Klemme <shortcutter@googlemail.com> - 2011-07-17 18:01 +0200
          Re: a little parsing challenge ☺ Robert Klemme <shortcutter@googlemail.com> - 2011-07-17 18:54 +0200
    Re: a little parsing challenge ☺ Thomas Boell <tboell@domain.invalid> - 2011-07-17 17:49 +0200
      Re: a little parsing challenge ☺ Raymond Hettinger <python@rcn.com> - 2011-07-17 12:16 -0700
    Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-18 07:39 -0700
      Re: a little parsing challenge ☺ Robert Klemme <shortcutter@googlemail.com> - 2011-07-20 08:23 +0200
      Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-20 03:31 -0700
        Re: a little parsing challenge ☺ "Uri Guttman" <uri@StemSystems.com> - 2011-07-20 12:31 -0400
          Re: a little parsing challenge ☺ rusi <rustompmody@gmail.com> - 2011-07-20 10:30 -0700
          Re: a little parsing challenge ☺ merlyn@stonehenge.com (Randal L. Schwartz) - 2011-07-20 12:06 -0700
            Re: a little parsing challenge ☺ Jason Earl <jearl@notengoamigos.org> - 2011-07-20 14:57 -0600
    Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-19 09:54 -0700
      Re: a little parsing challenge ☺ Thomas Jollans <t@jollybox.de> - 2011-07-19 20:07 +0200
        Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-21 05:58 -0700
          Re: a little parsing challenge ☺ Ian Kelly <ian.g.kelly@gmail.com> - 2011-07-21 08:26 -0600
            Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-21 08:36 -0700
              Re: a little parsing challenge ☺ python@bdurham.com - 2011-07-21 12:43 -0400
                Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-21 11:53 -0700
                Re: a little parsing challenge ☺ Terry Reedy <tjreedy@udel.edu> - 2011-07-21 18:37 -0400
          Re: a little parsing challenge ☺ John O'Hagan <research@johnohagan.com> - 2011-07-25 15:57 +1000
      Re: a little parsing challenge ☺ Ian Kelly <ian.g.kelly@gmail.com> - 2011-07-19 12:08 -0600
  Re: a little parsing challenge ☺ Chris Angelico <rosuav@gmail.com> - 2011-07-17 21:34 +1000
    Re: a little parsing challenge ☺ rusi <rustompmody@gmail.com> - 2011-07-17 04:52 -0700
    Re: a little parsing challenge ☺ Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2011-07-17 16:15 +0200
      Re: a little parsing challenge ☺ Raymond Hettinger <python@rcn.com> - 2011-07-17 12:18 -0700
        Re: a little parsing challenge ☺ Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2011-07-17 22:16 +0200
          Re: a little parsing challenge ☺ Thomas Jollans <t@jollybox.de> - 2011-07-17 22:57 +0200
      Re: a little parsing challenge ☺ Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2011-07-17 23:43 +0200
      Re: a little parsing challenge ☺ Rouslan Korneychuk <rouslank@msn.com> - 2011-07-18 03:09 -0400
        Re: a little parsing challenge ☺ Stefan Behnel <stefan_ml@behnel.de> - 2011-07-18 09:24 +0200
          Re: a little parsing challenge ☺ Rouslan Korneychuk <rouslank@msn.com> - 2011-07-18 04:04 -0400
        Re: a little parsing challenge ☺ Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2011-07-18 18:46 +0200
          Re: a little parsing challenge ☺ Rouslan Korneychuk <rouslank@msn.com> - 2011-07-18 14:14 -0400
        Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-21 06:23 -0700
          Re: a little parsing challenge ☺ Rouslan Korneychuk <rouslank@msn.com> - 2011-07-21 17:54 -0400
  Re: a little parsing challenge ☺ gene heskett <gheskett@wdtv.com> - 2011-07-17 10:26 -0400
  Re: a little parsing challenge ☺ Thomas Jollans <t@jollybox.de> - 2011-07-17 08:31 -0700
    Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-19 10:49 -0700
      Re: a little parsing challenge ☺ Thomas Jollans <t@jollybox.de> - 2011-07-19 20:14 +0200
        Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-21 05:29 -0700
          Re: a little parsing challenge ☺ Thomas Jollans <t@jollybox.de> - 2011-07-21 15:21 +0200
      Re: a little parsing challenge ☺ Thomas Jollans <t@jollybox.de> - 2011-07-19 20:17 +0200
  Re: a little parsing challenge ☺ rantingrick <rantingrick@gmail.com> - 2011-07-17 18:52 -0700
  Re: a little parsing challenge ☺ Billy Mays <81282ed9a88799d21e77957df2d84bd6514d9af6@myhashismyemail.com> - 2011-07-18 13:12 -0400
    Re: a little parsing challenge ☺ Ian Kelly <ian.g.kelly@gmail.com> - 2011-07-18 12:10 -0600
      Re: a little parsing challenge ☺ Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2011-07-18 23:59 +0200
        Re: a little parsing challenge ☺ Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2011-07-19 08:09 +0200
        Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-19 10:32 -0700
    Re: a little parsing challenge ☺ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-19 09:56 +1000
      Re: a little parsing challenge ☺ Billy Mays <noway@nohow.com> - 2011-07-18 22:07 -0400
        Re: a little parsing challenge ☺ rusi <rustompmody@gmail.com> - 2011-07-18 19:50 -0700
          Re: a little parsing challenge ☺ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-19 13:11 +1000
            Re: a little parsing challenge ☺ rusi <rustompmody@gmail.com> - 2011-07-18 21:59 -0700
              Re: a little parsing challenge ☺ Chris Angelico <rosuav@gmail.com> - 2011-07-19 15:36 +1000
        Re: a little parsing challenge ☺ MRAB <python@mrabarnett.plus.com> - 2011-07-19 04:08 +0100
        Re: a little parsing challenge ☺ Benjamin Kaplan <benjamin.kaplan@case.edu> - 2011-07-18 20:54 -0700
        Re: a little parsing challenge ☺ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-19 14:30 +1000
        Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-19 01:58 -0700
    Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-19 10:14 -0700
      Re: a little parsing challenge ☺ Billy Mays <81282ed9a88799d21e77957df2d84bd6514d9af6@myhashismyemail.com> - 2011-07-19 13:33 -0400
        Re: a little parsing challenge ☺ Xah Lee <xahlee@gmail.com> - 2011-07-19 11:12 -0700
          Re: a little parsing challenge ☺ Terry Reedy <tjreedy@udel.edu> - 2011-07-19 15:09 -0400
            Re: a little parsing challenge ☺ jmfauth <wxjmfauth@gmail.com> - 2011-07-19 23:29 -0700
              Re: a little parsing challenge ☺ Ian Kelly <ian.g.kelly@gmail.com> - 2011-07-20 01:29 -0600
                Re: a little parsing challenge ☺ jmfauth <wxjmfauth@gmail.com> - 2011-07-20 00:54 -0700
                Re: a little parsing challenge ☺ Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2011-07-20 18:18 +1000
  Re: a little parsing challenge ? sln@netherlands.com - 2011-07-18 12:34 -0700
  Re: a little parsing challenge ☺ Mark Tarver <dr.mtarver@gmail.com> - 2011-07-19 22:43 -0700

csiph-web