Powerful perl paradigm I don't find in python

Started by	"Charles T. Smith" <cts.private.yahoo@gmail.com>
First post	2016-01-15 09:24 +0000
Last post	2016-01-15 11:54 -0500
Articles	13 — 7 participants

  Powerful perl paradigm I don't find in python "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-01-15 09:24 +0000
    Re: Powerful perl paradigm I don't find in python Peter Otten <__peter__@web.de> - 2016-01-15 10:43 +0100
      Re: Powerful perl paradigm I don't find in python Michael Vilain <vilain@NOspamcop.net> - 2016-01-15 02:20 -0800
    Re: Powerful perl paradigm I don't find in python Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2016-01-15 11:42 +0100
      Re: Powerful perl paradigm I don't find in python "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-01-15 11:04 +0000
        Re: Powerful perl paradigm I don't find in python "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-01-15 11:06 +0000
        Re: Powerful perl paradigm I don't find in python Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2016-01-15 14:20 +0100
          Re: Powerful perl paradigm I don't find in python "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-01-18 13:05 +0000
            Re: Powerful perl paradigm I don't find in python Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de> - 2016-01-18 14:33 +0100
        Re: Powerful perl paradigm I don't find in python Peter Otten <__peter__@web.de> - 2016-01-15 14:34 +0100
    Re: Powerful perl paradigm I don't find in python Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2016-01-15 13:51 +0000
      Re: Powerful perl paradigm I don't find in python me <self@example.org> - 2016-01-15 15:20 +0000
    Re: Powerful perl paradigm I don't find in python Nathan Hilterbrand <nhilterbrand@gmail.com> - 2016-01-15 11:54 -0500

#101736 — Powerful perl paradigm I don't find in python

From	"Charles T. Smith" <cts.private.yahoo@gmail.com>
Date	2016-01-15 09:24 +0000
Subject	Powerful perl paradigm I don't find in python
Message-ID	<n7adse$k6$1@dont-email.me>

while ($str != $tail) {
    $str ~= s/^(head-pattern)//;
    use ($1);
}

#101737

From	Peter Otten <__peter__@web.de>
Date	2016-01-15 10:43 +0100
Message-ID	<mailman.2.1452851041.15297.python-list@python.org>
In reply to	#101736

Charles T. Smith wrote:

> while ($str != $tail) {
>     $str ~= s/^(head-pattern)//;
>     use ($1);
> }

For those whose Perl's a little rusty: what does this do?
A self-contained example might also be useful...

#101739

From	Michael Vilain <vilain@NOspamcop.net>
Date	2016-01-15 02:20 -0800
Message-ID	<vilain-D8DAD1.02204215012016@news.individual.net>
In reply to	#101737

In article <mailman.2.1452851041.15297.python-list@python.org>,
 Peter Otten <__peter__@web.de> wrote:

> Charles T. Smith wrote:
> 
> > while ($str != $tail) {
> >     $str ~= s/^(head-pattern)//;
> >     use ($1);
> > }
> 
> For those whose Perl's a little rusty: what does this do?
> A self-contained example might also be useful...

It does a string substitution of a line that contains "head-pattern".  
That looks like an error since who would search for a line starting with 
"head-pattern"?  The () is meant to encapsulate a substitution pattern 
to be used in the replacement string but they aren't doing that.

use ($1) would include whatever module is passed as the first argument.

This fragment is incomplete and makes no sense.  The OP will have to 
explain what they want because this is complete crap.

-- 
DeeDee, don't press that button!  DeeDee!  NO!  Dee...
[I filter all Goggle Groups posts, so any reply may be automatically ignored]

#101741

From	Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de>
Date	2016-01-15 11:42 +0100
Message-ID	<mailman.5.1452854573.15297.python-list@python.org>
In reply to	#101736

On 15.01.2016 10:43, Peter Otten wrote:
> Charles T. Smith wrote:
>
>> while ($str != $tail) {
>>      $str ~= s/^(head-pattern)//;
>>      use ($1);
>> }
>
> For those whose Perl's a little rusty: what does this do?
> A self-contained example might also be useful...
>

Right, an explanation would certainly get you a lot more responses.

If I'm guessing correctly what the snippet is supposed to do (and, yes, 
my Perl definitely is rusty), isn't the Python equivalent of the regex 
part of your question fairly obvious if you're using the re module:

things = []
while some_str != tail:
     m = re.match(pattern_str, some_str)
     things.append(some_str[:m.end()])
     some_str = some_str[m.end():]

# do something with things

I have no idea why you'd want to *import* all the things parsed out of 
some_str, but for this part you may look at importlib.import_module.

P.S.: the while loop above never ends if tail is not in some_str, but I 
guess your Perl snippet has the same problem?
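
On the importlib.import_module pointer: if the parsed-out things really were module names, a minimal sketch could look like the one below. The list of names is purely an assumption for illustration (two standard-library modules stand in for whatever the regex would yield).

```python
import importlib

# Hypothetical result of the parsing step: a list of module names.
names = ["json", "math"]

# import_module takes the dotted module name as a string and
# returns the imported module object.
modules = {name: importlib.import_module(name) for name in names}

print(modules["math"].sqrt(16.0))  # 4.0
```

Keeping the modules in a dict avoids injecting dynamically chosen names into the global namespace.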

#101742

From	"Charles T. Smith" <cts.private.yahoo@gmail.com>
Date	2016-01-15 11:04 +0000
Message-ID	<n7ajo0$k6v$1@dont-email.me>
In reply to	#101741

On Fri, 15 Jan 2016 11:42:24 +0100, Wolfgang Maier wrote:

> On 15.01.2016 10:43, Peter Otten wrote:
>> Charles T. Smith wrote:
>>
>>> while ($str != $tail) {
>>>      $str ~= s/^(head-pattern)//;
>>>      use ($1);
>>> }
>>
....
> 
> things = []
> while some_str != tail:
>      m = re.match(pattern_str, some_str)
>      things.append(some_str[:m.end()])
>      some_str = some_str[m.end():]
> 
> # do something with things

Okay, I guess it's not a lot more work to use the end() method to manually
cut out the found portion.

What the original snippet does is parse *and consume* a string - actually,
to avoid maintaining a cursor to traverse the string.  The perl feature is that
substitute allows the found pattern to be replaced, but retains the group
after the expression is complete.

The end() method is actually such a cursor, but already set up for you
by the class, and then the slicing considerably simplifies its use.

The point is, it would have been easy for python to offer the same
capability, somehow, but that was apparently overlooked.  For example,
by storing string state in the match object and having a method without a
string parameter.
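
Something along these lines can be sketched in a few lines of plain Python. The class name Consumer, its take() method, and the sample pattern are all made up for illustration; nothing like this exists in the re module itself.

```python
import re


class Consumer:
    """Hypothetical helper that holds the not-yet-parsed rest of a string."""

    def __init__(self, text):
        self.rest = text

    def take(self, pattern):
        """Strip a match of `pattern` from the front of `rest`.

        Returns the match object (so groups stay available), or None
        if the front of the remaining text does not match.
        """
        m = re.match(pattern, self.rest)
        if m is not None:
            self.rest = self.rest[m.end():]
        return m


c = Consumer("foo 12 bar 34 baz")
found = []
while c.rest != "baz":
    m = c.take(r"([a-z]+ \d+)\s*")
    if m is None:  # bail out instead of looping forever
        break
    found.append(m.group(1))

print(found)  # ['foo 12', 'bar 34']
```

The match object plays the role of $1 here: the head is removed from the stored string, but the captured group is still available afterwards.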

#101743

From	"Charles T. Smith" <cts.private.yahoo@gmail.com>
Date	2016-01-15 11:06 +0000
Message-ID	<n7ajrd$k6v$2@dont-email.me>
In reply to	#101742

On Fri, 15 Jan 2016 11:04:32 +0000, Charles T. Smith wrote:

> capability, somehow, but that was apparently overlooked.  For example,
> by storing string state in the match object and having a *sub* method without
> a string parameter.

#101751

From	Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de>
Date	2016-01-15 14:20 +0100
Message-ID	<mailman.11.1452864029.15297.python-list@python.org>
In reply to	#101742

On 15.01.2016 12:04, Charles T. Smith wrote:
> On Fri, 15 Jan 2016 11:42:24 +0100, Wolfgang Maier wrote:
>
>> On 15.01.2016 10:43, Peter Otten wrote:
>>> Charles T. Smith wrote:
>>>
>>>> while ($str != $tail) {
>>>>       $str ~= s/^(head-pattern)//;
>>>>       use ($1);
>>>> }
>>>
> ....
>>
>> things = []
>> while some_str != tail:
>>       m = re.match(pattern_str, some_str)
>>       things.append(some_str[:m.end()])
>>       some_str = some_str[m.end():]
>>
>> # do something with things
>
>
> Okay, I guess it's not a lot more work to use the end() method to manually
> cut out the found portion.
>
> What the original snippet does is parse *and consume* a string - actually,
> to avoid maintaining a cursor to traverse the string.  The perl feature is that
> substitute allows the found pattern to be replaced, but retains the group
> after the expression is complete.
>
> The end() method is actually such a cursor, but already set up for you
> by the class, and then the slicing considerably simplifies its use.
>

I see. If consuming the string is not essential for you, but just a 
handy trick to avoid the cursor, you may prefer this (most likely 
faster) solution:

pattern = re.compile(pattern_str)
try:
     matches = pattern.findall(some_str, endpos=some_str.index(tail))
except ValueError:
     # do something if tail is not found
     pass

Best,
Wolfgang

#101873

From	"Charles T. Smith" <cts.private.yahoo@gmail.com>
Date	2016-01-18 13:05 +0000
Message-ID	<n7inve$5ut$1@dont-email.me>
In reply to	#101751

On Fri, 15 Jan 2016 14:20:17 +0100, Wolfgang Maier wrote:

> pattern = re.compile(pattern_str)
> try:
>      matches = pattern.findall(some_str, endpos=some_str.index(tail))
> except ValueError:
>      # do something if tail is not found
>      pass

Oh!  I think that's it!


matches = findall (pattern, string)
for file in matches:
    use (file)

Totally cool!  Thank you.

#101874

From	Wolfgang Maier <wolfgang.maier@biologie.uni-freiburg.de>
Date	2016-01-18 14:33 +0100
Message-ID	<mailman.87.1453124045.15297.python-list@python.org>
In reply to	#101873

On 1/18/2016 14:05, Charles T. Smith wrote:
> On Fri, 15 Jan 2016 14:20:17 +0100, Wolfgang Maier wrote:
>
>> pattern = re.compile(pattern_str)
>> try:
>>       matches = pattern.findall(some_str, endpos=some_str.index(tail))
>> except ValueError:
>>       # do something if tail is not found
>>       pass
>
> Oh!  I think that's it!
>
>
> matches = findall (pattern, string)
> for file in matches:
>      use (file)
>
> Totally cool!  Thank you.
>

Great if it helps you. Just beware that this simplified version is not 
exactly equivalent to your initial perl snippet:

Generally, findall will find ALL occurrences of pattern, not just 
adjacent ones.

Since your perl example would never terminate if something non-matching 
is interleaved with pattern matches I figured you never expect that case.
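
To make the difference concrete, here is a small sketch; the pattern and the sample string are invented for the example. findall skips over the non-matching "--", while anchored matching from a moving position stops at the first gap.

```python
import re

pattern = re.compile(r"F..")
text = "FeeFie--FooFum"

# findall scans the whole string and skips whatever does not match.
everything = pattern.findall(text)

# Pattern.match(text, pos) only succeeds if a match starts exactly
# at pos, so this loop collects adjacent matches only.
adjacent = []
pos = 0
while True:
    m = pattern.match(text, pos)
    if m is None:
        break
    adjacent.append(m.group())
    pos = m.end()

print(everything)  # ['Fee', 'Fie', 'Foo', 'Fum']
print(adjacent)    # ['Fee', 'Fie']
```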

#101752

From	Peter Otten <__peter__@web.de>
Date	2016-01-15 14:34 +0100
Message-ID	<mailman.12.1452864897.15297.python-list@python.org>
In reply to	#101742

Charles T. Smith wrote:

> What the original snippet does is parse *and consume* a string - actually,
> to avoid maintaining a cursor to traverse the string.  The perl feature is
> that substitute allows the found pattern to be replaced, but retains the
> group after the expression is complete.

That is too technical for my taste. When is your "paradigm" more useful than 
a simple

re.finditer(), re.findall(), or re.split()

? 

>> things = []
>> while some_str != tail:
>>      m = re.match(pattern_str, some_str)
>>      things.append(some_str[:m.end()])
>>      some_str = some_str[m.end():]
 
If that were common (or even ever occurred) I'd write a helper which avoids 
the brittle some_str != tail comparison and exposes the functionality in a 
for loop:

import re


class MissingTailError(ValueError):
    pass


class UnparsedRestError(ValueError):
    pass


def shave_off(regex, text, tail=None):
    """
    >>> for s in shave_off(r"[a-z]+ \\d+\\s*",
    ...        "foo 12 bar 34 baz", tail="baz"):
    ...     s
    'foo 12 '
    'bar 34 '
    """
    if tail is not None:
        if text.endswith(tail):
            end = len(text) - len(tail)
        else:
            raise MissingTailError("%r does not end with %r" % (text, tail))
    else:
        end = len(text)

    start = 0
    r = re.compile(regex)
    while start != end:
        m = r.match(text, start, end)
        if m is None:
            raise UnparsedRestError(
                "%r does not match pattern %r"
                % (text[start:end], r.pattern))
        yield text[m.start():m.end()]
        start = m.end()

#101754

From	Ulli Horlacher <framstag@rus.uni-stuttgart.de>
Date	2016-01-15 13:51 +0000
Message-ID	<n7athj$593$1@news2.informatik.uni-stuttgart.de>
In reply to	#101736

Charles T. Smith <cts.private.yahoo@gmail.com> wrote:
> while ($str != $tail) {
>     $str ~= s/^(head-pattern)//;
>     use ($1);
> }

use() is illegal syntax in Perl.


-- 
Ullrich Horlacher              Server und Virtualisierung
Rechenzentrum IZUS/TIK         E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart         Tel:    ++49-711-68565868
Allmandring 30a                Fax:    ++49-711-682357
70550 Stuttgart (Germany)      WWW:    http://www.tik.uni-stuttgart.de/

#101757

From	me <self@example.org>
Date	2016-01-15 15:20 +0000
Message-ID	<n7b2o1$1ns2$1@gioia.aioe.org>
In reply to	#101754

On 2016-01-15, Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:
> Charles T. Smith <cts.private.yahoo@gmail.com> wrote:
>> while ($str != $tail) {
>>     $str ~= s/^(head-pattern)//;
>>     use ($1);
>> }
>
> use() is illegal syntax in Perl.

Actually it is not. OP is definitely thinking of `use` as a placeholder for
some general use of the value $1.

In fact, according to the documentation of perl,

    use Module LIST

is equivalent to

    BEGIN{ require Module; Module->import(LIST); }

For the rusty perl users, the code in `BEGIN` blocks is executed "as soon
as possible", that is before the remaining part of the code, and in order
of definition.

The idea is that you want to import all modules before running the code.

#101766

From	Nathan Hilterbrand <nhilterbrand@gmail.com>
Date	2016-01-15 11:54 -0500
Message-ID	<mailman.23.1452876885.15297.python-list@python.org>
In reply to	#101736


On 01/15/2016 04:24 AM, Charles T. Smith wrote:
> while ($str != $tail) {
>      $str ~= s/^(head-pattern)//;
>      use ($1);
> }

IDK...  maybe the OP is looking for something like this? :

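import re

def do_something(matchobj):
    # Called by re.sub for the match; returning "" deletes it from the string.
    print("I found {}".format(matchobj.group(0)))
    return ""

tail = "END"
text = "FeeFieFooFumEND"  # named text rather than str to keep the builtin usable
pattern = r"F.."

while text and text != tail:
    oldtext = text
    text = re.sub(pattern, do_something, text, count=1)
    if text == oldtext:  # nothing was removed, so stop
        break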


Though I would probably change the perl code, too:

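while ($str and $str ne $tail) {
     # s/// returns true only if it replaced something; $1 alone would
     # still hold the capture from the previous successful match.
     if ($str =~ s/^(head-pattern)//) {
        do_something($1);
     } else {
        last;
     }
}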

Otherwise there is too much risk of an infinite loop if the string is 
(1) empty, or (2) never ends up being equal to "tail"

Nathan
