Newsgroup: comp.lang.python (thread #105098)

sobering observation, python vs. perl

Started by: "Charles T. Smith" <cts.private.yahoo@gmail.com>
First post: 2016-03-17 15:29 +0000
Last post: 2016-03-18 22:47 +1100
Articles: 20 on this page of 43, 14 participants



Contents

  sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 15:29 +0000
    Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 15:40 +0000
      Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 17:48 +0200
        Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 15:59 +0000
          Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 18:07 +0200
            Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:15 +0000
    Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 17:47 +0200
      Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:06 +0000
        Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 18:30 +0200
          Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:32 +0000
    Re: sobering observation, python vs. perl srinivas devaki <mr.eightnoteight@gmail.com> - 2016-03-17 21:18 +0530
      Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:15 +0000
    Re: sobering observation, python vs. perl Tim Chase <python.list@tim.thechases.com> - 2016-03-17 10:52 -0500
      Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:08 +0000
        Re: sobering observation, python vs. perl Ethan Furman <ethan@stoneleaf.us> - 2016-03-17 09:21 -0700
          Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:36 +0000
            Re: sobering observation, python vs. perl Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-17 17:09 +0000
            Re: sobering observation, python vs. perl Ethan Furman <ethan@stoneleaf.us> - 2016-03-17 10:26 -0700
              Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 17:35 +0000
                Re: sobering observation, python vs. perl Ethan Furman <ethan@stoneleaf.us> - 2016-03-17 11:21 -0700
            DSLs in perl and python (Was sobering observation) Rustom Mody <rustompmody@gmail.com> - 2016-03-17 10:47 -0700
              Re: DSLs in perl and python (Was sobering observation) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2016-03-17 22:22 +0000
              Re: DSLs in perl and python (Was sobering observation) MRAB <python@mrabarnett.plus.com> - 2016-03-17 22:43 +0000
                Re: DSLs in perl and python (Was sobering observation) Rustom Mody <rustompmody@gmail.com> - 2016-03-18 05:57 -0700
                  Re: DSLs in perl and python (Was sobering observation) Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-03-18 15:18 +0200
                  Re: DSLs in perl and python (Was sobering observation) Peter Otten <__peter__@web.de> - 2016-03-18 14:22 +0100
                    Re: DSLs in perl and python (Was sobering observation) Rustom Mody <rustompmody@gmail.com> - 2016-03-18 19:07 -0700
                      DSL design (was DSLs in perl and python) Rustom Mody <rustompmody@gmail.com> - 2016-03-29 06:28 -0700
                        Re: DSL design (was DSLs in perl and python) Chris Angelico <rosuav@gmail.com> - 2016-03-30 00:41 +1100
                        Re: DSL design (was DSLs in perl and python) Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-03-29 16:45 +0300
                        Finding methods, was Re: DSL design (was DSLs in perl and python) Peter Otten <__peter__@web.de> - 2016-03-29 15:51 +0200
        Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 18:34 +0200
          Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 16:42 +0000
            Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 19:08 +0200
              Re: sobering observation, python vs. perl "Charles T. Smith" <cts.private.yahoo@gmail.com> - 2016-03-17 17:25 +0000
                Re: sobering observation, python vs. perl BartC <bc@freeuk.com> - 2016-03-17 17:53 +0000
                  Re: sobering observation, python vs. perl Rustom Mody <rustompmody@gmail.com> - 2016-03-17 10:59 -0700
                  Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 20:53 +0200
                    Re: sobering observation, python vs. perl BartC <bc@freeuk.com> - 2016-03-17 19:06 +0000
                      Re: sobering observation, python vs. perl Marko Rauhamaa <marko@pacujo.net> - 2016-03-17 21:11 +0200
              Re: sobering observation, python vs. perl Ben Bacarisse <ben.usenet@bsb.me.uk> - 2016-03-17 20:47 +0000
        Re: sobering observation, python vs. perl Peter Otten <__peter__@web.de> - 2016-03-18 10:26 +0100
        Re: sobering observation, python vs. perl Steven D'Aprano <steve@pearwood.info> - 2016-03-18 22:47 +1100



#105098 — sobering observation, python vs. perl

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 15:29 +0000
Subject: sobering observation, python vs. perl
Message-ID: <nceihb$vpg$1@dont-email.me>

I've really learned to love working with python, but it's too soon
to pack perl away.  I was amazed at how long a simple file search took
so I ran some statistics:

    $ time python find-rel.py
    ./find-relreq *.out | sort -u
    TestCase_F_00_P
    TestCase_F_00_S
    TestCase_F_01_S
    TestCase_F_02_M

    real    1m4.581s
    user    1m4.412s
    sys     0m0.140s


    $ time python find-rel.py
    # modified to use precompiled REs:
    TestCase_F_00_P
    TestCase_F_00_S
    TestCase_F_01_S
    TestCase_F_02_M

    real    0m29.337s
    user    0m29.174s
    sys     0m0.100s


    $ time perl find-rel.pl
    find-relreq.pl *.out | sort -u
    TestCase_F_00_P
    TestCase_F_00_S
    TestCase_F_01_S
    TestCase_F_02_M

    real    0m5.009s
    user    0m4.932s
    sys     0m0.072s

Here's the programs:

#!/usr/bin/env python
# vim: tw=0
import sys
import re

isready = re.compile ("(.*) is ready")
relreq = re.compile (".*release_req")
for fn in sys.argv[1:]:                                 # logfile name
    tn = None
    with open (fn) as fd:
        for line in fd:
            #match = re.match ("(.*) is ready", line)
            match = isready.match (line)
            if match:
                tn = match.group(1)
            #match = re.match (".*release_req", line)
            match = relreq.match (line)
            if match:
                #print "%s: %s" % (tn, line),
                print tn

vs.

while (<>) {
    if (/(.*) is ready/) {
        $tn = $1;
    }
    elsif (/release_req/) {
        print "$tn\n";
    }
}

Look at those numbers:
1 minute for python without precompiled REs
1/2 minute with precompiled REs
5 seconds with perl.
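
One difference between the two programs is worth spelling out: Perl's `/release_req/` is an unanchored search, while Python's `re.match` only matches at the start of the line, which is why the Python pattern needed a leading `.*`. The following is a minimal sketch of a closer Python counterpart to the Perl loop, assuming Python 3 and using `search` plus `elif`. The function name `find_testcases` and the sample lines are made up for illustration, and no timing claim is implied.

```python
import re

# Unanchored patterns, searched the way Perl's /.../ operator searches.
IS_READY = re.compile(r"(.*) is ready")
REL_REQ = re.compile(r"release_req")


def find_testcases(lines):
    """Yield the most recent '<name> is ready' name for each release_req line."""
    tn = None
    for line in lines:
        m = IS_READY.search(line)
        if m:
            tn = m.group(1)
        elif REL_REQ.search(line):  # only checked when the first test fails
            yield tn


# Invented sample input, standing in for the real log files.
sample = [
    "TestCase_F_00_P is ready\n",
    "sending release_req\n",
    "TestCase_F_01_S is ready\n",
    "unrelated line\n",
    "got release_req again\n",
]
print(sorted(set(find_testcases(sample))))
```

With real log files, the `sample` list would be replaced by an open file object, since the function only needs an iterable of lines.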



#105099

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 15:40 +0000
Message-ID: <ncej5h$vpg$2@dont-email.me>
In reply to: #105098

On Thu, 17 Mar 2016 15:29:47 +0000, Charles T. Smith wrote:

And for completeness, and also surprising:

time sed -n -e '/ is ready/{s///;h}' -e '/release_req/{g;p}'  *.out | sort -u
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    0m10.998s
user    0m10.885s
sys     0m0.108s

Twice as long as perl...  I guess there's no excuse for sed anymore...



#105101

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2016-03-17 17:48 +0200
Message-ID: <878u1hrtp5.fsf@elektro.pacujo.net>
In reply to: #105099

"Charles T. Smith" <cts.private.yahoo@gmail.com>:

> On Thu, 17 Mar 2016 15:29:47 +0000, Charles T. Smith wrote:
>
> And for completeness, and also surprising:
>
> time sed -n -e '/ is ready/{s///;h}' -e '/release_req/{g;p}'  *.out | sort -u
> TestCase_F_00_P
> TestCase_F_00_S
> TestCase_F_01_S
> TestCase_F_02_M
>
> real    0m10.998s
> user    0m10.885s
> sys     0m0.108s
>
> Twice as long as perl...  I guess there's no excuse for sed anymore...

Try running the sed command again after setting:

export LANG=C


Marko



#105104

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 15:59 +0000
Message-ID: <ncek92$vpg$4@dont-email.me>
In reply to: #105101

On Thu, 17 Mar 2016 17:48:54 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith" <cts.private.yahoo@gmail.com>:
> 
>> On Thu, 17 Mar 2016 15:29:47 +0000, Charles T. Smith wrote:
>>
>> And for completeness, and also surprising:
>>
>> time sed -n -e '/ is ready/{s///;h}' -e '/release_req/{g;p}'  *.out | sort -u
>> TestCase_F_00_P
>> TestCase_F_00_S
>> TestCase_F_01_S
>> TestCase_F_02_M
>>
>> real    0m10.998s
>> user    0m10.885s
>> sys     0m0.108s
>>
>> Twice as long as perl...  I guess there's no excuse for sed anymore...
> 
> Try running the sed command again after setting:
> 
> export LANG=C
> 
> 
> Marko

Hmmm.  Interesting thought.  But...

$ locale
LANG=C
LANGUAGE=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE=C
LC_MONETARY="C"
LC_MESSAGES="C"
LC_PAPER="C"
LC_NAME="C"
LC_ADDRESS="C"
LC_TELEPHONE="C"
LC_MEASUREMENT="C"
LC_IDENTIFICATION="C"
LC_ALL=



#105106

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2016-03-17 18:07 +0200
Message-ID: <874mc5rsun.fsf@elektro.pacujo.net>
In reply to: #105104

"Charles T. Smith" <cts.private.yahoo@gmail.com>:

> On Thu, 17 Mar 2016 17:48:54 +0200, Marko Rauhamaa wrote:
>> Try running the sed command again after setting:
>> 
>> export LANG=C
>
> Hmmm.  Interesting thought.  But...
>
> $ locale
> LANG=C

Ok. The LANG=C setting has a tremendous effect on the performance of
textutils.


Marko



#105110

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 16:15 +0000
Message-ID: <ncel7q$vpg$8@dont-email.me>
In reply to: #105106

On Thu, 17 Mar 2016 18:07:12 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith" <cts.private.yahoo@gmail.com>:
> Ok. The LANG=C setting has a tremendous effect on the performance of
> textutils.
> 
> 
> Marko

Good to know, thank you...



#105100

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2016-03-17 17:47 +0200
Message-ID: <87d1qtrtqs.fsf@elektro.pacujo.net>
In reply to: #105098

"Charles T. Smith" <cts.private.yahoo@gmail.com>:

> Here's the programs:
>
> #!/usr/bin/env python
> # vim: tw=0
> import sys
> import re
>
> isready = re.compile ("(.*) is ready")
> relreq = re.compile (".*release_req")
> for fn in sys.argv[1:]:                                 # logfile name
>     tn = None
>     with open (fn) as fd:
>         for line in fd:
>             #match = re.match ("(.*) is ready", line)
>             match = isready.match (line)
>             if match:
>                 tn = match.group(1)
>             #match = re.match (".*release_req", line)
>             match = relreq.match (line)
>             if match:
>                 #print "%s: %s" % (tn, line),
>                 print tn
>
> vs.
>
> while (<>) {
>     if (/(.*) is ready/) {
>         $tn = $1;
>     }
>     elsif (/release_req/) {
>         print "$tn\n";
>     }
> }
>
> Look at those numbers:
> 1 minute for python without precompiled REs
> 1/2 minute with precompiled REs
> 5 seconds with perl.

Can't comment on the numbers but the code segments are not quite
analogous. What about this one:

    #!/usr/bin/env python
    # vim: tw=0
    import sys
    import re

    isready = re.compile("(.*) is ready")
    for fn in sys.argv[1:]:
        tn = None
        with open(fn) as fd:
            for line in fd:
                match = isready.match(line)
                if match:
                    tn = match.group(1)
                elif "release_req" in line:
                    print tn


Marko



#105107

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 16:06 +0000
Message-ID: <nceklg$vpg$5@dont-email.me>
In reply to: #105100

On Thu, 17 Mar 2016 17:47:55 +0200, Marko Rauhamaa wrote:


> Can't comment on the numbers but the code segments are not quite
> analogous. What about this one:
> 
>     #!/usr/bin/env python
>     # vim: tw=0
>     import sys
>     import re
> 
>     isready = re.compile("(.*) is ready")
>     for fn in sys.argv[1:]:
>         tn = None
>         with open(fn) as fd:
>             for line in fd:
>                 match = isready.match(line)
>                 if match:
>                     tn = match.group(1)
>                 elif "release_req" in line:
>                     print tn
> 
> 
> Marko


I need the second check to also be a RE because it's not
separate tokens.  How about this change:

            match = isready.match (line)
            if match:
                tn = match.group(1)
                continue        # <-- added

            match = relreq.match (line)
            if match:
                print tn

real    0m28.737s
user    0m28.538s
sys     0m0.128s

Shaved about 0.6 seconds off.



#105113

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2016-03-17 18:30 +0200
Message-ID: <87y49hqd7e.fsf@elektro.pacujo.net>
In reply to: #105107

"Charles T. Smith" <cts.private.yahoo@gmail.com>:

> I need the second check to also be a RE because it's not
> separate tokens.

The string "in" check doesn't care about tokens.
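
To illustrate the point: `in` on strings is a plain substring test, so the marker is found even when it is glued to other characters. This is a small sketch assuming Python 3. The helper name `has_release_req` and the sample lines are invented for illustration.

```python
def has_release_req(line):
    """Return True if 'release_req' occurs anywhere in the line."""
    return "release_req" in line


# Invented sample lines: the marker is embedded in a longer identifier.
embedded = "12:00:01 xx:do_release_req(42) pending\n"
different = "12:00:02 release-req seen\n"

print(has_release_req(embedded))   # substring present inside a longer token
print(has_release_req(different))  # hyphen instead of underscore, no match
```

A regular expression only becomes necessary once the text being looked for can vary, for example in case or punctuation.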


Marko



#105116

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 16:32 +0000
Message-ID: <ncem69$eav$2@dont-email.me>
In reply to: #105113

On Thu, 17 Mar 2016 18:30:29 +0200, Marko Rauhamaa wrote:

> "Charles T. Smith" <cts.private.yahoo@gmail.com>:
> 
>> I need the second check to also be a RE because it's not
>> separate tokens.
> 
> The string "in" check doesn't care about tokens.
> 
> 
> Marko


Ah, yes.  Okay.



#105102

From: srinivas devaki <mr.eightnoteight@gmail.com>
Date: 2016-03-17 21:18 +0530
Message-ID: <mailman.273.1458229771.12893.python-list@python.org>
In reply to: #105098

Please upload the log file.

Also, global variables in Python are slow, so put all of that in a
function and try again. I generally get a 20-30% time improvement by
doing that.
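
The usual explanation is that CPython reaches local variables through indexed slots, while module-level names go through a dictionary lookup on each access. Here is a rough way to check this on your own machine, as a sketch assuming Python 3. The function names and the loop size `N` are arbitrary choices for illustration, and no timings are shown because they vary by machine and interpreter version.

```python
import timeit

N = 100_000
total = 0  # module-level (global) accumulator


def count_with_globals():
    """Accumulate into a global name: dictionary lookups on each access."""
    global total
    total = 0
    for i in range(N):
        total += i
    return total


def count_with_locals():
    """Accumulate into a local name: fast slot access."""
    subtotal = 0
    for i in range(N):
        subtotal += i
    return subtotal


if __name__ == "__main__":
    for func in (count_with_globals, count_with_locals):
        seconds = timeit.timeit(func, number=10)
        print(f"{func.__name__}: {seconds:.3f}s")
```

Both functions compute the same sum. Only the kind of name used inside the loop differs, so any gap in the printed times comes from variable access.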


On Thu, Mar 17, 2016 at 8:59 PM, Charles T. Smith
<cts.private.yahoo@gmail.com> wrote:
> I've really learned to love working with python, but it's too soon
> to pack perl away.  I was amazed at how long a simple file search took
> so I ran some statistics:
>
>     $ time python find-rel.py
>     ./find-relreq *.out | sort -u
>     TestCase_F_00_P
>     TestCase_F_00_S
>     TestCase_F_01_S
>     TestCase_F_02_M
>
>     real    1m4.581s
>     user    1m4.412s
>     sys     0m0.140s
>
>
>     $ time python find-rel.py
>     # modified to use precompiled REs:
>     TestCase_F_00_P
>     TestCase_F_00_S
>     TestCase_F_01_S
>     TestCase_F_02_M
>
>     real    0m29.337s
>     user    0m29.174s
>     sys     0m0.100s
>
>
>     $ time perl find-rel.pl
>     find-relreq.pl *.out | sort -u
>     TestCase_F_00_P
>     TestCase_F_00_S
>     TestCase_F_01_S
>     TestCase_F_02_M
>
>     real    0m5.009s
>     user    0m4.932s
>     sys     0m0.072s
>
> Here's the programs:
>
> #!/usr/bin/env python
> # vim: tw=0
> import sys
> import re
>
> isready = re.compile ("(.*) is ready")
> relreq = re.compile (".*release_req")
> for fn in sys.argv[1:]:                                 # logfile name
>     tn = None
>     with open (fn) as fd:
>         for line in fd:
>             #match = re.match ("(.*) is ready", line)
>             match = isready.match (line)
>             if match:
>                 tn = match.group(1)
>             #match = re.match (".*release_req", line)
>             match = relreq.match (line)
>             if match:
>                 #print "%s: %s" % (tn, line),
>                 print tn
>
> vs.
>
> while (<>) {
>     if (/(.*) is ready/) {
>         $tn = $1;
>     }
>     elsif (/release_req/) {
>         print "$tn\n";
>     }
> }
>
> Look at those numbers:
> 1 minute for python without precompiled REs
> 1/2 minute with precompiled REs
> 5 seconds with perl.
> --
> https://mail.python.org/mailman/listinfo/python-list



-- 
Regards
Srinivas Devaki
Junior (3rd yr) student at Indian School of Mines,(IIT Dhanbad)
Computer Science and Engineering Department
ph: +91 9491 383 249
telegram_id: @eightnoteight



#105109

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 16:15 +0000
Message-ID: <ncel67$vpg$7@dont-email.me>
In reply to: #105102

On Thu, 17 Mar 2016 21:18:43 +0530, srinivas devaki wrote:

> please upload the log file,


Sorry, it's work stuff, can't do that, but just take any big set of files
and change the strings appropriately and the numbers should be equivalent.


> 
> and global variables in python are slow, so just keep all that in a
> function and try again. generally i get 20-30% time improvement by
> doin that.

#!/usr/bin/env python
# vim: tw=0
import sys
import re

def faster ():
    isready = re.compile ("(.*) is ready")
    relreq = re.compile (".*release_req")
    for fn in sys.argv[1:]:                                 # logfile name
        tn = None
        with open (fn) as fd:
            for line in fd:
                #match = re.match ("(.*) is ready", line)
                match = isready.match (line)
                if match:
                    tn = match.group(1)
                    continue
                #match = re.match (".*release_req", line)
                match = relreq.match (line)
                if match:
                    #print "%s: %s" % (tn, line),
                    print tn

faster()

$ time python ./find-relreq *.out | sort -u
TestCase_F_00_P
TestCase_F_00_S
TestCase_F_01_S
TestCase_F_02_M

real    0m25.515s
user    0m25.294s
sys     0m0.136s

3 more seconds!



#105103

From: Tim Chase <python.list@tim.thechases.com>
Date: 2016-03-17 10:52 -0500
Message-ID: <mailman.274.1458230151.12893.python-list@python.org>
In reply to: #105098

On 2016-03-17 15:29, Charles T. Smith wrote:
> isready = re.compile ("(.*) is ready")
> relreq = re.compile (".*release_req")
> for fn in sys.argv[1:]:                                 # logfile name
>     tn = None
>     with open (fn) as fd:
>         for line in fd:
>             #match = re.match ("(.*) is ready", line)
>             match = isready.match (line)
>             if match:
>                 tn = match.group(1)
>             #match = re.match (".*release_req", line)
>             match = relreq.match (line)
>             if match:

Note that this "match" and "if" get executed for every line

>                 #print "%s: %s" % (tn, line),
>                 print tn
> 
> vs.
> 
> while (<>) {
>     if (/(.*) is ready/) {
>         $tn = $1;
>     }
>     elsif (/release_req/) {

Note this else ^

>         print "$tn\n";
>     }
> }

Also, you might just test for string-presence on that second one

So what happens if your code looks something like

isready = re.compile ("(.*) is ready")
for fn in sys.argv[1:]: # logfile name
    tn = None
    with open (fn) as fd:
        for line in fd:
            match = isready.match (line)
            if match:
                tn = match.group(1)
            elif "release_req" in line:
                print tn

Not saying this will make a great deal of difference, but these two
items jumped out at me.  I'd even be tempted to just use string
manipulations for the isready aspect as well.  Something like
(untested)

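import sys

IS_READY = " is ready"
REL_REQ = "release_req"
for fn in sys.argv[1:]:  # logfile name
  tn = None
  with open(fn) as fd:
    for line in fd:
      try:
        # rindex finds the last " is ready", like the greedy (.*) does
        index = line.rindex(IS_READY)
      except ValueError:
        # no " is ready" on this line, so check for the request marker
        if REL_REQ in line:
          print tn
      else:
        tn = line[:index]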

-tkc




#105108

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 16:08 +0000
Message-ID: <ncekqc$vpg$6@dont-email.me>
In reply to: #105103

On Thu, 17 Mar 2016 10:52:30 -0500, Tim Chase wrote:

> Not saying this will make a great deal of difference, but these two
> items jumped out at me.  I'd even be tempted to just use string
> manipulations for the isready aspect as well.  Something like
> (untested)

well, I don't want to forgo REs in order to have python's numbers be better....



#105112

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2016-03-17 09:21 -0700
Message-ID: <mailman.276.1458231695.12893.python-list@python.org>
In reply to: #105108

On 03/17/2016 09:08 AM, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 10:52:30 -0500, Tim Chase wrote:

>> Not saying this will make a great deal of difference, but these two
>> items jumped out at me.  I'd even be tempted to just use string
>> manipulations for the isready aspect as well.  Something like
>> (untested)
>
> well, I don't want to forgo REs in order to have python's numbers be better....

The issue is not avoiding REs, but using Python's strengths and idioms.
Write the code in Python's style, get the same results, then compare
the times.

If you posted the data file and exact results the rest of us could try,
but as it is all we can do is offer ideas and you have to test them.

--
~Ethan~



#105117

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 16:36 +0000
Message-ID: <ncemdm$eav$3@dont-email.me>
In reply to: #105112

On Thu, 17 Mar 2016 09:21:51 -0700, Ethan Furman wrote:

>> well, I don't want to forgo REs in order to have python's numbers be better....
> 
> The issue is not avoiding REs, but using Python's strengths and idioms.
> Write the code in Python's style, get the same results, then compare
> the times.


Yes, your point was to forgo REs despite that they are useful.
I could have thought the search would have been better as:

    'release[-.:][Rr]eq'

or something else ... you're in a "defend python at all costs!" mode.
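
A pattern like this is the kind of case where a plain substring test is no longer enough. The following is a small sketch, assuming Python 3, of what that character-class pattern accepts and rejects. The sample strings are invented for illustration.

```python
import re

# One separator from the set - . : followed by "req" or "Req".
REL_REQ = re.compile(r"release[-.:][Rr]eq")

samples = ["release-req", "release.Req", "release:req", "release_req", "releasereq"]
for text in samples:
    print(text, bool(REL_REQ.search(text)))
```

Note that the underscore form used in the original script is not covered by this particular character class, so it would have to be added to the set if both spellings occur in the logs.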



#105120

From: Mark Lawrence <breamoreboy@yahoo.co.uk>
Date: 2016-03-17 17:09 +0000
Message-ID: <mailman.278.1458234612.12893.python-list@python.org>
In reply to: #105117

On 17/03/2016 16:36, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 09:21:51 -0700, Ethan Furman wrote:
>
>>> well, I don't want to forgo REs in order to have python's numbers be better....
>>
>> The issue is not avoiding REs, but using Python's strengths and idioms.
>> Write the code in Python's style, get the same results, then compare
>> the times.
>
>
> Yes, your point was to forgo REs despite that they are useful.
> I could have thought the search would have been better as:
>
>      'release[-.:][Rr]eq'
>
> or something else ... you're in a "defend python at all costs!" mode.
>

I believe it is more along the lines of "In Rome, do as the Romans".

-- 
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.

Mark Lawrence



#105124

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2016-03-17 10:26 -0700
Message-ID: <mailman.280.1458235557.12893.python-list@python.org>
In reply to: #105117

On 03/17/2016 09:36 AM, Charles T. Smith wrote:

> Yes, your point was to forgo REs despite that they are useful.
> I could have thought the search would have been better as:
>
>      'release[-.:][Rr]eq'
>
> or something else ... you're in a "defend python at all costs!" mode.

No, I'm in the "don't try to write <language X> in Python" mode, and 
"don't use 10lb sledge when 6oz hammer will do" mode:

--------------------------------------------------------
# using `in` and printing line as each is found
real	0m1.703s
user	0m0.184s
sys	0m0.260s

# using `in` and printing lines at the end
real	0m0.217s
user	0m0.112s
sys	0m0.068s

# using 're' and printing lines at the end
real	0m0.608s
user	0m0.516s
sys	0m0.060s
--------------------------------------------------------

As you can see, how you print has a huge impact.  Hopefully you also 
noticed that using `re` when `in` would do made the script 3 times slower.

--------------------------------------------------------
# using `in` code
import sys
found = []
for fn in sys.argv[1:]:
    with open(fn) as fh:
        for line in fh:
            if 'timezone' in line:
                found.append(line)
print ''.join(found)
--------------------------------------------------------
# using `re` code
import sys
import re
found = []
for fn in sys.argv[1:]:
    with open(fn) as fh:
        for line in fh:
            if re.search('timezone', line):
                found.append(line)
print ''.join(found)
--------------------------------------------------------

--
~Ethan~



#105126

From: "Charles T. Smith" <cts.private.yahoo@gmail.com>
Date: 2016-03-17 17:35 +0000
Message-ID: <nceps6$ore$3@dont-email.me>
In reply to: #105124

On Thu, 17 Mar 2016 10:26:12 -0700, Ethan Furman wrote:

> On 03/17/2016 09:36 AM, Charles T. Smith wrote:
> 
>> Yes, your point was to forgo REs despite that they are useful.
>> I could have thought the search would have been better as:
>>
>>      'release[-.:][Rr]eq'
>>
>> or something else ... you're in a "defend python at all costs!" mode.
> 
> No, I'm in the "don't try to write <language X> in Python" mode, and 
> "don't use 10lb sledge when 6oz hammer will do" mode:


Yes, fine.

I'd only like to add that the perl numbers might also improve
if the print in the loop were postponed.



#105134

From: Ethan Furman <ethan@stoneleaf.us>
Date: 2016-03-17 11:21 -0700
Message-ID: <mailman.282.1458238899.12893.python-list@python.org>
In reply to: #105126

On 03/17/2016 10:35 AM, Charles T. Smith wrote:
> On Thu, 17 Mar 2016 10:26:12 -0700, Ethan Furman wrote:
>
>> On 03/17/2016 09:36 AM, Charles T. Smith wrote:
>>
>>> Yes, your point was to forgo REs despite that they are useful.
>>> I could have thought the search would have been better as:
>>>
>>>       'release[-.:][Rr]eq'
>>>
>>> or something else ... you're in a "defend python at all costs!" mode.
>>
>> No, I'm in the "don't try to write <language X> in Python" mode, and
>> "don't use 10lb sledge when 6oz hammer will do" mode:
>
>
> Yes, fine.
>
> I'd only like to add that the perl numbers might also improve
> if the print in the loop were postponed.

Yup, might!  Try it and see.

--
~Ethan~


