RE Module Performance

Started by: Devyn Collier Johnson <devyncjohnson@gmail.com>
First post: 2013-07-11 19:44 -0400
Last post: 2013-07-18 13:17 -0700
Articles: 20 on this page of 136; 25 participants


Contents

  RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-11 19:44 -0400
    Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-12 02:23 -0700
      Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-12 19:27 +1000
      Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-12 10:39 +0100
      Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-12 19:40 +1000
      Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-12 06:45 -0400
      Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-12 16:59 +0100
      Re: RE Module Performance Peter Otten <__peter__@web.de> - 2013-07-12 18:15 +0200
      Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-13 02:21 +1000
      Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-12 13:58 -0400
        Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 05:37 +0000
          Re: RE Module Performance 88888 Dihedral <dihedral88888@gmail.com> - 2013-07-14 11:17 -0700
            Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-15 06:06 -0400
              Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-15 12:36 +0000
                Dihedral Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-15 08:52 -0400
                Re: Dihedral Joel Goldstick <joel.goldstick@gmail.com> - 2013-07-15 09:03 -0400
                Re: Dihedral Wayne Werner <wayne@waynewerner.com> - 2013-07-15 17:43 -0500
                Re: Dihedral Fábio Santos <fabiosantosart@gmail.com> - 2013-07-15 23:54 +0100
                Re: Dihedral Chris Angelico <rosuav@gmail.com> - 2013-07-16 08:59 +1000
                Re: Dihedral Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-16 16:06 +1000
                Re: Dihedral Stefan Behnel <stefan_ml@behnel.de> - 2013-07-24 20:08 +0200
                Re: Dihedral Chris Angelico <rosuav@gmail.com> - 2013-07-25 04:23 +1000
                Re: Dihedral Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-07-24 20:15 -0400
      Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-13 08:16 +1000
      Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-12 17:13 -0600
        Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-24 06:40 -0700
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-24 23:48 +1000
          Re: RE Module Performance David Hutto <dwightdhutto@gmail.com> - 2013-07-24 10:17 -0400
          Re: RE Module Performance David Hutto <dwightdhutto@gmail.com> - 2013-07-24 10:19 -0400
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 00:34 +1000
            Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 07:02 +0000
              Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 17:39 +1000
          Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 08:47 -0600
            Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-25 02:27 -0700
              Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 20:14 +1000
                Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-25 12:07 -0700
                  Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 05:18 +1000
                  RE: RE Module Performance "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2013-07-25 19:30 +0000
                  Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-25 21:06 -0600
          Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 09:00 -0600
            Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 05:56 +0000
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 00:56 +1000
          Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-24 13:52 -0400
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 04:15 +1000
            Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 07:15 +0000
              Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 17:58 +1000
                Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 09:22 +0000
                  Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 20:07 +1000
          Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-24 18:09 -0400
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 08:19 +1000
          Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 16:59 -0600
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 09:24 +1000
          Re: RE Module Performance Serhiy Storchaka <storchaka@gmail.com> - 2013-07-25 08:49 +0300
          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 15:58 +1000
          Re: RE Module Performance Jeremy Sanders <jeremy@jeremysanders.net> - 2013-07-25 14:36 +0100
            Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 15:26 +0000
              Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 01:36 +1000
                Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 17:18 +0000
                  Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 03:27 +1000
                  Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 15:45 -0500
                    Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-26 02:48 +0000
                      Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 21:20 -0600
                        Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:36 -0700
                        Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 08:46 -0700
                          Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 06:28 +0000
                        Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 03:37 +0000
                          Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-26 22:12 -0600
                            Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 05:04 +0000
                          Re: RE Module Performance Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-07-27 12:13 -0400
                    Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:19 -0700
                  Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-25 21:09 -0600
                    Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:21 -0700
                      Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-26 20:05 -0600
                        Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-27 11:21 -0700
                          Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-27 21:53 -0600
                            Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 11:13 -0700
                              Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-28 20:04 +0100
                                Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 12:30 -0700
                                  Re: RE Module Performance Lele Gaifax <lele@metapensiero.it> - 2013-07-28 22:45 +0200
                                  Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 22:01 +0200
                            Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-30 07:01 -0700
                              Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 16:38 +0200
                              Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-30 15:45 +0100
                              Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-30 17:13 +0100
                              Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 18:39 +0200
                              Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-30 18:14 +0100
                                Re: RE Module Performance Neil Hodgson <nhodgson@iinet.net.au> - 2013-07-31 13:09 +1000
                              Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-31 03:27 +1000
                              Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-30 18:40 +0100
                              Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 20:19 +0200
                                Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-30 12:09 -0700
                                  Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-30 21:04 +0100
                                  Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-30 21:54 -0600
                                  Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-31 05:45 +0000
                                    Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-31 08:17 +0100
                                    Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-31 13:15 -0700
                                      Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-31 21:41 +0100
                                  Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 10:11 +0200
                                    Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-31 01:32 -0700
                                      Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 10:59 +0200
                                      Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-31 08:44 -0600
                              Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-30 17:05 -0400
                              Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-30 21:30 -0600
                              Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 09:23 +0200
                              Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-31 08:27 -0600
                          Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 10:45 +0200
                          FSR and unicode compliance - was Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-28 09:52 -0600
                            Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 12:23 -0700
                              Re: FSR and unicode compliance - was Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-28 20:44 +0100
                              Re: FSR and unicode compliance - was Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 21:55 +0200
                              Re: FSR and unicode compliance - was Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-28 20:52 +0000
                                Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 04:43 -0700
                                  Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 12:57 +0100
                                    Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 05:56 -0700
                                    Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 07:20 -0700
                                      Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 15:49 +0100
                                        Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 09:31 -0700
                                  Re: FSR and unicode compliance - was Re: RE Module Performance Heiko Wundram <modelnine@modelnine.org> - 2013-07-29 14:06 +0200
                                  Re: FSR and unicode compliance - was Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-29 08:43 -0400
                          Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 18:03 +0100
                          Re: FSR and unicode compliance - was Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-28 13:36 -0400
                            Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 06:36 -0700
                          Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 19:03 +0100
                          Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-28 19:19 +0100
                          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 19:29 +0100
                          Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-28 15:06 -0400
                          Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-28 23:14 +0100
                          Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 20:51 +0200
                          Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 00:07 +0100
                      Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-26 22:38 +0200
          Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-25 09:44 -0400
          Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 15:53 -0500
      Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-13 00:16 +0100
      Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-14 05:34 +1000
      Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-16 06:30 -0400
        Re: RE Module Performance 88888 Dihedral <dihedral88888@gmail.com> - 2013-07-18 13:17 -0700



#50503 — RE Module Performance

From: Devyn Collier Johnson <devyncjohnson@gmail.com>
Date: 2013-07-11 19:44 -0400
Subject: RE Module Performance
Message-ID: <mailman.4618.1373613834.3114.python-list@python.org>

I recently saw an email in this mailing list about the RE module being 
made slower. I no longer have that email. However, I have viewed the
source for the RE module, but I did not see any code that would slow 
down the script for no valid reason. Can anyone explain what that user 
meant or if I missed that part of the module?

Can the RE module be optimized in any way or at least the "re.sub" portion?

Mahalo,

Devyn Collier Johnson
DevynCJohnson@Gmail.com



#50510

From: wxjmfauth@gmail.com
Date: 2013-07-12 02:23 -0700
Message-ID: <571a6dfe-fd66-42cf-92fc-8b97cbe6e9e4@googlegroups.com>
In reply to: #50503

On Friday, 12 July 2013 01:44:05 UTC+2, Devyn Collier Johnson wrote:
> I recently saw an email in this mailing list about the RE module being
> made slower. I no longer have that email. However, I have viewed the
> source for the RE module, but I did not see any code that would slow
> down the script for no valid reason. Can anyone explain what that user
> meant or if I missed that part of the module?
>
> Can the RE module be optimized in any way or at least the "re.sub" portion?
>
> Mahalo,
>
> Devyn Collier Johnson
> DevynCJohnson@Gmail.com

----------

I would not care too much about the performance
of re.

With the new Flexible String Representation, you
can use a logarithmic scale to compare re results.
To be honest, there is improvement if you are an
ASCII user.

Am I the only one who tested this? Probably.

jmf



#50511

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-07-12 19:27 +1000
Message-ID: <mailman.4624.1373621267.3114.python-list@python.org>
In reply to: #50510

On Fri, Jul 12, 2013 at 7:23 PM,  <wxjmfauth@gmail.com> wrote:
>
> I would not care too much about the performance
> of re.
>
> With the new Flexible String Representation, you
> can use a logarithmic scale to compare re results.
> To be honest, there is improvement if you are an
> ASCII user.
>
> Am I the only one who tested this? Probably.

Am I the only one who thinks that Dihedral posted under jmf's name?

ChrisA



#50512

From: Joshua Landau <joshua@landau.ws>
Date: 2013-07-12 10:39 +0100
Message-ID: <mailman.4625.1373621984.3114.python-list@python.org>
In reply to: #50510

On 12 July 2013 10:27, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Jul 12, 2013 at 7:23 PM,  <wxjmfauth@gmail.com> wrote:
>>
>> I would not care too much about the performance
>> of re.
>>
>> With the new Flexible String Representation, you
>> can use a logarithmic scale to compare re results.
>> To be honest, there is improvement if you are an
>> ASCII user.
>>
>> Am I the only one who tested this? Probably.
>
> Am I the only one who thinks that Dihedral posted under jmf's name?

A bot posting as a troll to troll a different troll? Meta.



#50513

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-07-12 19:40 +1000
Message-ID: <mailman.4626.1373622056.3114.python-list@python.org>
In reply to: #50510

On Fri, Jul 12, 2013 at 7:39 PM, Joshua Landau <joshua@landau.ws> wrote:
> On 12 July 2013 10:27, Chris Angelico <rosuav@gmail.com> wrote:
>> On Fri, Jul 12, 2013 at 7:23 PM,  <wxjmfauth@gmail.com> wrote:
>>>
>>> I would not care too much about the performance
>>> of re.
>>>
>>> With the new Flexible String Representation, you
>>> can use a logarithmic scale to compare re results.
>>> To be honest, there is improvement if you are an
>>> ASCII user.
>>>
>>> Am I the only one who tested this? Probably.
>>
>> Am I the only one who thinks that Dihedral posted under jmf's name?
>
> A bot posting as a troll to troll a different troll? Meta.

Yeah, it is. But the only other explanation is that jmf has become
rather more incomprehensible than usual. Normally I can understand
what he's complaining about well enough to refute it, but here I feel
like I'm responding to Dihedral.

ChrisA



#50533

From: Devyn Collier Johnson <devyncjohnson@gmail.com>
Date: 2013-07-12 06:45 -0400
Message-ID: <mailman.4639.1373643904.3114.python-list@python.org>
In reply to: #50510

Could you explain what you mean? What and where is the new Flexible 
String Representation?

Devyn Collier Johnson


On 07/12/2013 05:23 AM, wxjmfauth@gmail.com wrote:
> On Friday, 12 July 2013 01:44:05 UTC+2, Devyn Collier Johnson wrote:
>> I recently saw an email in this mailing list about the RE module being
>> made slower. I no longer have that email. However, I have viewed the
>> source for the RE module, but I did not see any code that would slow
>> down the script for no valid reason. Can anyone explain what that user
>> meant or if I missed that part of the module?
>>
>> Can the RE module be optimized in any way or at least the "re.sub" portion?
>>
>> Mahalo,
>>
>> Devyn Collier Johnson
>> DevynCJohnson@Gmail.com
> ----------
>
> I would not care too much about the performance
> of re.
>
> With the new Flexible String Representation, you
> can use a logarithmic scale to compare re results.
> To be honest, there is improvement if you are an
> ASCII user.
>
> Am I the only one who tested this? Probably.
>
> jmf



#50537

From: Joshua Landau <joshua@landau.ws>
Date: 2013-07-12 16:59 +0100
Message-ID: <mailman.4643.1373644803.3114.python-list@python.org>
In reply to: #50510

On 12 July 2013 11:45, Devyn Collier Johnson <devyncjohnson@gmail.com> wrote:
> Could you explain what you mean? What and where is the new Flexible String
> Representation?

Do not worry. jmf is on about his old rant comparing broken previous
versions of Python to newer ones which in some microbenchmarks are
slower. I don't really get why he spends his time on it.

If you're interested, the gist of it is that, as an optimisation,
strings now use one, two or four bytes per character, depending on
whether the string contains any values outside the Latin-1 range or
outside the 16-bit range.



#50539

From: Peter Otten <__peter__@web.de>
Date: 2013-07-12 18:15 +0200
Message-ID: <mailman.4645.1373645725.3114.python-list@python.org>
In reply to: #50510

Joshua Landau wrote:

> On 12 July 2013 11:45, Devyn Collier Johnson <devyncjohnson@gmail.com>
> wrote:
>> Could you explain what you mean? What and where is the new Flexible
>> String Representation?
> 
> Do not worry. jmf is on about his old rant comparing broken previous
> versions of Python to newer ones which in some microbenchmarks are
> slower. I don't really get why he spends his time on it.
> 
> If you're interested, the gist of it is that, as an optimisation,
> strings now use one, two or four bytes per character, depending on
> whether the string contains any values outside the Latin-1 range or
> outside the 16-bit range.

See also <http://www.python.org/dev/peps/pep-0393/>



#50540

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-07-13 02:21 +1000
Message-ID: <mailman.4646.1373646092.3114.python-list@python.org>
In reply to: #50510

On Fri, Jul 12, 2013 at 8:45 PM, Devyn Collier Johnson
<devyncjohnson@gmail.com> wrote:
> Could you explain what you mean? What and where is the new Flexible String
> Representation?

(You're top-posting again. Please put your text underneath what you're
responding to - it helps maintain flow and structure.)

Python versions up to and including 3.2 came in two varieties: narrow
builds (commonly found on Windows) and wide builds (commonly found on
Linux). Narrow builds internally represented Unicode strings in
UTF-16, while wide builds used UTF-32. This is a problem, because it
means that taking a program from one to another actually changes its
behaviour:

Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> len(u"\U00012345")
1

Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit (Intel)] on win32
>>> len(u"\U00012345")
2

In fact, the narrow builds are flat-out buggy, because you can put
something in as a single character that simply isn't a single
character. You can then pull that out as two characters and make a
huge mess of things:

>>> s=u"\U00012345"
>>> s[0]
u'\ud808'
>>> s[1]
u'\udf45'

*Any* string indexing will be broken if there is a single character
above U+FFFF ahead of it in the string.
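
For anyone wondering where those two halves come from, here is a minimal sketch of the UTF-16 surrogate-pair arithmetic that a narrow build exposes through indexing. It is not taken from the Python sources, and the helper name to_surrogates is made up for illustration.

```python
def to_surrogates(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000      # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogates(0x12345)
print(hex(high), hex(low))  # 0xd808 0xdf45
```

Those are the same two values the narrow build hands back as s[0] and s[1] in the session above.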

Now, this problem is not unique to Python. Heaps of other languages
have the same issue, the same buggy behaviour, the same compromises.
What's special about Python is that it actually managed to come back
from that problem. (Google's V8 JavaScript engine, for instance, is
stuck with it, because the ECMAScript specification demands UTF-16. I
asked on an ECMAScript list and was told "can't change that, it'd
break code". So it's staying buggy.)

There are a number of languages that take the Texan RAM-guzzling
approach of storing all strings in UTF-32; Python (since version 3.3)
is among a *very* small number of languages that store strings in
multiple different ways according to their content. That's described
in PEP 393 [1], titled "Flexible String Representation". It details a
means whereby a Python string will be represented in, effectively,
UTF-32 with some of the leading zero bytes elided. Or if you prefer,
in either Latin-1, UCS-2, or UCS-4, whichever's the smallest it can
fit in. The difference between a string stored one-byte-per-character
and a string stored four-bytes-per-character is almost invisible to a
Python script; you can find out by checking the string's memory usage,
but otherwise you don't need to worry about it.
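
As a rough illustration of "whichever's the smallest it can fit in", the choice can be modelled in pure Python by looking at the largest code point in the string. This is only a sketch of the rule as described in the paragraph above, not CPython's actual C implementation; storage_width is a hypothetical name, and the default= argument to max() assumes Python 3.4 or later.

```python
def storage_width(text):
    """Return the bytes per character a PEP 393 style layout would need."""
    largest = max(map(ord, text), default=0)  # empty string counts as narrow
    if largest < 0x100:      # everything fits in Latin-1
        return 1
    if largest < 0x10000:    # everything fits in UCS-2
        return 2
    return 4                 # at least one character needs UCS-4


for sample in ("asdf", "caf\u00e9", "\u1000sdf", "\U00010001sdf"):
    print(ascii(sample), storage_width(sample))
```

The point of the model is that the width is decided once per string from its widest character, which is why one unusual character makes every character in that string cost more.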

Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32
>>> import sys
>>> sys.getsizeof("asdfasdfasdfasd")
40
>>> sys.getsizeof("asdfasdfasdfasdf")
41

Adding another character adds another 1 byte. (There's quite a bit of
overhead for small strings - GC headers and such - but it gets dwarfed
by the actual content after a while.)

>>> sys.getsizeof("\u1000sdfasdfasdfasd")
68
>>> sys.getsizeof("\u1000sdfasdfasdfasdf")
70

Two bytes to add another character.

>>> sys.getsizeof("\U00010001sdfasdfasdfasd")
100
>>> sys.getsizeof("\U00010001sdfasdfasdfasdf")
104

Four bytes. It uses only what it needs.
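
The same measurement can be wrapped in a small helper so it is easy to repeat on your own build. This is a sketch that assumes CPython 3.3 or later; the absolute sizes differ between versions and platforms, so it only looks at the difference one extra character makes. The name growth_per_char and its parameters are made up for illustration.

```python
import sys


def growth_per_char(prefix, filler="a", length=64):
    """Return the bytes added by one more filler character after prefix."""
    shorter = prefix + filler * length
    longer = prefix + filler * (length + 1)
    return sys.getsizeof(longer) - sys.getsizeof(shorter)


print(growth_per_char(""))            # ASCII only
print(growth_per_char("\u1000"))      # one character above U+00FF
print(growth_per_char("\U00010001"))  # one character above U+FFFF
```

On a PEP 393 build this should report 1, 2 and 4, matching the three sessions above.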

Strings in Python are immutable, so there's no need to worry about
up-grading or down-grading a string; there are a few optimizations
that can't be done, but they're fairly trivial. Look, I'll pull a jmf
and find a microbenchmark that makes 3.3 look worse:

2.7.4:
>>> import timeit
>>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
[0.8175005482540385, 0.789617954237201, 0.8152240019332098]
>>> timeit.repeat('a=u"A"*100; a+=u"a"')
[0.8088905154146744, 0.8123691698246631, 0.8172558244134365]

3.3.0:
>>> import timeit
>>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
[0.9623714745976031, 0.970628669281723, 0.9696310564468149]
>>> timeit.repeat('a=u"A"*100; a+=u"a"')
[0.7017891938739922, 0.7024725209339522, 0.6989539173082449]

See? It's clearly worse on the newer Python! But actually, this is an
extremely unusual situation, and 3.3 outperforms 2.7 on the more
common case (where the two strings are of the same width).
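
If you want to repeat that comparison on your own machine, something along these lines should do. It is a sketch assuming Python 3.6 or later (for the globals argument to timeit.Timer and for f-strings); time_append is a hypothetical helper, and the timings will vary from machine to machine, so no expected numbers are given.

```python
import timeit


def time_append(suffix, repeat=3, number=100000):
    """Time appending suffix to a 100-character ASCII string; return the best run."""
    timer = timeit.Timer(
        "a = base; a += suffix",
        globals={"base": "A" * 100, "suffix": suffix},
    )
    return min(timer.repeat(repeat=repeat, number=number))


same_width = time_append("a")     # result stays at one byte per character
widening = time_append("\u1000")  # result needs two bytes per character
print(f"same width: {same_width:.4f}s  widening: {widening:.4f}s")
```

Taking the minimum of the repeats is the usual way to reduce noise from whatever else the machine is doing.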

Python's PEP 393 strings are following the same sort of model as the
native string type in a semantically-similar but
syntactically-different language, Pike. In Pike (also free software,
like Python), the string type can be indexed character by character,
and each character can be anything in the Unicode range; and just as
in Python 3.3, memory usage goes up by just one byte per character if
every character in the string fits inside 8 bits. So it's not as if this is
an untested notion; Pike has been running like this for years (I don't
know how long it's had this functionality, but it won't be more than
18 years as Unicode didn't have multiple planes until 1996), and
performance has been *just fine* for all that time. Pike tends to be
run on servers, so memory usage and computation speed translate fairly
directly into TPS. And there are some sizeable commercial entities
using and developing Pike, so if the flexible string representation
had turned out to be a flop, someone would have put in the coding time
to rewrite it by now.

And yet, despite all these excellent reasons for moving to this way of
doing strings, jmf still sees his microbenchmarks as more important,
and so he jumps in on threads like this to whine about how Python 3.3
is somehow US-centric because it more efficiently handles the entire
Unicode range. I'd really like to take some highlights from Python and
Pike and start recommending that other languages take up the ideas,
but to be honest, I hesitate to inflict jmf on them all. ECMAScript
may have the right idea after all - stay with UTF-16 and avoid
answering jmf's stupid objections every week.

[1] http://www.python.org/dev/peps/pep-0393/

ChrisA



#50552

From: Devyn Collier Johnson <devyncjohnson@gmail.com>
Date: 2013-07-12 13:58 -0400
Message-ID: <mailman.4656.1373657874.3114.python-list@python.org>
In reply to: #50510

On 07/12/2013 12:21 PM, Chris Angelico wrote:
> On Fri, Jul 12, 2013 at 8:45 PM, Devyn Collier Johnson
> <devyncjohnson@gmail.com> wrote:
>> Could you explain what you mean? What and where is the new Flexible String
>> Representation?
> (You're top-posting again. Please put your text underneath what you're
> responding to - it helps maintain flow and structure.)
>
> Python versions up to and including 3.2 came in two varieties: narrow
> builds (commonly found on Windows) and wide builds (commonly found on
> Linux). Narrow builds internally represented Unicode strings in
> UTF-16, while wide builds used UTF-32. This is a problem, because it
> means that taking a program from one to another actually changes its
> behaviour:
>
> Python 2.6.4 (r264:75706, Dec  7 2009, 18:45:15)
> [GCC 4.4.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> len(u"\U00012345")
> 1
>
> Python 2.7.4 (default, Apr  6 2013, 19:54:46) [MSC v.1500 32 bit
> (Intel)] on win32
> >>> len(u"\U00012345")
> 2
>
> In fact, the narrow builds are flat-out buggy, because you can put
> something in as a single character that simply isn't a single
> character. You can then pull that out as two characters and make a
> huge mess of things:
>
> >>> s=u"\U00012345"
> >>> s[0]
> u'\ud808'
> >>> s[1]
> u'\udf45'
>
> *Any* string indexing will be broken if there is a single character
> above U+FFFF ahead of it in the string.
>
> Now, this problem is not unique to Python. Heaps of other languages
> have the same issue, the same buggy behaviour, the same compromises.
> What's special about Python is that it actually managed to come back
> from that problem. (Google's V8 JavaScript engine, for instance, is
> stuck with it, because the ECMAScript specification demands UTF-16. I
> asked on an ECMAScript list and was told "can't change that, it'd
> break code". So it's staying buggy.)
>
> There are a number of languages that take the Texan RAM-guzzling
> approach of storing all strings in UTF-32; Python (since version 3.3)
> is among a *very* small number of languages that store strings in
> multiple different ways according to their content. That's described
> in PEP 393 [1], titled "Flexible String Representation". It details a
> means whereby a Python string will be represented in, effectively,
> UTF-32 with some of the leading zero bytes elided. Or if you prefer,
> in either Latin-1, UCS-2, or UCS-4, whichever's the smallest it can
> fit in. The difference between a string stored one-byte-per-character
> and a string stored four-bytes-per-character is almost invisible to a
> Python script; you can find out by checking the string's memory usage,
> but otherwise you don't need to worry about it.
>
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 32 bit (Intel)] on win32
> >>> sys.getsizeof("asdfasdfasdfasd")
> 40
> >>> sys.getsizeof("asdfasdfasdfasdf")
> 41
>
> Adding another character adds another 1 byte. (There's quite a bit of
> overhead for small strings - GC headers and such - but it gets dwarfed
> by the actual content after a while.)
>
> >>> sys.getsizeof("\u1000sdfasdfasdfasd")
> 68
> >>> sys.getsizeof("\u1000sdfasdfasdfasdf")
> 70
>
> Two bytes to add another character.
>
> >>> sys.getsizeof("\U00010001sdfasdfasdfasd")
> 100
> >>> sys.getsizeof("\U00010001sdfasdfasdfasdf")
> 104
>
> Four bytes. It uses only what it needs.
>
> Strings in Python are immutable, so there's no need to worry about
> up-grading or down-grading a string; there are a few optimizations
> that can't be done, but they're fairly trivial. Look, I'll pull a jmf
> and find a microbenchmark that makes 3.3 look worse:
>
> 2.7.4:
> >>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
> [0.8175005482540385, 0.789617954237201, 0.8152240019332098]
> >>> timeit.repeat('a=u"A"*100; a+=u"a"')
> [0.8088905154146744, 0.8123691698246631, 0.8172558244134365]
>
> 3.3.0:
> >>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
> [0.9623714745976031, 0.970628669281723, 0.9696310564468149]
> >>> timeit.repeat('a=u"A"*100; a+=u"a"')
> [0.7017891938739922, 0.7024725209339522, 0.6989539173082449]
>
> See? It's clearly worse on the newer Python! But actually, this is an
> extremely unusual situation, and 3.3 outperforms 2.7 on the more
> common case (where the two strings are of the same width).
>
> Python's PEP 393 strings are following the same sort of model as the
> native string type in a semantically-similar but
> syntactically-different language, Pike. In Pike (also free software,
> like Python), the string type can be indexed character by character,
> and each character can be anything in the Unicode range; and just as
> in Python 3.3, memory usage goes up by just one byte if every
> character in the string fits inside 8 bits. So it's not as if this is
> an untested notion; Pike has been running like this for years (I don't
> know how long it's had this functionality, but it won't be more than
> 18 years as Unicode didn't have multiple planes until 1996), and
> performance has been *just fine* for all that time. Pike tends to be
> run on servers, so memory usage and computation speed translate fairly
> directly into TPS. And there are some sizeable commercial entities
> using and developing Pike, so if the flexible string representation
> had turned out to be a flop, someone would have put in the coding time
> to rewrite it by now.
>
> And yet, despite all these excellent reasons for moving to this way of
> doing strings, jmf still sees his microbenchmarks as more important,
> and so he jumps in on threads like this to whine about how Python 3.3
> is somehow US-centric because it more efficiently handles the entire
> Unicode range. I'd really like to take some highlights from Python and
> Pike and start recommending that other languages take up the ideas,
> but to be honest, I hesitate to inflict jmf on them all. ECMAScript
> may have the right idea after all - stay with UTF-16 and avoid
> answering jmf's stupid objections every week.
>
> [1] http://www.python.org/dev/peps/pep-0393/
>
> ChrisA
Thanks for the thorough response. I learned a lot. You should write 
articles on Python.
I plan to spend some time optimizing the re.py module for Unix systems. 
I would love to amp up my programs that use that module.

Devyn Collier Johnson
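
The storage widths described in the quoted message can be checked from a script. The following is a minimal sketch, assuming CPython 3.3 or later (other implementations may store strings differently); `bytes_per_char` is a hypothetical helper name, not part of any library. It measures how many bytes one extra character adds, and it also shows the UTF-16 surrogate pair that a narrow build would have exposed for U+12345.

```python
import sys


def bytes_per_char(prefix, filler="a", n=15):
    """Return the extra bytes used when one more filler character is appended.

    Hypothetical helper for illustration; it only compares two sizes.
    """
    shorter = prefix + filler * n
    longer = prefix + filler * (n + 1)
    return sys.getsizeof(longer) - sys.getsizeof(shorter)


# The widest character in the string decides the storage width (CPython 3.3+).
widths = {
    "ascii only": bytes_per_char(""),
    "contains U+1000": bytes_per_char("\u1000"),
    "contains U+10001": bytes_per_char("\U00010001"),
}
for label, width in widths.items():
    print(label, "->", width, "byte(s) per extra character")

# A character beyond U+FFFF is one item in the string, but two 16-bit
# code units once encoded as UTF-16 (the pair a narrow build exposed).
ch = "\U00012345"
utf16 = ch.encode("utf-16-be")
code_units = [
    hex(int.from_bytes(utf16[i:i + 2], "big"))
    for i in range(0, len(utf16), 2)
]
print(len(ch), code_units)
```

The absolute sizes reported by `sys.getsizeof` vary between versions and platforms, which is why the sketch compares differences rather than totals.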



#50571

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-07-13 05:37 +0000
Message-ID: <51e0e7aa$0$9505$c3e8da3$5496439d@news.astraweb.com>
In reply to: #50552
On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:

> I plan to spend some time optimizing the re.py module for Unix systems.
> I would love to amp up my programs that use that module.

In my experience, often the best way to optimize a regex is to not use it 
at all.

[steve@ando ~]$ python -m timeit -s "import re" \
> -s "data = 'a'*100+'b'" \
> "if re.search('b', data): pass"
100000 loops, best of 3: 2.77 usec per loop

[steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
> "if 'b' in data: pass"
1000000 loops, best of 3: 0.219 usec per loop

In Python, we often use plain string operations instead of regex-based 
solutions for basic tasks. Regexes are a 10lb sledge hammer. Don't use 
them for cracking peanuts.



-- 
Steven
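
The same comparison can be run from inside Python rather than from the shell. This is a minimal sketch using only the standard `re` and `timeit` modules; the function names `with_regex` and `with_in` are made up for illustration, and the timings will differ from the figures above depending on machine and Python version.

```python
import re
import timeit

data = "a" * 100 + "b"


def with_regex():
    # Regex search for a fixed, literal one-character pattern.
    return re.search("b", data) is not None


def with_in():
    # Plain substring test; no regex machinery involved.
    return "b" in data


# Take the best of three runs, as the timeit command line does.
regex_time = min(timeit.repeat(with_regex, number=100000, repeat=3))
in_time = min(timeit.repeat(with_in, number=100000, repeat=3))

print("re.search: %.4f s per 100000 calls" % regex_time)
print("in:        %.4f s per 100000 calls" % in_time)
```

Both functions return the same answer for a literal pattern, so the substring test is a drop-in replacement only when no regex features (anchors, classes, alternation) are needed.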



#50647

From: 88888 Dihedral <dihedral88888@gmail.com>
Date: 2013-07-14 11:17 -0700
Message-ID: <9a97b618-3e80-4149-9155-14fb210a0758@googlegroups.com>
In reply to: #50571
On Saturday, July 13, 2013 1:37:46 PM UTC+8, Steven D'Aprano wrote:
> On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:
>
> > I plan to spend some time optimizing the re.py module for Unix systems.
> > I would love to amp up my programs that use that module.
>
> In my experience, often the best way to optimize a regex is to not use it
> at all.
>
> [steve@ando ~]$ python -m timeit -s "import re" \
> > -s "data = 'a'*100+'b'" \
> > "if re.search('b', data): pass"
> 100000 loops, best of 3: 2.77 usec per loop
>
> [steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
> > "if 'b' in data: pass"
> 1000000 loops, best of 3: 0.219 usec per loop
>
> In Python, we often use plain string operations instead of regex-based
> solutions for basic tasks. Regexes are a 10lb sledge hammer. Don't use
> them for cracking peanuts.
>
> -- 
> Steven

OK, lets talk about the indexed search algorithms of 
a character streamor strig which can be buffered and
indexed randomly for RW operations but faster in sequential 
block RW operations after some pre-processing.

This was solved long time ago in the suffix array or 
suffix tree part and summarized in the famous BWT paper in 199X.

Do we want volunteers to speed up 
search operations in the string module in Python?



#50666

From: Devyn Collier Johnson <devyncjohnson@gmail.com>
Date: 2013-07-15 06:06 -0400
Message-ID: <mailman.4710.1373882772.3114.python-list@python.org>
In reply to: #50647
On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
> On Saturday, July 13, 2013 1:37:46 PM UTC+8, Steven D'Aprano wrote:
>> On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:
>>
>>> I plan to spend some time optimizing the re.py module for Unix systems.
>>> I would love to amp up my programs that use that module.
>>
>> In my experience, often the best way to optimize a regex is to not use it
>> at all.
>>
>> [steve@ando ~]$ python -m timeit -s "import re" \
>>> -s "data = 'a'*100+'b'" \
>>> "if re.search('b', data): pass"
>> 100000 loops, best of 3: 2.77 usec per loop
>>
>> [steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
>>> "if 'b' in data: pass"
>> 1000000 loops, best of 3: 0.219 usec per loop
>>
>> In Python, we often use plain string operations instead of regex-based
>> solutions for basic tasks. Regexes are a 10lb sledge hammer. Don't use
>> them for cracking peanuts.
>>
>> -- 
>> Steven
> OK, lets talk about the indexed search algorithms of
> a character streamor strig which can be buffered and
> indexed randomly for RW operations but faster in sequential
> block RW operations after some pre-processing.
>
> This was solved long time ago in the suffix array or
> suffix tree part and summarized in the famous BWT paper in 199X.
>
> Do we want volunteers to speed up
> search operations in the string module in Python?
It would be nice if someone could speed it up.



#50675

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-07-15 12:36 +0000
Message-ID: <51e3ecbd$0$9505$c3e8da3$5496439d@news.astraweb.com>
In reply to: #50666
On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:

> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
[...]
>> Do we want volunteers to speed up
>> search operations in the string module in Python?
>
> It would be nice if someone could speed it up.

Devyn,

88888 Dihedral is our resident bot, not a human being. Nobody knows who 
controls it, and why they are running it, but we are pretty certain that 
it is a bot responding mechanically to keywords in people's posts.

It's a very clever bot, but still a bot. About one post in four is 
meaningless jargon, the other three are relevant enough to fool people 
into thinking that maybe it is a human being. It had me fooled for a long 
time.



-- 
Steven



#50679 — Dihedral

From: Devyn Collier Johnson <devyncjohnson@gmail.com>
Date: 2013-07-15 08:52 -0400
Subject: Dihedral
Message-ID: <mailman.4721.1373892779.3114.python-list@python.org>
In reply to: #50675
On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
> On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
>
>> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
> [...]
>>> Do we want volunteers to speed up
>>> search operations in the string module in Python?
>> It would be nice if someone could speed it up.
> Devyn,
>
> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
> controls it, and why they are running it, but we are pretty certain that
> it is a bot responding mechanically to keywords in people's posts.
>
> It's a very clever bot, but still a bot. About one post in four is
> meaningless jargon, the other three are relevant enough to fool people
> into thinking that maybe it is a human being. It had me fooled for a long
> time.
>
>
>
Wow! Our mailing list has a pet bot. I bet other mailing lists are so 
jealous of us. Who ever created Dihedral is a genius!

Artificial Intelligence developers put chatbots on mailing lists so that 
the program can learn. I use Python3 to program AI applications. If you 
see my Launchpad account, you will see my two AI projects - Neobot and 
Novabot. (https://launchpad.net/neobot Neo and Nova are still unstable) 
AI developers let their bots loose on the Internet to learn from people. 
Dihedral is learning from us. Dihedral only responses when it feels it 
has sufficient knowledge on the topic. Chatbots want to appear human. 
That is their goal. We should feel honored that Dihedral's botmaster 
feels that this mailinglist would benefit the development of Dihedral's 
knowledge.

Devyn Collier Johnson



#50686 — Re: Dihedral

From: Joel Goldstick <joel.goldstick@gmail.com>
Date: 2013-07-15 09:03 -0400
Subject: Re: Dihedral
Message-ID: <mailman.4728.1373893408.3114.python-list@python.org>
In reply to: #50675

On Mon, Jul 15, 2013 at 8:52 AM, Devyn Collier Johnson <
devyncjohnson@gmail.com> wrote:

>
> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>
>> On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
>>
>>  On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
>>>
>> [...]
>>
>>> Do we want volunteers to speed up
>>>> search operations in the string module in Python?
>>>>
>>> It would be nice if someone could speed it up.
>>>
>> Devyn,
>>
>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>> controls it, and why they are running it, but we are pretty certain that
>> it is a bot responding mechanically to keywords in people's posts.
>>
>> It's a very clever bot, but still a bot. About one post in four is
>> meaningless jargon, the other three are relevant enough to fool people
>> into thinking that maybe it is a human being. It had me fooled for a long
>> time.
>>
>>
>>
>>  Wow! Our mailing list has a pet bot. I bet other mailing lists are so
> jealous of us. Who ever created Dihedral is a genius!
>
> Artificial Intelligence developers put chatbots on mailing lists so that
> the program can learn. I use Python3 to program AI applications. If you see
> my Launchpad account, you will see my two AI projects - Neobot and Novabot.
> (https://launchpad.net/neobot Neo and Nova are still unstable) AI
> developers let their bots loose on the Internet to learn from people.
> Dihedral is learning from us. Dihedral only responses when it feels it has
> sufficient knowledge on the topic. Chatbots want to appear human. That is
> their goal. We should feel honored that Dihedral's botmaster feels that
> this mailinglist would benefit the development of Dihedral's knowledge.
>
> Devyn Collier Johnson
> --
> http://mail.python.org/mailman/listinfo/python-list
>

I particularly enjoy the misspellings, that seem to be such a human quality
on email messages!

-- 
Joel Goldstick
http://joelgoldstick.com



#50713 — Re: Dihedral

From: Wayne Werner <wayne@waynewerner.com>
Date: 2013-07-15 17:43 -0500
Subject: Re: Dihedral
Message-ID: <mailman.4747.1373928202.3114.python-list@python.org>
In reply to: #50675
On Mon, 15 Jul 2013, Devyn Collier Johnson wrote:

>
> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>> On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
>> 
>>> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
>> [...]
>>>> Do we want volunteers to speed up
>>>> search operations in the string module in Python?
>>> It would be nice if someone could speed it up.
>> Devyn,
>> 
>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>> controls it, and why they are running it, but we are pretty certain that
>> it is a bot responding mechanically to keywords in people's posts.
>> 
>> It's a very clever bot, but still a bot. About one post in four is
>> meaningless jargon, the other three are relevant enough to fool people
>> into thinking that maybe it is a human being. It had me fooled for a long
>> time.
>> 
>> 
>> 
> Wow! Our mailing list has a pet bot. I bet other mailing lists are so jealous 
> of us. Who ever created Dihedral is a genius!
>
> Artificial Intelligence developers put chatbots on mailing lists so that the 
> program can learn. I use Python3 to program AI applications. If you see my 
> Launchpad account, you will see my two AI projects - Neobot and Novabot. 
> (https://launchpad.net/neobot Neo and Nova are still unstable) AI developers 
> let their bots loose on the Internet to learn from people. Dihedral is 
> learning from us. Dihedral only responses when it feels it has sufficient 
> knowledge on the topic. Chatbots want to appear human. That is their goal. We 
> should feel honored that Dihedral's botmaster feels that this mailinglist 
> would benefit the development of Dihedral's knowledge.

Are *you* a bot? ~_^

That post felt surprisingly like Dihedral...

-W



#50715 — Re: Dihedral

From: Fábio Santos <fabiosantosart@gmail.com>
Date: 2013-07-15 23:54 +0100
Subject: Re: Dihedral
Message-ID: <mailman.4749.1373928846.3114.python-list@python.org>
In reply to: #50675

> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>>
>> Devyn,
>>
>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>> controls it, and why they are running it, but we are pretty certain that
>> it is a bot responding mechanically to keywords in people's posts.
>>
>> It's a very clever bot, but still a bot. About one post in four is
>> meaningless jargon, the other three are relevant enough to fool people
>> into thinking that maybe it is a human being. It had me fooled for a long
>> time.
>>

Does this mean he passes the Turing test?



#50716 — Re: Dihedral

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-07-16 08:59 +1000
Subject: Re: Dihedral
Message-ID: <mailman.4750.1373929147.3114.python-list@python.org>
In reply to: #50675
On Tue, Jul 16, 2013 at 8:54 AM, Fábio Santos <fabiosantosart@gmail.com> wrote:
>
>> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>>>
>>> Devyn,
>>>
>>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>>> controls it, and why they are running it, but we are pretty certain that
>>> it is a bot responding mechanically to keywords in people's posts.
>>>
>>> It's a very clever bot, but still a bot. About one post in four is
>>> meaningless jargon, the other three are relevant enough to fool people
>>> into thinking that maybe it is a human being. It had me fooled for a long
>>> time.
>>>
>
> Does this mean he passes the Turing test?

Yes, absolutely. The original Turing test was defined in terms of five
minutes of analysis, and Dihedral and jmf have clearly been
indistinguishably human across that approximate period.

ChrisA



#50727 — Re: Dihedral

From: Tim Delaney <timothy.c.delaney@gmail.com>
Date: 2013-07-16 16:06 +1000
Subject: Re: Dihedral
Message-ID: <mailman.4756.1373954816.3114.python-list@python.org>
In reply to: #50675

On 16 July 2013 08:59, Chris Angelico <rosuav@gmail.com> wrote:

> On Tue, Jul 16, 2013 at 8:54 AM, Fábio Santos <fabiosantosart@gmail.com>
> wrote:
> >
> >> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
> >>>
> >>> Devyn,
> >>>
> >>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
> >>> controls it, and why they are running it, but we are pretty certain that
> >>> it is a bot responding mechanically to keywords in people's posts.
> >>>
> >>> It's a very clever bot, but still a bot. About one post in four is
> >>> meaningless jargon, the other three are relevant enough to fool people
> >>> into thinking that maybe it is a human being. It had me fooled for a long
> >>> time.
> >>>
> >
> > Does this mean he passes the Turing test?
>
> Yes, absolutely. The original Turing test was defined in terms of five
> minutes of analysis, and Dihedral and jmf have clearly been
> indistinguishably human across that approximate period.
>

The big difference between them is that the jmfbot does not appear to
evolve its routines in response to external sources - it seems to be stuck
in a closed feedback loop.

Tim Delaney


