| Started by | Devyn Collier Johnson <devyncjohnson@gmail.com> |
|---|---|
| First post | 2013-07-11 19:44 -0400 |
| Last post | 2013-07-18 13:17 -0700 |
| Articles | 20 on this page of 136 — 25 participants |
RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-11 19:44 -0400
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-12 02:23 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-12 19:27 +1000
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-12 10:39 +0100
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-12 19:40 +1000
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-12 06:45 -0400
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-12 16:59 +0100
Re: RE Module Performance Peter Otten <__peter__@web.de> - 2013-07-12 18:15 +0200
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-13 02:21 +1000
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-12 13:58 -0400
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-13 05:37 +0000
Re: RE Module Performance 88888 Dihedral <dihedral88888@gmail.com> - 2013-07-14 11:17 -0700
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-15 06:06 -0400
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-15 12:36 +0000
Dihedral Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-15 08:52 -0400
Re: Dihedral Joel Goldstick <joel.goldstick@gmail.com> - 2013-07-15 09:03 -0400
Re: Dihedral Wayne Werner <wayne@waynewerner.com> - 2013-07-15 17:43 -0500
Re: Dihedral Fábio Santos <fabiosantosart@gmail.com> - 2013-07-15 23:54 +0100
Re: Dihedral Chris Angelico <rosuav@gmail.com> - 2013-07-16 08:59 +1000
Re: Dihedral Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-16 16:06 +1000
Re: Dihedral Stefan Behnel <stefan_ml@behnel.de> - 2013-07-24 20:08 +0200
Re: Dihedral Chris Angelico <rosuav@gmail.com> - 2013-07-25 04:23 +1000
Re: Dihedral Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-07-24 20:15 -0400
Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-13 08:16 +1000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-12 17:13 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-24 06:40 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-24 23:48 +1000
Re: RE Module Performance David Hutto <dwightdhutto@gmail.com> - 2013-07-24 10:17 -0400
Re: RE Module Performance David Hutto <dwightdhutto@gmail.com> - 2013-07-24 10:19 -0400
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 00:34 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 07:02 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 17:39 +1000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 08:47 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-25 02:27 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 20:14 +1000
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-25 12:07 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 05:18 +1000
RE: RE Module Performance "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2013-07-25 19:30 +0000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-25 21:06 -0600
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 09:00 -0600
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 05:56 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 00:56 +1000
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-24 13:52 -0400
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 04:15 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 07:15 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 17:58 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 09:22 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 20:07 +1000
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-24 18:09 -0400
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 08:19 +1000
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-24 16:59 -0600
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 09:24 +1000
Re: RE Module Performance Serhiy Storchaka <storchaka@gmail.com> - 2013-07-25 08:49 +0300
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-25 15:58 +1000
Re: RE Module Performance Jeremy Sanders <jeremy@jeremysanders.net> - 2013-07-25 14:36 +0100
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 15:26 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 01:36 +1000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-25 17:18 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-26 03:27 +1000
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 15:45 -0500
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-26 02:48 +0000
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 21:20 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:36 -0700
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 08:46 -0700
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 06:28 +0000
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 03:37 +0000
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-26 22:12 -0600
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-27 05:04 +0000
Re: RE Module Performance Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2013-07-27 12:13 -0400
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:19 -0700
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-25 21:09 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-26 06:21 -0700
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-26 20:05 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-27 11:21 -0700
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-27 21:53 -0600
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 11:13 -0700
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-28 20:04 +0100
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 12:30 -0700
Re: RE Module Performance Lele Gaifax <lele@metapensiero.it> - 2013-07-28 22:45 +0200
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 22:01 +0200
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-30 07:01 -0700
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 16:38 +0200
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-30 15:45 +0100
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-30 17:13 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 18:39 +0200
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-30 18:14 +0100
Re: RE Module Performance Neil Hodgson <nhodgson@iinet.net.au> - 2013-07-31 13:09 +1000
Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-31 03:27 +1000
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-30 18:40 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-30 20:19 +0200
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-30 12:09 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-30 21:04 +0100
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-30 21:54 -0600
Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-31 05:45 +0000
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-31 08:17 +0100
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-31 13:15 -0700
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-31 21:41 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 10:11 +0200
Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-31 01:32 -0700
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 10:59 +0200
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-31 08:44 -0600
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-30 17:05 -0400
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-30 21:30 -0600
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-31 09:23 +0200
Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-31 08:27 -0600
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 10:45 +0200
FSR and unicode compliance - was Re: RE Module Performance Michael Torrie <torriem@gmail.com> - 2013-07-28 09:52 -0600
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-28 12:23 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-28 20:44 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 21:55 +0200
Re: FSR and unicode compliance - was Re: RE Module Performance Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-07-28 20:52 +0000
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 04:43 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 12:57 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 05:56 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 07:20 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 15:49 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 09:31 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Heiko Wundram <modelnine@modelnine.org> - 2013-07-29 14:06 +0200
Re: FSR and unicode compliance - was Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-29 08:43 -0400
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 18:03 +0100
Re: FSR and unicode compliance - was Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-28 13:36 -0400
Re: FSR and unicode compliance - was Re: RE Module Performance wxjmfauth@gmail.com - 2013-07-29 06:36 -0700
Re: FSR and unicode compliance - was Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 19:03 +0100
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-28 19:19 +0100
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-28 19:29 +0100
Re: RE Module Performance Terry Reedy <tjreedy@udel.edu> - 2013-07-28 15:06 -0400
Re: RE Module Performance Joshua Landau <joshua@landau.ws> - 2013-07-28 23:14 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-28 20:51 +0200
Re: RE Module Performance Chris Angelico <rosuav@gmail.com> - 2013-07-29 00:07 +0100
Re: RE Module Performance Antoon Pardon <antoon.pardon@rece.vub.ac.be> - 2013-07-26 22:38 +0200
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-25 09:44 -0400
Re: RE Module Performance Ian Kelly <ian.g.kelly@gmail.com> - 2013-07-25 15:53 -0500
Re: RE Module Performance MRAB <python@mrabarnett.plus.com> - 2013-07-13 00:16 +0100
Re: RE Module Performance Tim Delaney <timothy.c.delaney@gmail.com> - 2013-07-14 05:34 +1000
Re: RE Module Performance Devyn Collier Johnson <devyncjohnson@gmail.com> - 2013-07-16 06:30 -0400
Re: RE Module Performance 88888 Dihedral <dihedral88888@gmail.com> - 2013-07-18 13:17 -0700
| From | Devyn Collier Johnson <devyncjohnson@gmail.com> |
|---|---|
| Date | 2013-07-11 19:44 -0400 |
| Subject | RE Module Performance |
| Message-ID | <mailman.4618.1373613834.3114.python-list@python.org> |
I recently saw an email on this mailing list about the RE module being made slower. I no longer have that email. However, I have viewed the source for the RE module, and I did not see any code that would slow down the script for no valid reason. Can anyone explain what that user meant, or did I miss that part of the module?

Can the RE module be optimized in any way, or at least the "re.sub" portion?

Mahalo,

Devyn Collier Johnson
DevynCJohnson@Gmail.com
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2013-07-12 02:23 -0700 |
| Message-ID | <571a6dfe-fd66-42cf-92fc-8b97cbe6e9e4@googlegroups.com> |
| In reply to | #50503 |
On Friday, 12 July 2013 01:44:05 UTC+2, Devyn Collier Johnson wrote:
> I recently saw an email in this mailing list about the RE module being
> made slower. I no longer have that email. However, I have viewed the
> source for the RE module, but I did not see any code that would slow
> down the script for no valid reason. Can anyone explain what that user
> meant or if I missed that part of the module?
>
> Can the RE module be optimized in any way or at least the "re.sub" portion?
>
> Mahalo,
>
> Devyn Collier Johnson
> DevynCJohnson@Gmail.com

----------

I would not care too much about the performance of re.

With the new Flexible String Representation, you can use a logarithmic scale to compare re results. To be honest, there is improvement if you are an ASCII user.

Am I the only one who tested this? Probably.

jmf
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-07-12 19:27 +1000 |
| Message-ID | <mailman.4624.1373621267.3114.python-list@python.org> |
| In reply to | #50510 |
On Fri, Jul 12, 2013 at 7:23 PM, <wxjmfauth@gmail.com> wrote:
>
> I would not care too much about the performance
> of re.
>
> With the new Flexible String Representation, you
> can use a logarithmic scale to compare re results.
> To be honest, there is improvement if you are an
> ascii user.
>
> Am I the only one who tested this? Probably.

Am I the only one who thinks that Dihedral posted under jmf's name?

ChrisA
| From | Joshua Landau <joshua@landau.ws> |
|---|---|
| Date | 2013-07-12 10:39 +0100 |
| Message-ID | <mailman.4625.1373621984.3114.python-list@python.org> |
| In reply to | #50510 |
On 12 July 2013 10:27, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Jul 12, 2013 at 7:23 PM, <wxjmfauth@gmail.com> wrote:
>>
>> I would not care too much about the performance
>> of re.
>>
>> With the new Flexible String Representation, you
>> can use a logarithmic scale to compare re results.
>> To be honest, there is improvement if you are an
>> ascii user.
>>
>> Am I the only one who tested this? Probably.
>
> Am I the only one who thinks that Dihedral posted under jmf's name?

A bot posting as a troll to troll a different troll? Meta.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-07-12 19:40 +1000 |
| Message-ID | <mailman.4626.1373622056.3114.python-list@python.org> |
| In reply to | #50510 |
On Fri, Jul 12, 2013 at 7:39 PM, Joshua Landau <joshua@landau.ws> wrote:
> On 12 July 2013 10:27, Chris Angelico <rosuav@gmail.com> wrote:
>> On Fri, Jul 12, 2013 at 7:23 PM, <wxjmfauth@gmail.com> wrote:
>>>
>>> I would not care too much about the performance
>>> of re.
>>>
>>> With the new Flexible String Representation, you
>>> can use a logarithmic scale to compare re results.
>>> To be honest, there is improvement if you are an
>>> ascii user.
>>>
>>> Am I the only one who tested this? Probably.
>>
>> Am I the only one who thinks that Dihedral posted under jmf's name?
>
> A bot posting as a troll to troll a different troll? Meta.

Yeah, it is. But the only other explanation is that jmf has become rather more incomprehensible than usual. Normally I can understand what he's complaining about well enough to refute it, but here I feel like I'm responding to Dihedral.

ChrisA
| From | Devyn Collier Johnson <devyncjohnson@gmail.com> |
|---|---|
| Date | 2013-07-12 06:45 -0400 |
| Message-ID | <mailman.4639.1373643904.3114.python-list@python.org> |
| In reply to | #50510 |
Could you explain what you mean? What and where is the new Flexible String Representation?

Devyn Collier Johnson

On 07/12/2013 05:23 AM, wxjmfauth@gmail.com wrote:
> On Friday, 12 July 2013 01:44:05 UTC+2, Devyn Collier Johnson wrote:
>> I recently saw an email in this mailing list about the RE module being
>> made slower. I no longer have that email. However, I have viewed the
>> source for the RE module, but I did not see any code that would slow
>> down the script for no valid reason. Can anyone explain what that user
>> meant or if I missed that part of the module?
>>
>> Can the RE module be optimized in any way or at least the "re.sub" portion?
>>
>> Mahalo,
>>
>> Devyn Collier Johnson
>> DevynCJohnson@Gmail.com
> ----------
>
> I would not care too much about the performance
> of re.
>
> With the new Flexible String Representation, you
> can use a logarithmic scale to compare re results.
> To be honest, there is improvement if you are an
> ascii user.
>
> Am I the only one who tested this? Probably.
>
> jmf
| From | Joshua Landau <joshua@landau.ws> |
|---|---|
| Date | 2013-07-12 16:59 +0100 |
| Message-ID | <mailman.4643.1373644803.3114.python-list@python.org> |
| In reply to | #50510 |
On 12 July 2013 11:45, Devyn Collier Johnson <devyncjohnson@gmail.com> wrote:
> Could you explain what you mean? What and where is the new Flexible String
> Representation?

Do not worry. jmf is on about his old rant comparing broken previous versions of Python to newer ones which in some microbenchmarks are slower. I don't really get why he spends his time on it.

If you're interested, the gist of it is that strings now use a variable number of bytes per character to store their values, depending on whether values outside of the ASCII range and some other range are used, as an optimisation.
| From | Peter Otten <__peter__@web.de> |
|---|---|
| Date | 2013-07-12 18:15 +0200 |
| Message-ID | <mailman.4645.1373645725.3114.python-list@python.org> |
| In reply to | #50510 |
Joshua Landau wrote:
> On 12 July 2013 11:45, Devyn Collier Johnson <devyncjohnson@gmail.com>
> wrote:
>> Could you explain what you mean? What and where is the new Flexible
>> String Representation?
>
> Do not worry. jmf is on about his old rant comparing broken previous
> versions of Python to newer ones which in some microbenchmarks are
> slower. I don't really get why he spends his time on it.
>
> If you're interested, the gist of it is that strings now use a
> variable number of bytes per character to store their values, depending
> on whether values outside of the ASCII range and some other range are
> used, as an optimisation.

See also <http://www.python.org/dev/peps/pep-0393/>
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-07-13 02:21 +1000 |
| Message-ID | <mailman.4646.1373646092.3114.python-list@python.org> |
| In reply to | #50510 |
On Fri, Jul 12, 2013 at 8:45 PM, Devyn Collier Johnson
<devyncjohnson@gmail.com> wrote:
> Could you explain what you mean? What and where is the new Flexible String
> Representation?
(You're top-posting again. Please put your text underneath what you're
responding to - it helps maintain flow and structure.)
Python versions up to and including 3.2 came in two varieties: narrow
builds (commonly found on Windows) and wide builds (commonly found on
Linux). Narrow builds internally represented Unicode strings in
UTF-16, while wide builds used UTF-32. This is a problem, because it
means that taking a program from one to another actually changes its
behaviour:
Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> len(u"\U00012345")
1
Python 2.7.4 (default, Apr 6 2013, 19:54:46) [MSC v.1500 32 bit
(Intel)] on win32
>>> len(u"\U00012345")
2
In fact, the narrow builds are flat-out buggy, because you can put
something in as a single character that simply isn't a single
character. You can then pull that out as two characters and make a
huge mess of things:
>>> s=u"\U00012345"
>>> s[0]
u'\ud808'
>>> s[1]
u'\udf45'
*Any* string indexing will be broken if there is a single character
above U+FFFF ahead of it in the string.
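To make the surrogate-pair behaviour above concrete, here is a minimal sketch, assuming Python 3.3 or later (the sessions above are Python 2). The helper name `utf16_code_units` is made up for illustration. It encodes a string as UTF-16 and lists the 16-bit code units, which is roughly what a narrow build stored and indexed.

```python
import struct


def utf16_code_units(text):
    """Return the 16-bit code units a UTF-16 encoding of text would use."""
    data = text.encode("utf-16-le")  # little-endian, no byte-order mark
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


s = "\U00012345"
units = utf16_code_units(s)

# On Python 3.3+ the string holds one code point, but UTF-16 needs two
# code units (a surrogate pair) to represent it.
print(len(s))
print([hex(u) for u in units])
```

The two code units are the same values that `s[0]` and `s[1]` returned in the narrow-build session above.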
Now, this problem is not unique to Python. Heaps of other languages
have the same issue, the same buggy behaviour, the same compromises.
What's special about Python is that it actually managed to come back
from that problem. (Google's V8 JavaScript engine, for instance, is
stuck with it, because the ECMAScript specification demands UTF-16. I
asked on an ECMAScript list and was told "can't change that, it'd
break code". So it's staying buggy.)
There are a number of languages that take the Texan RAM-guzzling
approach of storing all strings in UTF-32; Python (since version 3.3)
is among a *very* small number of languages that store strings in
multiple different ways according to their content. That's described
in PEP 393 [1], titled "Flexible String Representation". It details a
means whereby a Python string will be represented in, effectively,
UTF-32 with some of the leading zero bytes elided. Or if you prefer,
in either Latin-1, UCS-2, or UCS-4, whichever's the smallest it can
fit in. The difference between a string stored one-byte-per-character
and a string stored four-bytes-per-character is almost invisible to a
Python script; you can find out by checking the string's memory usage,
but otherwise you don't need to worry about it.
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
32 bit (Intel)] on win32
>>> import sys
>>> sys.getsizeof("asdfasdfasdfasd")
40
>>> sys.getsizeof("asdfasdfasdfasdf")
41
Adding another character adds another 1 byte. (There's quite a bit of
overhead for small strings - GC headers and such - but it gets dwarfed
by the actual content after a while.)
>>> sys.getsizeof("\u1000sdfasdfasdfasd")
68
>>> sys.getsizeof("\u1000sdfasdfasdfasdf")
70
Two bytes to add another character.
>>> sys.getsizeof("\U00010001sdfasdfasdfasd")
100
>>> sys.getsizeof("\U00010001sdfasdfasdfasdf")
104
Four bytes. It uses only what it needs.
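The same measurement can be scripted rather than typed at the prompt. This is a rough sketch, assuming CPython 3.3 or later; `bytes_per_char` is a hypothetical helper, and the absolute sizes reported by `sys.getsizeof` vary by platform, so only the difference between two lengths is used.

```python
import sys


def bytes_per_char(sample, n=1000):
    """Estimate storage per character by comparing two string lengths.

    Subtracting the two sizes cancels the fixed per-object overhead.
    """
    small = sys.getsizeof(sample * n)
    large = sys.getsizeof(sample * (2 * n))
    return (large - small) // n


for label, ch in [("ASCII", "a"), ("BMP", "\u1000"), ("astral", "\U00010001")]:
    print(label, bytes_per_char(ch))
```

Working from differences avoids depending on the header size, which is not the same on 32-bit and 64-bit builds.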
Strings in Python are immutable, so there's no need to worry about
up-grading or down-grading a string; there are a few optimizations
that can't be done, but they're fairly trivial. Look, I'll pull a jmf
and find a microbenchmark that makes 3.3 look worse:
2.7.4:
>>> import timeit
>>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
[0.8175005482540385, 0.789617954237201, 0.8152240019332098]
>>> timeit.repeat('a=u"A"*100; a+=u"a"')
[0.8088905154146744, 0.8123691698246631, 0.8172558244134365]
3.3.0:
>>> import timeit
>>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
[0.9623714745976031, 0.970628669281723, 0.9696310564468149]
>>> timeit.repeat('a=u"A"*100; a+=u"a"')
[0.7017891938739922, 0.7024725209339522, 0.6989539173082449]
See? It's clearly worse on the newer Python! But actually, this is an
extremely unusual situation, and 3.3 outperforms 2.7 on the more
common case (where the two strings are of the same width).
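One plausible reading of the slower mixed-width case, offered as an assumption rather than a profile of the interpreter: appending a wider character means the result has to be stored at the wider width for every character, not just the new one. A small sketch for CPython 3.3 or later that shows the effect on memory:

```python
import sys

narrow = "A" * 100            # every character fits in one byte
widened = narrow + "\u1000"   # one character that needs two bytes

growth = sys.getsizeof(widened) - sys.getsizeof(narrow)

# The new character alone needs 2 bytes, but the whole string is now
# stored two bytes per character, so the growth is much larger.
print(growth)
```

When both operands already have the same width, no such widening is needed, which fits the same-width case being the faster one.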
Python's PEP 393 strings are following the same sort of model as the
native string type in a semantically-similar but
syntactically-different language, Pike. In Pike (also free software,
like Python), the string type can be indexed character by character,
and each character can be anything in the Unicode range; and just as
in Python 3.3, memory usage goes up by just one byte if every
character in the string fits inside 8 bits. So it's not as if this is
an untested notion; Pike has been running like this for years (I don't
know how long it's had this functionality, but it won't be more than
18 years as Unicode didn't have multiple planes until 1996), and
performance has been *just fine* for all that time. Pike tends to be
run on servers, so memory usage and computation speed translate fairly
directly into TPS. And there are some sizeable commercial entities
using and developing Pike, so if the flexible string representation
had turned out to be a flop, someone would have put in the coding time
to rewrite it by now.
And yet, despite all these excellent reasons for moving to this way of
doing strings, jmf still sees his microbenchmarks as more important,
and so he jumps in on threads like this to whine about how Python 3.3
is somehow US-centric because it more efficiently handles the entire
Unicode range. I'd really like to take some highlights from Python and
Pike and start recommending that other languages take up the ideas,
but to be honest, I hesitate to inflict jmf on them all. ECMAScript
may have the right idea after all - stay with UTF-16 and avoid
answering jmf's stupid objections every week.
[1] http://www.python.org/dev/peps/pep-0393/
ChrisA
| From | Devyn Collier Johnson <devyncjohnson@gmail.com> |
|---|---|
| Date | 2013-07-12 13:58 -0400 |
| Message-ID | <mailman.4656.1373657874.3114.python-list@python.org> |
| In reply to | #50510 |
On 07/12/2013 12:21 PM, Chris Angelico wrote:
> On Fri, Jul 12, 2013 at 8:45 PM, Devyn Collier Johnson
> <devyncjohnson@gmail.com> wrote:
>> Could you explain what you mean? What and where is the new Flexible String
>> Representation?
> (You're top-posting again. Please put your text underneath what you're
> responding to - it helps maintain flow and structure.)
>
> Python versions up to and including 3.2 came in two varieties: narrow
> builds (commonly found on Windows) and wide builds (commonly found on
> Linux). Narrow builds internally represented Unicode strings in
> UTF-16, while wide builds used UTF-32. This is a problem, because it
> means that taking a program from one to another actually changes its
> behaviour:
>
> Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
> [GCC 4.4.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> len(u"\U00012345")
> 1
>
> Python 2.7.4 (default, Apr 6 2013, 19:54:46) [MSC v.1500 32 bit
> (Intel)] on win32
> >>> len(u"\U00012345")
> 2
>
> In fact, the narrow builds are flat-out buggy, because you can put
> something in as a single character that simply isn't a single
> character. You can then pull that out as two characters and make a
> huge mess of things:
>
> >>> s=u"\U00012345"
> >>> s[0]
> u'\ud808'
> >>> s[1]
> u'\udf45'
>
> *Any* string indexing will be broken if there is a single character
> above U+FFFF ahead of it in the string.
>
> Now, this problem is not unique to Python. Heaps of other languages
> have the same issue, the same buggy behaviour, the same compromises.
> What's special about Python is that it actually managed to come back
> from that problem. (Google's V8 JavaScript engine, for instance, is
> stuck with it, because the ECMAScript specification demands UTF-16. I
> asked on an ECMAScript list and was told "can't change that, it'd
> break code". So it's staying buggy.)
>
> There are a number of languages that take the Texan RAM-guzzling
> approach of storing all strings in UTF-32; Python (since version 3.3)
> is among a *very* small number of languages that store strings in
> multiple different ways according to their content. That's described
> in PEP 393 [1], titled "Flexible String Representation". It details a
> means whereby a Python string will be represented in, effectively,
> UTF-32 with some of the leading zero bytes elided. Or if you prefer,
> in either Latin-1, UCS-2, or UCS-4, whichever's the smallest it can
> fit in. The difference between a string stored one-byte-per-character
> and a string stored four-bytes-per-character is almost invisible to a
> Python script; you can find out by checking the string's memory usage,
> but otherwise you don't need to worry about it.
>
> Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600
> 32 bit (Intel)] on win32
> >>> import sys
> >>> sys.getsizeof("asdfasdfasdfasd")
> 40
> >>> sys.getsizeof("asdfasdfasdfasdf")
> 41
>
> Adding another character adds another 1 byte. (There's quite a bit of
> overhead for small strings - GC headers and such - but it gets dwarfed
> by the actual content after a while.)
>
> >>> sys.getsizeof("\u1000sdfasdfasdfasd")
> 68
> >>> sys.getsizeof("\u1000sdfasdfasdfasdf")
> 70
>
> Two bytes to add another character.
>
> >>> sys.getsizeof("\U00010001sdfasdfasdfasd")
> 100
> >>> sys.getsizeof("\U00010001sdfasdfasdfasdf")
> 104
>
> Four bytes. It uses only what it needs.
>
> Strings in Python are immutable, so there's no need to worry about
> up-grading or down-grading a string; there are a few optimizations
> that can't be done, but they're fairly trivial. Look, I'll pull a jmf
> and find a microbenchmark that makes 3.3 look worse:
>
> 2.7.4:
> >>> import timeit
> >>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
> [0.8175005482540385, 0.789617954237201, 0.8152240019332098]
> >>> timeit.repeat('a=u"A"*100; a+=u"a"')
> [0.8088905154146744, 0.8123691698246631, 0.8172558244134365]
>
> 3.3.0:
> >>> import timeit
> >>> timeit.repeat('a=u"A"*100; a+=u"\u1000"')
> [0.9623714745976031, 0.970628669281723, 0.9696310564468149]
> >>> timeit.repeat('a=u"A"*100; a+=u"a"')
> [0.7017891938739922, 0.7024725209339522, 0.6989539173082449]
>
> See? It's clearly worse on the newer Python! But actually, this is an
> extremely unusual situation, and 3.3 outperforms 2.7 on the more
> common case (where the two strings are of the same width).
>
> Python's PEP 393 strings are following the same sort of model as the
> native string type in a semantically-similar but
> syntactically-different language, Pike. In Pike (also free software,
> like Python), the string type can be indexed character by character,
> and each character can be anything in the Unicode range; and just as
> in Python 3.3, memory usage goes up by just one byte if every
> character in the string fits inside 8 bits. So it's not as if this is
> an untested notion; Pike has been running like this for years (I don't
> know how long it's had this functionality, but it won't be more than
> 18 years as Unicode didn't have multiple planes until 1996), and
> performance has been *just fine* for all that time. Pike tends to be
> run on servers, so memory usage and computation speed translate fairly
> directly into TPS. And there are some sizeable commercial entities
> using and developing Pike, so if the flexible string representation
> had turned out to be a flop, someone would have put in the coding time
> to rewrite it by now.
>
> And yet, despite all these excellent reasons for moving to this way of
> doing strings, jmf still sees his microbenchmarks as more important,
> and so he jumps in on threads like this to whine about how Python 3.3
> is somehow US-centric because it more efficiently handles the entire
> Unicode range. I'd really like to take some highlights from Python and
> Pike and start recommending that other languages take up the ideas,
> but to be honest, I hesitate to inflict jmf on them all. ECMAScript
> may have the right idea after all - stay with UTF-16 and avoid
> answering jmf's stupid objections every week.
>
> [1] http://www.python.org/dev/peps/pep-0393/
>
> ChrisA
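A rough sketch of the quoted point about string widths, for anyone who wants to see it directly. It assumes CPython 3.3 or later, where PEP 393 applies; the helper name `bytes_per_char` is made up for this illustration, and `sys.getsizeof` reports implementation-specific numbers, so treat the output as indicative only.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character by comparing two strings of different length.

    Subtracting the sizes cancels the fixed object header, leaving only the
    cost of the extra n characters.
    """
    short = ch * n
    longer = ch * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(short)) / n


samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),
    ("BMP", "\u1000"),
    ("astral", "\U0001F600"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch))

# Appending one wide character to a narrow string forces the whole result
# to be stored at the wider width, which is the unusual case benchmarked above.
narrow = "A" * 100
widened = narrow + "\u1000"
print(sys.getsizeof(narrow), sys.getsizeof(widened))
```

On a PEP 393 build the per-character figures should come out as one byte for the first two rows, two for the BMP row and four for the astral row, which is the "memory goes up by one byte per character" behaviour described in the quote.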
Thanks for the thorough response. I learned a lot. You should write
articles on Python.
I plan to spend some time optimizing the re.py module for Unix systems.
I would love to amp up my programs that use that module.
Devyn Collier Johnson
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-07-13 05:37 +0000 |
| Message-ID | <51e0e7aa$0$9505$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #50552 |
On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:
> I plan to spend some time optimizing the re.py module for Unix systems.
> I would love to amp up my programs that use that module.
In my experience, often the best way to optimize a regex is to not use it
at all.
[steve@ando ~]$ python -m timeit -s "import re" \
> -s "data = 'a'*100+'b'" \
> "if re.search('b', data): pass"
100000 loops, best of 3: 2.77 usec per loop
[steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
> "if 'b' in data: pass"
1000000 loops, best of 3: 0.219 usec per loop
In Python, we often use plain string operations instead of regex-based
solutions for basic tasks. Regexes are a 10lb sledge hammer. Don't use
them for cracking peanuts.
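The same comparison can be run from a script rather than the shell. This is only a sketch: the function names are invented for the example, a precompiled pattern is added as a third case that the command lines above do not measure, and the absolute numbers will differ from machine to machine.

```python
import re
import timeit

data = "a" * 100 + "b"
pattern = re.compile("b")  # compiled once, outside the timed code


def with_regex():
    # Module-level call: looks the pattern up in re's internal cache each time.
    return re.search("b", data) is not None


def with_compiled():
    # Skips the cache lookup but still runs the regex engine.
    return pattern.search(data) is not None


def with_in():
    # Plain substring test, no regex machinery at all.
    return "b" in data


LOOPS = 100000
results = {}
for func in (with_regex, with_compiled, with_in):
    best = min(timeit.repeat(func, number=LOOPS, repeat=3))
    results[func.__name__] = best / LOOPS * 1e6  # microseconds per call
    print("%s: %.3f usec per loop" % (func.__name__, results[func.__name__]))
```

All three functions return the same answer for this input, so whichever is fastest on your system is a drop-in replacement for the others in this particular case.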
--
Steven
| From | 88888 Dihedral <dihedral88888@gmail.com> |
|---|---|
| Date | 2013-07-14 11:17 -0700 |
| Message-ID | <9a97b618-3e80-4149-9155-14fb210a0758@googlegroups.com> |
| In reply to | #50571 |
On Saturday, July 13, 2013 1:37:46 PM UTC+8, Steven D'Aprano wrote:
> On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:
>
> > I plan to spend some time optimizing the re.py module for Unix systems.
> > I would love to amp up my programs that use that module.
>
> In my experience, often the best way to optimize a regex is to not use it
> at all.
>
> [steve@ando ~]$ python -m timeit -s "import re" \
> > -s "data = 'a'*100+'b'" \
> > "if re.search('b', data): pass"
> 100000 loops, best of 3: 2.77 usec per loop
>
> [steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
> > "if 'b' in data: pass"
> 1000000 loops, best of 3: 0.219 usec per loop
>
> In Python, we often use plain string operations instead of regex-based
> solutions for basic tasks. Regexes are a 10lb sledge hammer. Don't use
> them for cracking peanuts.
>
> --
> Steven
OK, lets talk about the indexed search algorithms of
a character streamor strig which can be buffered and
indexed randomly for RW operations but faster in sequential
block RW operations after some pre-processing.
This was solved long time ago in the suffix array or
suffix tree part and summarized in the famous BWT paper in 199X.
Do we want volunteers to speed up
search operations in the string module in Python?
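For readers unfamiliar with the suffix array idea mentioned here, a minimal sketch follows. It is not how Python's own string search works and is not a proposal for the standard library; the function names are hypothetical, and the construction is the naive sort-all-suffixes version rather than one of the efficient algorithms.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted suffix order.

    Naive construction: sorting slices costs far more than the specialised
    algorithms, but it keeps the idea visible.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_all(text, suffix_array, pattern):
    """Return every start position of pattern in text, in ascending order.

    Because the suffixes are sorted, all suffixes that begin with pattern
    sit next to each other, so two binary searches locate the whole range.
    """
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(suffix_array[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)
print(find_all(text, sa, "ana"))
```

The pre-processing only pays off when the same text is searched many times; for a one-off lookup, `pattern in text` or `str.find` avoids building the index at all.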
| From | Devyn Collier Johnson <devyncjohnson@gmail.com> |
|---|---|
| Date | 2013-07-15 06:06 -0400 |
| Message-ID | <mailman.4710.1373882772.3114.python-list@python.org> |
| In reply to | #50647 |
On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
> On Saturday, July 13, 2013 1:37:46 PM UTC+8, Steven D'Aprano wrote:
>> On Fri, 12 Jul 2013 13:58:29 -0400, Devyn Collier Johnson wrote:
>>
>>> I plan to spend some time optimizing the re.py module for Unix systems.
>>> I would love to amp up my programs that use that module.
>>
>> In my experience, often the best way to optimize a regex is to not use it
>> at all.
>>
>> [steve@ando ~]$ python -m timeit -s "import re" \
>> > -s "data = 'a'*100+'b'" \
>> > "if re.search('b', data): pass"
>> 100000 loops, best of 3: 2.77 usec per loop
>>
>> [steve@ando ~]$ python -m timeit -s "data = 'a'*100+'b'" \
>> > "if 'b' in data: pass"
>> 1000000 loops, best of 3: 0.219 usec per loop
>>
>> In Python, we often use plain string operations instead of regex-based
>> solutions for basic tasks. Regexes are a 10lb sledge hammer. Don't use
>> them for cracking peanuts.
>>
>> --
>> Steven
> OK, lets talk about the indexed search algorithms of
> a character streamor strig which can be buffered and
> indexed randomly for RW operations but faster in sequential
> block RW operations after some pre-processing.
>
> This was solved long time ago in the suffix array or
> suffix tree part and summarized in the famous BWT paper in 199X.
>
> Do we want volunteers to speed up
> search operations in the string module in Python?
It would be nice if someone could speed it up.
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-07-15 12:36 +0000 |
| Message-ID | <51e3ecbd$0$9505$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #50666 |
On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
[...]
>> Do we want volunteers to speed up
>> search operations in the string module in Python?
>
> It would be nice if someone could speed it up.
Devyn,
88888 Dihedral is our resident bot, not a human being. Nobody knows who
controls it, and why they are running it, but we are pretty certain that
it is a bot responding mechanically to keywords in people's posts.
It's a very clever bot, but still a bot. About one post in four is
meaningless jargon, the other three are relevant enough to fool people
into thinking that maybe it is a human being. It had me fooled for a long
time.
--
Steven
| From | Devyn Collier Johnson <devyncjohnson@gmail.com> |
|---|---|
| Date | 2013-07-15 08:52 -0400 |
| Subject | Dihedral |
| Message-ID | <mailman.4721.1373892779.3114.python-list@python.org> |
| In reply to | #50675 |
On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
> On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
>
>> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
> [...]
>>> Do we want volunteers to speed up
>>> search operations in the string module in Python?
>> It would be nice if someone could speed it up.
> Devyn,
>
> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
> controls it, and why they are running it, but we are pretty certain that
> it is a bot responding mechanically to keywords in people's posts.
>
> It's a very clever bot, but still a bot. About one post in four is
> meaningless jargon, the other three are relevant enough to fool people
> into thinking that maybe it is a human being. It had me fooled for a long
> time.
>
Wow! Our mailing list has a pet bot. I bet other mailing lists are so jealous
of us. Who ever created Dihedral is a genius!
Artificial Intelligence developers put chatbots on mailing lists so that the
program can learn. I use Python3 to program AI applications. If you see my
Launchpad account, you will see my two AI projects - Neobot and Novabot.
(https://launchpad.net/neobot Neo and Nova are still unstable) AI developers
let their bots loose on the Internet to learn from people. Dihedral is
learning from us. Dihedral only responses when it feels it has sufficient
knowledge on the topic. Chatbots want to appear human. That is their goal. We
should feel honored that Dihedral's botmaster feels that this mailinglist
would benefit the development of Dihedral's knowledge.
Devyn Collier Johnson
| From | Joel Goldstick <joel.goldstick@gmail.com> |
|---|---|
| Date | 2013-07-15 09:03 -0400 |
| Subject | Re: Dihedral |
| Message-ID | <mailman.4728.1373893408.3114.python-list@python.org> |
| In reply to | #50675 |
On Mon, Jul 15, 2013 at 8:52 AM, Devyn Collier Johnson <devyncjohnson@gmail.com> wrote:
> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>> On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
>>> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
>> [...]
>>>> Do we want volunteers to speed up
>>>> search operations in the string module in Python?
>>> It would be nice if someone could speed it up.
>> Devyn,
>>
>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>> controls it, and why they are running it, but we are pretty certain that
>> it is a bot responding mechanically to keywords in people's posts.
>>
>> It's a very clever bot, but still a bot. About one post in four is
>> meaningless jargon, the other three are relevant enough to fool people
>> into thinking that maybe it is a human being. It had me fooled for a long
>> time.
>
> Wow! Our mailing list has a pet bot. I bet other mailing lists are so
> jealous of us. Who ever created Dihedral is a genius!
>
> Artificial Intelligence developers put chatbots on mailing lists so that
> the program can learn. I use Python3 to program AI applications. If you see
> my Launchpad account, you will see my two AI projects - Neobot and Novabot.
> (https://launchpad.net/neobot Neo and Nova are still unstable) AI
> developers let their bots loose on the Internet to learn from people.
> Dihedral is learning from us. Dihedral only responses when it feels it has
> sufficient knowledge on the topic. Chatbots want to appear human. That is
> their goal. We should feel honored that Dihedral's botmaster feels that
> this mailinglist would benefit the development of Dihedral's knowledge.
>
> Devyn Collier Johnson
I particularly enjoy the misspellings, that seem to be such a human quality
on email messages!
--
Joel Goldstick
http://joelgoldstick.com
| From | Wayne Werner <wayne@waynewerner.com> |
|---|---|
| Date | 2013-07-15 17:43 -0500 |
| Subject | Re: Dihedral |
| Message-ID | <mailman.4747.1373928202.3114.python-list@python.org> |
| In reply to | #50675 |
On Mon, 15 Jul 2013, Devyn Collier Johnson wrote:
>
> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>> On Mon, 15 Jul 2013 06:06:06 -0400, Devyn Collier Johnson wrote:
>>
>>> On 07/14/2013 02:17 PM, 88888 Dihedral wrote:
>> [...]
>>>> Do we want volunteers to speed up
>>>> search operations in the string module in Python?
>>> It would be nice if someone could speed it up.
>> Devyn,
>>
>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>> controls it, and why they are running it, but we are pretty certain that
>> it is a bot responding mechanically to keywords in people's posts.
>>
>> It's a very clever bot, but still a bot. About one post in four is
>> meaningless jargon, the other three are relevant enough to fool people
>> into thinking that maybe it is a human being. It had me fooled for a long
>> time.
>>
> Wow! Our mailing list has a pet bot. I bet other mailing lists are so jealous
> of us. Who ever created Dihedral is a genius!
>
> Artificial Intelligence developers put chatbots on mailing lists so that the
> program can learn. I use Python3 to program AI applications. If you see my
> Launchpad account, you will see my two AI projects - Neobot and Novabot.
> (https://launchpad.net/neobot Neo and Nova are still unstable) AI developers
> let their bots loose on the Internet to learn from people. Dihedral is
> learning from us. Dihedral only responses when it feels it has sufficient
> knowledge on the topic. Chatbots want to appear human. That is their goal. We
> should feel honored that Dihedral's botmaster feels that this mailinglist
> would benefit the development of Dihedral's knowledge.
Are *you* a bot? ~_^
That post felt surprisingly like Dihedral...
-W
| From | Fábio Santos <fabiosantosart@gmail.com> |
|---|---|
| Date | 2013-07-15 23:54 +0100 |
| Subject | Re: Dihedral |
| Message-ID | <mailman.4749.1373928846.3114.python-list@python.org> |
| In reply to | #50675 |
> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>>
>> Devyn,
>>
>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>> controls it, and why they are running it, but we are pretty certain that
>> it is a bot responding mechanically to keywords in people's posts.
>>
>> It's a very clever bot, but still a bot. About one post in four is
>> meaningless jargon, the other three are relevant enough to fool people
>> into thinking that maybe it is a human being. It had me fooled for a long
>> time.
>>
Does this mean he passes the Turing test?
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-07-16 08:59 +1000 |
| Subject | Re: Dihedral |
| Message-ID | <mailman.4750.1373929147.3114.python-list@python.org> |
| In reply to | #50675 |
On Tue, Jul 16, 2013 at 8:54 AM, Fábio Santos <fabiosantosart@gmail.com> wrote:
>
>> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>>>
>>> Devyn,
>>>
>>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>>> controls it, and why they are running it, but we are pretty certain that
>>> it is a bot responding mechanically to keywords in people's posts.
>>>
>>> It's a very clever bot, but still a bot. About one post in four is
>>> meaningless jargon, the other three are relevant enough to fool people
>>> into thinking that maybe it is a human being. It had me fooled for a long
>>> time.
>>>
>
> Does this mean he passes the Turing test?
Yes, absolutely. The original Turing test was defined in terms of five
minutes of analysis, and Dihedral and jmf have clearly been
indistinguishably human across that approximate period.
ChrisA
| From | Tim Delaney <timothy.c.delaney@gmail.com> |
|---|---|
| Date | 2013-07-16 16:06 +1000 |
| Subject | Re: Dihedral |
| Message-ID | <mailman.4756.1373954816.3114.python-list@python.org> |
| In reply to | #50675 |
On 16 July 2013 08:59, Chris Angelico <rosuav@gmail.com> wrote:
> On Tue, Jul 16, 2013 at 8:54 AM, Fábio Santos <fabiosantosart@gmail.com> wrote:
>>
>>> On 07/15/2013 08:36 AM, Steven D'Aprano wrote:
>>>>
>>>> Devyn,
>>>>
>>>> 88888 Dihedral is our resident bot, not a human being. Nobody knows who
>>>> controls it, and why they are running it, but we are pretty certain that
>>>> it is a bot responding mechanically to keywords in people's posts.
>>>>
>>>> It's a very clever bot, but still a bot. About one post in four is
>>>> meaningless jargon, the other three are relevant enough to fool people
>>>> into thinking that maybe it is a human being. It had me fooled for a long
>>>> time.
>>>>
>>
>> Does this mean he passes the Turing test?
>
> Yes, absolutely. The original Turing test was defined in terms of five
> minutes of analysis, and Dihedral and jmf have clearly been
> indistinguishably human across that approximate period.
The big difference between them is that the jmfbot does not appear to
evolve its routines in response to external sources - it seems to be stuck
in a closed feedback loop.
Tim Delaney