| Path | csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!ecngs!feeder2.ecngs.de!newsfeed.freenet.ag!news2.euro.net!newsgate.cistron.nl!newsgate.news.xs4all.nl!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <rosuav@gmail.com> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| MIME-Version | 1.0 |
| X-Received | by 10.58.253.161 with SMTP id ab1mr129028ved.55.1363220351863; Wed, 13 Mar 2013 17:19:11 -0700 (PDT) |
| In-Reply-To | <2992273.neLn1eVAPo@PointedEars.de> |
| References | <23a42297-9262-4ace-87ad-138999b1ddd6@z3g2000vbg.googlegroups.com> <a1a6394a-e9c7-407b-9f6d-ff44de1b65de@y2g2000pbg.googlegroups.com> <eabe27a9-099a-4e2c-92fb-bdf3819c2561@kw7g2000pbb.googlegroups.com> <mailman.3259.1363172350.2939.python-list@python.org> <2992273.neLn1eVAPo@PointedEars.de> |
| Date | Thu, 14 Mar 2013 11:19:11 +1100 |
| Subject | Re: String performance regression from python 3.2 to 3.3 |
| From | Chris Angelico <rosuav@gmail.com> |
| To | python-list@python.org |
| Content-Type | text/plain; charset=windows-1252 |
| Content-Transfer-Encoding | quoted-printable |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <http://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <http://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3278.1363220353.2939.python-list@python.org> (permalink) |
| Lines | 97 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1363220353 news.xs4all.nl 6939 [2001:888:2000:d::a6]:42229 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:41201 |
On Thu, Mar 14, 2013 at 4:42 AM, Thomas 'PointedEars' Lahn
<PointedEars@web.de> wrote:
> Chris Angelico wrote:
>
>> On Wed, Mar 13, 2013 at 9:11 PM, rusi <rustompmody@gmail.com> wrote:
>>> Uhhh..
>>> Making the subject line useful for all readers
>>
>> I should have read this one before replying in the other thread.
>>
>> jmf, I'd like to see evidence that there has been a performance
>> regression compared against a wide build of Python 3.2. You still have
>> never answered this fundamental, that the narrow builds of Python are
>> *BUGGY* in the same way that JavaScript/ECMAScript is.
>
> Interesting. From my work I was under the impression that I knew ECMAScript
> and its implementations fairly well, yet I have never heard of this before.
>
> What do you mean by “narrow build” and “wide build” and what exactly is the
> bug “narrow builds” of Python 3.2 have in common with JavaScript/ECMAScript?
> To which implementation of ECMAScript are you referring – or are you
> referring to the Specification as such?
The ECMAScript spec defines a string as a sequence of 16-bit code
units, which in practice means UTF-16. Python versions up to 3.2 came
in two varieties: narrow (16-bit code units), which included (I
believe) the Windows builds available on python.org, and wide (32-bit
code units), which was (again, I think) the default Linux config. The
problem predates Python 3 and its default string being Unicode - the
Py2 unicode type has the same issue:
Python 2.6.5 (r265:79096, Mar 19 2010, 21:48:26) [MSC v.1500 32 bit (Intel)] on win32
>>> u"\U00012345"
u'\U00012345'
>>> len(_)
2
Python 2.6.6 (r266:84292, Sep 15 2010, 15:52:39)
[GCC 4.4.5] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> u"\U00012345"
u'\U00012345'
>>> len(_)
1
The first is from the Python MSI installer on Windows, the second from
the default system Python on Ubuntu 10.10. The exact same code does
different things on different platforms, and on the Windows (narrow)
build, it's possible to split a surrogate pair:
>>> u"\U00012345"[0]
u'\ud808'
>>> u"\U00012345"[1]
u'\udf45'
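
The two halves shown above follow from the standard UTF-16 surrogate
arithmetic. A minimal sketch for Python 3; the helper name
to_surrogates is made up for illustration and is not part of the
standard library:

```python
def to_surrogates(code_point):
    """Split an astral code point (above U+FFFF) into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not an astral code point: %#x" % code_point)
    offset = code_point - 0x10000      # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits go into the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits go into the low surrogate
    return high, low


high, low = to_surrogates(0x12345)
print(hex(high), hex(low))  # 0xd808 0xdf45
```

That matches the u'\ud808' and u'\udf45' the narrow build hands back
for indexes 0 and 1.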
You can see the same thing in JavaScript too. Here's a little demo I
just knocked together:
<script>
function foo()
{
    var txt = document.getElementById("in").value;
    var msg = "";
    // One output line per UTF-16 code unit: index, decimal value, hex value.
    for (var i = 0; i < txt.length; ++i)
        msg += "[" + i + "]: " + txt.charCodeAt(i) + " " + txt.charCodeAt(i).toString(16) + "\n";
    document.getElementById("out").value = msg;
}
</script>
<input id="in"><input type="button" onclick="foo()" value="Show"><br>
<textarea id="out" rows="25" cols="80"></textarea>
Give it an ASCII string and you'll see, as expected, one index (based
on string indexing or charCodeAt, same thing) for each character. Same
if it's all BMP. But put an astral character in and you'll see
00.00.d8.00/24 (oh wait, CIDR notation doesn't work in Unicode) come
up. I raised this issue on the Google V8 list and on the ECMAScript
list es-discuss@mozilla.org, and was basically told that since
JavaScript has been buggy for so long, there's no chance of ever
making it bug-free:
https://mail.mozilla.org/pipermail/es-discuss/2012-December/027384.html
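
For anyone without a browser handy, the per-index output of the demo
can be approximated in Python 3 by encoding to UTF-16 and unpacking
the 16-bit units. This is only a sketch of what charCodeAt reports,
not JavaScript itself; utf16_units and show are hypothetical helper
names:

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-le")  # two bytes per code unit, no BOM
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def show(text):
    """Format one line per code unit: index, decimal value, hex value."""
    return "".join(
        "[%d]: %d %x\n" % (i, unit, unit)
        for i, unit in enumerate(utf16_units(text))
    )


# One BMP character followed by one astral character: three code units.
print(show("a\U00012345"))
```

The astral character occupies two indexes, both in the surrogate
range, which is what the JavaScript demo shows as well.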
Fortunately for Python, there are version numbers, and policies that
permit bugs to actually get fixed. (Which is why, for instance, Debian
Squeeze still ships Python 2.6 rather than upgrading to 2.7 - in case
some script is broken by that change. Can't do that with web
browsers.) As of Python 3.3, all Pythons function the same way:
strings are semantically a "wide build" (indexed by code point, as in
UTF-32), but with a memory usage optimization (PEP 393). That's how it
needs to be.
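
A quick way to check this on your own interpreter. This is a sketch
that assumes CPython 3.3 or later; sys.getsizeof reports
implementation-specific sizes, so only the ordering matters, not the
exact numbers:

```python
import sys

astral = "\U00012345"

# On 3.3+ there is no narrow/wide distinction any more.
print(sys.maxunicode == 0x10FFFF)
# One code point is one index, on every platform.
print(len(astral))
# The same character still needs two code units when encoded as UTF-16.
print(len(astral.encode("utf-16-le")) // 2)

# The memory optimization: storage per character depends on the widest
# character in the string (1, 2 or 4 bytes each in CPython).
sizes = [
    sys.getsizeof(ch * 1000)
    for ch in ("a", "\u20ac", "\U00012345")
]
print(sizes)
```

The three sizes should come out in increasing order, while len() and
indexing behave identically for all three strings.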
ChrisA
A reply for rusi (FSR) jmfauth <wxjmfauth@gmail.com> - 2013-03-13 02:36 -0700
Re: A reply for rusi (FSR) rusi <rustompmody@gmail.com> - 2013-03-13 03:07 -0700
String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-13 03:11 -0700
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-13 21:59 +1100
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-13 09:49 -0700
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-14 10:43 +1100
Re: String performance regression from python 3.2 to 3.3 MRAB <python@mrabarnett.plus.com> - 2013-03-14 00:52 +0000
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-14 11:55 +1100
Re: String performance regression from python 3.2 to 3.3 MRAB <python@mrabarnett.plus.com> - 2013-03-14 02:01 +0000
Re: String performance regression from python 3.2 to 3.3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-14 04:05 +0000
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-14 17:47 +1100
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-14 03:48 -0700
Re: String performance regression from python 3.2 to 3.3 Terry Reedy <tjreedy@udel.edu> - 2013-03-14 19:14 -0400
Re: String performance regression from python 3.2 to 3.3 Terry Reedy <tjreedy@udel.edu> - 2013-03-14 20:48 -0400
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-15 10:07 -0700
RE: String performance regression from python 3.2 to 3.3 Andriy Kornatskyy <andriy.kornatskyy@live.com> - 2013-03-15 21:04 +0300
Re: String performance regression from python 3.2 to 3.3 Terry Reedy <tjreedy@udel.edu> - 2013-03-13 22:35 -0400
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-14 17:21 +1100
Re: String performance regression from python 3.2 to 3.3 Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2013-03-13 18:42 +0100
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-14 11:19 +1100
Re: String performance regression from python 3.2 to 3.3 Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2013-03-16 03:44 +0100
Re: String performance regression from python 3.2 to 3.3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-16 03:56 +0000
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-15 21:26 -0700
Re: String performance regression from python 3.2 to 3.3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-16 08:47 +0000
Re: String performance regression from python 3.2 to 3.3 Neil Hodgson <nhodgson@iinet.net.au> - 2013-03-17 09:00 +1100
Re: String performance regression from python 3.2 to 3.3 Roy Smith <roy@panix.com> - 2013-03-16 18:10 -0400
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-16 14:59 +1100
Re: String performance regression from python 3.2 to 3.3 Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2013-03-16 05:12 +0100
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-16 15:20 +1100
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-15 22:21 -0700
Re: String performance regression from python 3.2 to 3.3 Chris Angelico <rosuav@gmail.com> - 2013-03-16 15:09 +1100
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-15 21:35 -0700
Re: String performance regression from python 3.2 to 3.3 Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-03-16 04:56 +0000
Re: String performance regression from python 3.2 to 3.3 Terry Reedy <tjreedy@udel.edu> - 2013-03-16 01:05 -0400
Re: String performance regression from python 3.2 to 3.3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-16 05:38 +0000
Re: String performance regression from python 3.2 to 3.3 Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-03-16 05:25 +0000
Re: String performance regression from python 3.2 to 3.3 Roy Smith <roy@panix.com> - 2013-03-16 09:29 -0400
Re: String performance regression from python 3.2 to 3.3 rusi <rustompmody@gmail.com> - 2013-03-16 09:39 -0700
Re: String performance regression from python 3.2 to 3.3 Roy Smith <roy@panix.com> - 2013-03-16 14:00 -0400
Re: String performance regression from python 3.2 to 3.3 jmfauth <wxjmfauth@gmail.com> - 2013-03-16 13:42 -0700
Re: A reply for rusi (FSR) Chris Angelico <rosuav@gmail.com> - 2013-03-13 21:32 +1100