From: Tim Delaney
To: Python-List
Newsgroups: comp.lang.python
Subject: Re: "monty" < "python"
Date: Thu, 21 Mar 2013 08:02:53 +1100
In-Reply-To: <987098b6-d79c-4597-b656-9b3e983740e8@z3g2000vbg.googlegroups.com>

On 21 March 2013 06:40, jmfauth <wxjmfauth@gmail.com> wrote:
> ----
> [snip usual rant from jmf]

Franz, please pay no attention to jmf. He has become obsessed with a single small performance regression in Python 3.3 string handling, one that affects a very narrow domain and rarely shows up in practice (although, as he has demonstrated, it is easy to create a microbenchmark that makes it appear much worse than it is).

The regression is a consequence of the decision in Python 3.3 to *correctly* support the full range of Unicode characters whilst also reducing the required memory where possible. In the vast majority of cases this is a performance *improvement*. It is only "optimised for the ascii user" in the sense that the pre-existing ASCII characters have the lowest Unicode code points, so they need only 1 byte per code point and can be stored in less memory than most other Unicode code points. The possible character widths are 1, 2 and 4 bytes, and the width used for a string is set by its widest character.
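
The three widths can be observed from Python itself. The following is only a rough sketch, assuming CPython 3.3 or later (`sys.getsizeof` reports implementation-specific sizes, so other interpreters may differ). The helper name `bytes_per_char` and the sample characters are made up for illustration.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate the storage used per character in a string made only of `ch`.

    Hypothetical helper: comparing two lengths cancels out the fixed
    per-object overhead, leaving only the per-character cost.
    """
    short = ch * n
    longer = ch * (2 * n)
    return (sys.getsizeof(longer) - sys.getsizeof(short)) // n


# Sample characters chosen for illustration (written as escapes so the
# snippet does not depend on the encoding of the source file).
samples = {
    "ASCII U+0061": "a",
    "Latin-1 U+00E9": "\u00e9",
    "BMP U+20AC": "\u20ac",
    "astral U+1D11E": "\U0001d11e",
}

widths = {label: bytes_per_char(ch) for label, ch in samples.items()}
for label, width in widths.items():
    print(label, width)
```

On CPython 3.3+ this should report 1, 1, 2 and 4 bytes respectively. Note that non-ASCII characters below U+0100 also fit in 1 byte.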

The actual regression occurs when concatenating, replacing, etc. a character into a string when that character is wider than every character currently in the string. In this situation the new string needs to be widened (the number of bytes used by every character increases), which is a much more expensive operation than simply creating a new string at the existing width (which is what happens if the character is the same size or smaller).
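
A minimal sketch of the widening case, again assuming CPython 3.3 or later. The variable names and the helper `time_replace` are hypothetical, and the timings depend heavily on the Python version and machine, so no expected numbers are given for them.

```python
import sys
import timeit

base = "a" * 1000                # all ASCII: 1 byte per character
same = base + "b"                # new character fits the existing width
wider = base + "\u20ac"          # U+20AC needs 2 bytes, so every character is widened
widest = base + "\U0001d11e"     # U+1D11E needs 4 bytes per character

for label, s in [("same", same), ("wider", wider), ("widest", widest)]:
    # All three strings have the same length but very different sizes.
    print(label, len(s), sys.getsizeof(s))


def time_replace(new_char, number=2000):
    """Time replacing every 'a' in a 1000-character ASCII string with new_char."""
    s = "a" * 1000
    return timeit.timeit(lambda: s.replace("a", new_char), number=number)


# The kind of microbenchmark at issue: same-width vs. widening replacement.
print("replace with 'b':    ", time_replace("b"))
print("replace with U+20AC: ", time_replace("\u20ac"))
```

The size output shows the effect directly: one wide character roughly doubles or quadruples the storage of the whole string, even though the length is unchanged.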

It has been acknowledged as a real regression, but he keeps hijacking every thread where strings are mentioned to harp on about it. He has shown no inclination to attempt to *fix* the regression and is rapidly coming to be regarded as a troll by most participants in this list.

Tim Delaney