From: Tim Delaney
To: Python-List
Newsgroups: comp.lang.python
Subject: Re: String manipulation in python..NEED HELP!!!!
Date: Wed, 12 Dec 2012 08:34:46 +1100

On 12 December 2012 07:52, Ross Ridge wrote:

> John Gordon wrote:
> > def encode(plain):
> >     '''Return a substituted version of the plain text.'''
> >     encoded = ''
> >     for ch in plain:
> >         encoded += key[alpha.index(ch)]
> >     return encoded
>
> Terry Reedy wrote:
> >This turns an O(n) problem into a slow O(n*n) solution. Much better to
> >build a list of chars and then join them.
>
> There have been much better suggestions in this thread, but John Gordon's
> code above is faster than the equivalent list and join implementation
> with Python 2.6 and Python 3.1 (the newest versions I have handy).
> CPython optimized this case of string concatenation into O(n) back in
> Python 2.4.

From "What's New in Python 2.4":
http://docs.python.org/release/2.4.4/whatsnew/node12.html#SECTION0001210000000000000000

String concatenations in statements of the form s = s + "abc" and
s += "abc" are now performed more efficiently *in certain circumstances*.
This optimization *won't be present in other Python implementations such
as Jython*, so you shouldn't rely on it; using the join() method of
strings is still recommended when you want to efficiently glue a large
number of strings together.

Emphasis mine.

The optimisation was added to improve the situation for programs that
were already using the anti-pattern of string concatenation, not to
encourage people to use it.

As a real-world case, a bug was recently found in Mercurial where an
operation on Windows was taking orders of magnitude longer than on Linux
due to use of string concatenation rather than the join idiom (from ~12
seconds spent on string concatenation to effectively zero).

Tim Delaney
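
For reference, the list-and-join form that Terry Reedy and the What's New
note recommend might look like the sketch below. The values of alpha and
key are hypothetical (a ROT13 table chosen only for illustration; the
thread's actual definitions are not shown in this message), and
encode_concat and encode_join are names invented here to put the two forms
side by side.

```python
import string

# Hypothetical substitution alphabet and key; the thread's actual values
# are not shown in this post.
alpha = string.ascii_lowercase
key = alpha[13:] + alpha[:13]  # ROT13, purely for illustration


def encode_concat(plain):
    '''Return a substituted version of the plain text (repeated += form).'''
    encoded = ''
    for ch in plain:
        encoded += key[alpha.index(ch)]
    return encoded


def encode_join(plain):
    '''Return a substituted version of the plain text (list-and-join form).'''
    chars = []
    for ch in plain:
        # Collect each substituted character; nothing is copied yet.
        chars.append(key[alpha.index(ch)])
    # Build the final string once, in a single pass over the list.
    return ''.join(chars)


print(encode_join('hello'))  # prints: uryyb
```

Both functions return the same string; they differ only in how the result
is assembled. The join form does not depend on the CPython-specific
optimization quoted above, so its cost should stay proportional to the
input length on other implementations as well.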