From: Devyn Collier Johnson
Date: Sat, 20 Jul 2013 08:41:44 -0400
To: Python Mailing List
Subject: Re: Find and Replace Simplification
Newsgroups: comp.lang.python

On 07/20/2013 07:16 AM, Joshua Landau wrote:
> On 19 July 2013 18:29, Serhiy Storchaka wrote:
>> 19.07.13 19:22, Steven D'Aprano wrote:
>>
>>> I also expect that the string replace() method will be second fastest,
>>> and re.sub will be the slowest, by a very long way.
>>
>> The string replace() method is fastest (at least in Python 3.3+). See
>> implementation of html.escape() etc.
> def escape(s, quote=True):
>     if quote:
>         return s.translate(_escape_map_full)
>     return s.translate(_escape_map)
>
> I fail to see how this supports the assertion that str.replace() is
> faster. However, some quick timing shows that translate has a very
> high penalty for missing characters and is a tad slower anyway.
>
> Really, though, there should be no reason for .translate() to be
> slower than replace -- at worst it should just be "reduce(lambda s,
> ab: s.replace(*ab), mapping.items()¹, original_str)" and end up the
> *same* speed as iterated replace. But the fact that it doesn't have to
> re-build the string every replace means that theoretically it should
> be a lot faster.
>
> ¹ I realise this won't actually work for several reasons, and doesn't
> support things like passing in lists as mappings, but you could
> trivially support the important builtin types² and fall back to the
> original for others, where the pure-python __getitem__ is going to be
> the slowest part anyway.
>
> ² List, tuple, dict, str, bytes -- so basically just mappings and
> ordered iterables

Thanks Joshua Landau! str.replace() does appear to be best, so that is
the suggestion that I will implement.

Mahalo,

DCJ
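
For reference, a minimal sketch of the two approaches compared in this thread: iterated str.replace() via reduce(), and a single str.translate() call. The REPLACEMENTS mapping, the function names, and the time_both() helper are made up for illustration and are not from anyone's actual code. It assumes single-character keys and Python 3.7+ (so the dict keeps "&" first and the later replacements are not escaped a second time). Timings will vary by Python version and input, so none are claimed here.

```python
from functools import reduce
import timeit

# Hypothetical mapping, for illustration only (not from the thread).
# "&" must come first so the "&" in "&lt;" / "&gt;" is not escaped again.
REPLACEMENTS = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}

# str.translate() wants a table keyed by code point; str.maketrans()
# builds one from a dict whose keys are single characters.
TABLE = str.maketrans(REPLACEMENTS)


def replace_iterated(text, mapping=REPLACEMENTS):
    """Apply str.replace() once per (old, new) pair, in mapping order."""
    return reduce(lambda s, ab: s.replace(*ab), mapping.items(), text)


def replace_translate(text, table=TABLE):
    """Do all the single-character substitutions in one pass."""
    return text.translate(table)


def time_both(text, number=1000):
    """Return (seconds for iterated replace, seconds for translate)."""
    t_replace = timeit.timeit(lambda: replace_iterated(text), number=number)
    t_translate = timeit.timeit(lambda: replace_translate(text), number=number)
    return t_replace, t_translate


sample = "if a < b & b > c: pass"
# Both approaches should agree on this input.
assert replace_iterated(sample) == replace_translate(sample)
```

Calling time_both(sample) on your own data is the quickest way to check which one wins on your interpreter.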