From: Joshua Landau
Date: Sun, 21 Jul 2013 13:49:29 +0100
Subject: Re: Find and Replace Simplification
To: Serhiy Storchaka
Cc: python-list
Newsgroups: comp.lang.python
References: <51e967bb$0$29971$c3e8da3$5496439d@news.astraweb.com>

On 21 July 2013 13:28, Serhiy Storchaka wrote:
> 21.07.13 14:29, Joshua Landau wrote:
>> On 21 July 2013 08:44, Serhiy Storchaka wrote:
>>> 20.07.13 20:03, Joshua Landau wrote:
>>>> Still, it seems to me that it should be optimizable for sensible
>>>> builtin types such that .translate is significantly faster, as there's
>>>> no theoretical extra work that .translate *has* to do that .replace
>>>> does not, and .replace also has to rebuild the string a lot of times.
>>>
>>> You would have to analyze the overall mapping and reorder the items
>>> into the right order (if that is possible), e.g. '&' should be replaced
>>> before '<' in html.escape. This extra work is too large for most real
>>> input.
>>
>> I don't understand. What items are you reordering?
>
> mapping.items(). We can implement s.translate({ord('<'): '&lt;', ord('&'):
> '&amp;'}) as s.replace('&', '&amp;').replace('<', '&lt;'), but not as
> s.replace('<', '&lt;').replace('&', '&amp;').

I see -- a valid order won't always exist, though, as the mapping can
contain "loops", e.g. "ab" -> "ba" (a -> b and b -> a), where chaining
.replace calls in either order gives the wrong result.
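
To make both points concrete, here is a minimal sketch. It assumes the replacement strings in the html.escape example are the usual entities '&amp;' and '&lt;'; the sample string and the variable names are made up for illustration.

```python
# Mapping from the html.escape example (entity strings are an assumption).
escape_map = {ord('<'): '&lt;', ord('&'): '&amp;'}

s = 'a < b & c'

# .translate looks at each original character exactly once.
translated = s.translate(escape_map)

# '&' first: the '&' introduced by '&lt;' does not exist yet, so this is safe.
good_order = s.replace('&', '&amp;').replace('<', '&lt;')

# '<' first: the second pass re-escapes the '&' that '&lt;' just introduced.
bad_order = s.replace('<', '&lt;').replace('&', '&amp;')

print(translated)   # a &lt; b &amp; c
print(good_order)   # a &lt; b &amp; c
print(bad_order)    # a &amp;lt; b &amp; c

# The "loop" case: a -> b and b -> a. No ordering of .replace calls works.
swap_map = {ord('a'): 'b', ord('b'): 'a'}
swapped = 'ab'.translate(swap_map)                   # 'ba'
a_first = 'ab'.replace('a', 'b').replace('b', 'a')   # 'ab' -> 'bb' -> 'aa'
b_first = 'ab'.replace('b', 'a').replace('a', 'b')   # 'ab' -> 'aa' -> 'bb'

print(swapped, a_first, b_first)   # ba aa bb
```

So a .replace-based fast path would need to detect such cycles and fall back to the per-character approach, which is part of the extra analysis mentioned above.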