From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: The Cost of Dynamism (was Re: Pyhon 2.x or 3.x, which is faster?)
Date: Mon, 14 Mar 2016 08:44:15 +1100
References: <56e44258$0$1598$c3e8da3$5496439d@news.astraweb.com> <8737rvxs89.fsf@elektro.pacujo.net>

On Mon, Mar 14, 2016 at 8:26 AM, Christian Gollwitzer wrote:
> I assume you run this in a big loop. What about a single hash-table lookup?
>
> from collections import Counter
> counts=Counter()
> for c in whatever:
>     counts[c]+=1
>
> upper=sum(counts[x] for x in range(ord('A'), ord('Z')+1))
> lower=sum(counts[x] for x in range(ord('a'), ord('z')+1))
> ....
>
>
> I think this is equally readable as the switch version, and should be much
> faster.

At this point, it's completely moved away from being a switch block,
so while it may well be more readable AND faster, it's pretty much
irrelevant to the discussion. The value of a switch block is arbitrary
code, same as an if/elif tree, without having to package stuff up into
functions. Although I could accept a function-based solution if it
looks clean enough...

case = switch(c)
@case("A", "Z")
def _():
    do_uppercase_stuff
@case("a", "z")
def _():
    do_lowercase_stuff
@case("0", "9")
def _():
    do_digit_stuff
@case
def _():
    do_default_stuff

It creates an inner scope, which most people won't need or want, and
it's creating a bunch of functions every iteration, but the code's
reasonably clean. And it's fairly implementable:

def switch(template):
    done = False  # Not done yet; set once a case has run.

    def run(func, match):
        nonlocal done  # Python 3 only
        if match and not done:  # Skip if we already hit another case.
            done = True
            func()
        return func

    def case(testme, *range):
        if callable(testme):
            # No parens - this is the default case
            return run(testme, True)
        if range:
            match = testme <= template <= range[0]
        else:
            match = template == testme
        # With parens, hand back the actual decorator.
        return lambda func: run(func, match)

    return case

Now you can play around with performance questions. But not until the
code (a) does the right thing, and (b) looks good enough to maintain.

ChrisA
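
For comparison, here is roughly what the if/elif tree that the decorated cases stand in for might look like. This is only a sketch: the name classify and the string labels it returns are made up for illustration, and it assumes c is a single-character str, since the range checks rely on plain string comparison.

```python
def classify(c):
    """Label a single character using a plain if/elif tree."""
    # Same three ranges as the decorated cases, tested in the same order.
    if "A" <= c <= "Z":
        return "upper"
    elif "a" <= c <= "z":
        return "lower"
    elif "0" <= c <= "9":
        return "digit"
    else:
        # Nothing above matched: the default case.
        return "default"


if __name__ == "__main__":
    for ch in "Hi 5!":
        print(repr(ch), classify(ch))
```

Each branch can hold arbitrary code without being wrapped in a function, which is the property a switch block would need to keep.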