From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Which type should be used when testing static structure appartenance
Date: Thu, 19 Nov 2015 00:42:32 +1100

On Thu, Nov 19, 2015 at 12:30 AM, Steven D'Aprano wrote:
> On Wed, 18 Nov 2015 11:40 pm, Chris Angelico wrote:
>> All the questions of performance should be
>> secondary to code clarity, though;
>
> "All"? Surely not.

The OP's example was checking if a string was equal to either of two
strings. Even if that's in a tight loop, the performance difference
between the various options is negligible. The "all" is a little
misleading (of course there are times when you warp your code for the
sake of performance), but I was talking about this example, where it's
basically coming down to microbenchmarks.

>> so I would say the choices are: Set
>> literal if available, else tuple. Forget the performance.
>
> It seems rather strange to argue that we should ignore performance when the
> whole reason for using sets in the first place is for performance.

They do perform well, but that's not the main point - not when you're
working with just two strings. Of course, when you can get performance
AND readability, it's perfect. That doesn't happen with Py2 sets, but
it does with Python 3:

rosuav@sikorsky:~$ python -m timeit -s "x='asdf'" "x in {'asdf','qwer'}"
10000000 loops, best of 3: 0.12 usec per loop
rosuav@sikorsky:~$ python -m timeit -s "x='asdf'" "x in ('asdf','qwer')"
10000000 loops, best of 3: 0.0344 usec per loop
rosuav@sikorsky:~$ python -m timeit -s "x='asdf'" "x=='asdf' or x=='qwer'"
10000000 loops, best of 3: 0.0392 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s "x='asdf'" "x in {'asdf','qwer'}"
10000000 loops, best of 3: 0.0356 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s "x='asdf'" "x in ('asdf','qwer')"
10000000 loops, best of 3: 0.0342 usec per loop
rosuav@sikorsky:~$ python3 -m timeit -s "x='asdf'" "x=='asdf' or x=='qwer'"
10000000 loops, best of 3: 0.0418 usec per loop

No set construction in Py3 - the optimizer figures out that you don't
need mutability, and uses a constant frozenset. (Both versions do this
with list->tuple.) Despite the performance hit from using a set in
Py2, though, I would still advocate its use (assuming you don't need
to support 2.6 or older), because it accurately represents the
*concept* of "is this any one of these".

ChrisA
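
For anyone who wants to check the frozenset claim on their own interpreter rather than take the timings on faith, here is a rough sketch. It assumes CPython 3.6 or later (the constant folding is an implementation detail of CPython, not a language guarantee), and the helper names constant_types and time_membership are made up for this illustration, not part of any library.

```python
import timeit


def constant_types(expression):
    """Return the type names of the constants compiled into `expression`.

    Assumption: CPython stores folded literals in code.co_consts; other
    implementations may not fold them at all.
    """
    code = compile(expression, "<example>", "eval")
    return {type(const).__name__ for const in code.co_consts}


def time_membership(expression, setup="x = 'asdf'", number=100_000):
    """Time `expression` with timeit; returns total seconds for `number` runs."""
    return timeit.timeit(expression, setup=setup, number=number)


if __name__ == "__main__":
    for expr in (
        "x in {'asdf', 'qwer'}",   # set literal: expected to become a frozenset
        "x in ['asdf', 'qwer']",   # list literal: expected to become a tuple
        "x in ('asdf', 'qwer')",   # tuple literal: already a constant
    ):
        print(expr, sorted(constant_types(expr)), time_membership(expr))
```

If the set-literal line reports a frozenset among its constants, no set is being built per test, which matches the Py3 timings above. The absolute numbers will vary by machine and version, so treat them as a microbenchmark and nothing more.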