From: Ian Kelly
Date: Sun, 15 Jul 2012 11:38:06 -0600
Subject: Re: Request for useful functions on dicts
To: Leif
Cc: python-list@python.org
Newsgroups: comp.lang.python

On Sat, Jul 14, 2012 at 5:56 PM, Leif wrote:
> Hi, everybody.
> I am trying to collect all the functions I've found useful for
> working with dicts into a library:
>
> https://github.com/leifp/dictutil
>
> If you have a favorite dict-related func / class, or know of similar
> projects, please let me know (or open an issue on github). Bear in
> mind that the functions don't even have to come from python; if you
> have a favorite PHP / APL / COBOL / etc associative array function,
> that's fair game.

Nothing in particular comes to mind, but I'll be happy to critique
what you've already got.

One thing that bothers me a bit is that get_in will accept any
iterable, but set_in and update_in can only accept a sequence, because
they slice the keys. Here's an alternate implementation of set_in
that takes any iterable:

    def set_in(d, ks, v):
        tmp = d
        i = None
        for i, next_key in enumerate(ks):
            # Descend one step behind the iteration, so that the last
            # key is left over for the assignment below.
            if i > 0:
                tmp = tmp.setdefault(current_key, {})
            current_key = next_key
        if i is None:
            raise KeyError("Empty keys iterable")
        tmp[current_key] = v

update_in could be rewritten similarly. You might also notice that
I'm not returning d here. That's because the Pythonic way is to
either create a new object and return it, or mutate the existing
object and return None. If you mutate the existing object and return
it, that can lead to confusion about which style the function follows.
This is why list.append (for example) always returns None.

In Python 2.7+, intersection and difference could be written using
dict views (viewkeys() and viewitems()), which act like sets.

partition_on_value, partition_on_key: The only difference between
these is whether they pass the key or the value to the predicate. You
could just pass both and let the predicate decide which one (or both)
to use, and then you only need a single function. Also, why a
predicate specifically? This could be generalized to partition by any
equivalence relation, not just those with only two equivalence
classes:

    from collections import defaultdict

    def partition(f, d):
        """Partition the dict according to an equivalence relation.

        Calls f(key, value) for all (key, value) pairs in the dict d.
        The return value of f must be hashable.

        Returns a new dict where the keys are distinct return values of
        f, and the values are dicts containing the equivalence classes
        distinguished by those return values.
        """
        classes = defaultdict(dict)
        for k, v in d.iteritems():  # d.items() on Python 3
            classes[f(k, v)][k] = v
        return classes

If you still wanted the predicate semantics, you could then define
that as a wrapper:

    def partition_pred(f, d):
        # bool() folds every return value of f into True or False.
        p = partition(lambda k, v: bool(f(k, v)), d)
        return p[True], p[False]

issubdict could be implemented as a subset operation. I haven't timed
it so it may not really be any faster, but this way the looping is in
C instead of in Python:

    def issubdict(d1, d2):
        # Building a set of items requires the values to be hashable.
        return set(d1.items()).issubset(d2.items())

Finally, isempty(d) is basically equivalent to "not d", which is both
shorter and faster.

Cheers,
Ian
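
Two of the suggestions above came without code: update_in over an
arbitrary iterable of keys, and intersection / difference built on
dict views. A rough sketch of both follows. The signatures
(update_in(d, ks, f), intersection(d1, d2), difference(d1, d2)), the
choice to pass None to f for a missing key, and the choice to take
values from the first dict are all assumptions made for illustration,
not dictutil's actual API. It is written for Python 3, where d.keys()
is already a set-like view; on 2.7 you would substitute d.viewkeys().

```python
def update_in(d, ks, f):
    """Apply f to the value at the nested key path ks, in place.

    Hypothetical signature. Intermediate dicts are created as needed,
    and a missing final key is passed to f as None (an assumption).
    Returns None, since d is mutated rather than copied.
    """
    tmp = d
    have_key = False
    for next_key in ks:
        # Descend one step behind the iteration, so that the last key
        # is left over for the update below.
        if have_key:
            tmp = tmp.setdefault(current_key, {})
        current_key = next_key
        have_key = True
    if not have_key:
        raise KeyError("Empty keys iterable")
    tmp[current_key] = f(tmp.get(current_key))


def intersection(d1, d2):
    """New dict of the keys common to both; values come from d1 (assumption)."""
    return {k: d1[k] for k in d1.keys() & d2.keys()}


def difference(d1, d2):
    """New dict of the keys of d1 that are absent from d2."""
    return {k: d1[k] for k in d1.keys() - d2.keys()}


# Usage: the key path can be any iterable, including a one-shot iterator.
config = {"a": {"b": 1}}
update_in(config, iter(["a", "b"]), lambda x: x + 1)
```

Tracking have_key instead of an enumerate index is only a stylistic
variation on the set_in loop; the single pass over ks is the same.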