From: Chris Rebert
To: Ethan Furman
Cc: python list
Newsgroups: comp.lang.python
Subject: Re: hash values and equality
Date: Fri, 20 May 2011 12:00:43 -0700

On Fri, May 20, 2011 at 10:56 AM, Ethan Furman wrote:
> Chris Rebert wrote:
>> On Thu, May 19, 2011 at 10:43 PM, Ethan Furman wrote:
>>> Several folk have said that objects that compare equal must hash equal,
>>> and the docs also state this
>>> http://docs.python.org/dev/reference/datamodel.html#object.__hash__
>>>
>>> I'm hoping somebody can tell me what horrible thing will happen if this
>>> isn't the case?
>> Here's a more common/plausible "horrible" case closer to what the docs
>> writers had in mind:
>> --> class Naughty(object):
>> ...     def __init__(self, n):
>> ...         self.n = n
>> ...     def __eq__(self, other):
>> ...         return self.n == other.n
>> ...
>> --> Naughty(5) == Naughty(5)
>> True
>> --> Naughty(5) is Naughty(5)
>> False
>> --> bad = Naughty(3)
>> --> y = {bad : 'foo'}
>> --> y[bad] # just happens to work
>> 'foo'
>> --> del bad
>> --> # ok, how do we get to 'foo' now?
>> --> y[Naughty(3)] # try the obvious way
>> Traceback (most recent call last):
>>  File "", line 1, in
>> KeyError: <__main__.Naughty object at 0x2a1cb0>
>> --> # We're screwed.
>>
>> Naughty instances (and similar) can't be used sensibly as hash keys
>> (unless you /only/ care about object identity; this is often not the
>> case).
>
> I tried this sequence (using Python 3, BTW -- forgot to mention that little
> tidbit -- sorry!):

Doesn't matter anyway.

> --> del two
> --> two
> Traceback (most recent call last):
>  File "", line 1, in
> NameError: name 'two' is not defined
> --> d
> {<__main__.Wierd object at 0x00C0C950>: '3',
>  1: '1.0',
>  2: '2.0',
>  3: '3.0',
>  <__main__.Wierd object at 0x00B3AC10>: '2',
>  <__main__.Wierd object at 0x00B32E90>: '1'}
> --> new_two = Wierd(2)
> --> d[new_two]
> '2'

Right, this is why I went to the trouble of writing Naughty instead of using Wierd.
Wierd's exact problem is less common and less severe.

The "equality implies identical hash" rule is not a universal one;
some other languages instead impose the lesser requirement of
"equality and same (or related) types implies identical hash". In Ruby
for instance:

irb(main):001:0> 1 == 1.0
=> true
irb(main):002:0> a = {1.0 => 'hi'} # float key
=> {1.0=>"hi"}
irb(main):003:0> a[1] = 'bye' # int key
=> "bye"
irb(main):004:0> a # notice how they don't collide:
=> {1=>"bye", 1.0=>"hi"}

(Contrast this with my earlier analogous Python example.)

Basically, Naughty is fundamentally broken [hash(Naughty(2)) !=
hash(Naughty(2))], whereas Wierd merely defies convention [hash(2) !=
hash(Wierd(2)) but hash(Wierd(2)) == hash(Wierd(2))].

Cheers,
Chris
--
http://rebertia.com
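
For readers who want to run the Naughty/Wierd contrast under Python 3, the following is a minimal sketch, not code taken from the thread. Two assumptions are worth flagging. First, Python 3 sets __hash__ to None on any class that defines __eq__ without __hash__, so Naughty needs its identity-based hash restored explicitly to behave as in the transcript. Second, Wierd's definition is not quoted in this message, so the class below is a hypothetical stand-in that only matches the properties stated above: it compares equal to the plain int, hashes differently from it, and hashes consistently across instances. The lookup helper is likewise an illustrative name.

```python
class Naughty:
    """Equal by value, hashed by identity (the broken combination)."""

    def __init__(self, n):
        self.n = n

    def __eq__(self, other):
        return isinstance(other, Naughty) and self.n == other.n

    # Python 3 sets __hash__ to None when __eq__ is defined without it,
    # so the identity-based hash has to be restored explicitly here.
    __hash__ = object.__hash__


class Wierd:
    """Hypothetical stand-in: equal to plain ints, but hashed differently."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if isinstance(other, Wierd):
            return self.value == other.value
        return self.value == other

    def __hash__(self):
        # Deterministic, but deliberately different from hash(self.value).
        return hash(self.value) + 1000


def lookup(mapping, key):
    """Return mapping[key], or None if the key is not found."""
    try:
        return mapping[key]
    except KeyError:
        return None


naughty_dict = {Naughty(3): 'foo'}
wierd_dict = {Wierd(2): '2', 2: '2.0'}

# An equal Naughty key has a different hash, so the entry is unreachable.
naughty_result = lookup(naughty_dict, Naughty(3))

# An equal Wierd key has the same hash, so the entry is found; the int 2
# lives in a separate entry because its hash differs from Wierd(2)'s.
wierd_result = lookup(wierd_dict, Wierd(2))
int_result = lookup(wierd_dict, 2)

print(naughty_result, wierd_result, int_result)
```

The Naughty lookup fails because a dict compares stored hashes before it ever calls __eq__, while a fresh Wierd(2) still finds its entry because equal Wierd instances agree on their hash.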