Groups > comp.lang.java.programmer > #17577 > unrolled thread

hashCode

Started by	bob smith <bob@coolfone.comze.com>
First post	2012-08-10 08:47 -0700
Last post	2012-08-12 17:11 -0400
Articles	20 on this page of 106 — 20 participants


  hashCode bob smith <bob@coolfone.comze.com> - 2012-08-10 08:47 -0700
    Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 12:13 -0400
      Re: hashCode markspace <-@.> - 2012-08-10 10:13 -0700
        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 13:38 -0400
          Re: hashCode rossum <rossum48@coldmail.com> - 2012-08-11 10:36 +0100
    Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-10 12:34 -0400
      Re: hashCode bob smith <bob@coolfone.comze.com> - 2012-08-10 15:22 -0700
        Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-10 15:32 -0700
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:30 -0400
            Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:24 -0700
              Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:15 -0400
                Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:29 -0400
                  Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 22:43 -0400
                    Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:54 -0400
                      Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 21:46 -0700
                        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 16:53 -0400
                        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:00 -0400
        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:27 -0400
          Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-12 17:06 +0200
            Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 16:59 -0400
              Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-13 19:17 +0200
                Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-19 19:42 -0400
                  Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-21 10:30 +0000
                Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-19 19:47 -0400
                  Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-21 10:43 +0000
                    Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 19:04 -0400
                      Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-27 16:55 -0700
                        Re: hashCode markspace <-@.> - 2012-08-27 17:03 -0700
                          Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-27 17:49 -0700
                          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 21:37 -0400
                          Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-27 18:58 -0700
                            Re: hashCode markspace <-@.> - 2012-08-27 21:19 -0700
                              Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-28 01:06 -0700
                                Re: hashCode markspace <-@.> - 2012-08-28 09:19 -0700
                                  Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-28 16:33 -0700
                                    Re: hashCode markspace <-@.> - 2012-08-28 17:02 -0700
                                      Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 11:06 -0700
                                        Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-29 14:49 -0400
                                          Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 13:40 -0700
                                            Re: hashCode Gene Wirchenko <genew@ocis.net> - 2012-08-29 18:02 -0700
                                          Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-08-31 00:52 +0200
                                            Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-30 21:43 -0400
                                              Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-30 21:52 -0400
                                              Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-08-31 04:18 +0200
                                                Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:08 -0600
                                                  Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-31 11:38 -0400
                                                    Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-31 17:55 +0200
                                                    Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:56 -0600
                                                      Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-31 14:32 -0400
                                                        Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 14:38 -0600
                                                      Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-31 15:33 -0700
                                                        Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 16:41 -0600
                                                          Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-31 16:26 -0700
                                                            Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-09-02 11:54 -0600
                                                              Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-09-03 00:47 +0200
                                                                Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-09-03 21:44 -0600
                                              Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:08 -0600
                                                Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-31 17:58 +0200
                                        Re: hashCode markspace <-@.> - 2012-08-29 11:51 -0700
                                          Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 13:28 -0700
                                            Re: hashCode markspace <-@.> - 2012-08-29 16:05 -0700
                                              Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 16:23 -0700
                                                Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-29 20:56 -0400
                                                  Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-30 11:19 +0200
                                                  Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-30 10:03 -0700
                                                    Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-30 18:34 +0000
                                              Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-30 08:11 -0600
                                                Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-30 10:06 -0700
                                              Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-30 19:16 -0400
                              Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 13:58 -0700
                            Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 13:56 -0700
                              Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-28 14:07 -0700
                                Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 14:38 -0700
                        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 21:12 -0400
        Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 07:58 -0400
          Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:29 -0700
            Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:16 -0400
        Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 03:46 -0700
        Re: hashCode Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-08-12 12:23 -0400
          Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 09:40 -0700
            Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-12 13:59 -0400
              Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 11:17 -0700
            Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:02 -0400
          Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 19:03 +0200
    Re: hashCode Roedy Green <see_website@mindprod.com.invalid> - 2012-08-10 12:17 -0700
      Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-10 12:45 -0700
        Re: hashCode Roedy Green <see_website@mindprod.com.invalid> - 2012-08-11 04:54 -0700
          Re: hashCode Joerg Meier <joergmmeier@arcor.de> - 2012-08-11 18:25 +0200
            Re: hashCode Mike Winter <usenet@michael-winter.me.invalid> - 2012-08-11 18:53 +0100
          Re: hashCode Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2012-08-11 09:56 -0700
          Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 18:58 +0200
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:40 -0400
        Re: hashCode rossum <rossum48@coldmail.com> - 2012-08-11 18:47 +0100
      Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:25 -0400
        Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 08:00 -0400
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 09:49 -0400
    Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 15:33 +0200
      Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 15:34 +0200
      Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:34 -0700
        Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:37 -0700
        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:19 -0400
          Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 21:48 -0700
      Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 12:08 +0200
        Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 12:18 +0200
        Re: hashCode Lew <noone@lewscanon.com> - 2012-08-12 11:27 -0700
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:11 -0400


#18418

From	markspace <-@.>
Date	2012-08-29 16:05 -0700
Message-ID	<k1m78n$73c$1@dont-email.me>
In reply to	#18414

On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>
> The point is that hashing "Object" isn't entirely sensible in most
> situations.

Why not?  I do it all the time.  Put an object in, get an object out. 
Works great.

Seriously, I think you're not being intellectually honest about the 
value of having a built-in hashCode() for every object.


#18419

From	Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Date	2012-08-29 16:23 -0700
Message-ID	<jwx%r.3$_I7.1@newsfe20.iad>
In reply to	#18418

On 8/29/12 4:05 PM, markspace wrote:
> On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>>
>> The point is that hashing "Object" isn't entirely sensible in most
>> situations.
>
> Why not?  I do it all the time.  Put an object in, get an object out.
> Works great.
Do you put an "Object" in, or an "object" in? There is a big difference. 
Also, how often do you put an object in expecting it to be hashed based 
on value vs based on identity? My point was that most of the time the 
expectation is that the hash is based on value, not identity.


> Seriously, I think you're not being intellectually honest about the
> value of having a built-in hashCode() for every object.
And you're not being intellectually honest about the cost of having a 
built-in hashCode() for every object.

I'm not saying there isn't value, but that there would be just as much 
value in using alternative approaches, including using an external 
Hasher. And those approaches provide additional flexibility that isn't 
available in the current library.


#18420

From	Eric Sosman <esosman@ieee-dot-org.invalid>
Date	2012-08-29 20:56 -0400
Message-ID	<k1mdn6$akp$1@dont-email.me>
In reply to	#18419

On 8/29/2012 7:23 PM, Daniel Pitts wrote:
> On 8/29/12 4:05 PM, markspace wrote:
>> On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>>>
>>> The point is that hashing "Object" isn't entirely sensible in most
>>> situations.
>>
>> Why not?  I do it all the time.  Put an object in, get an object out.
>> Works great.
> Do you put an "Object" in, or an "object" in? There is a big difference.
> Also, how often do you put an object in expecting it to be hashed based
> on value vs based on identity? My point was that most of the time the
> expectation is that the hash is based on value, not identity.

     Perhaps you're thinking too much about HashMap, and maybe
not enough about HashSet.

-- 
Eric Sosman
esosman@ieee-dot-org.invalid


#18423

From	Jan Burse <janburse@fastmail.fm>
Date	2012-08-30 11:19 +0200
Message-ID	<k1nb7r$hri$1@news.albasani.net>
In reply to	#18420

Eric Sosman schrieb:
> On 8/29/2012 7:23 PM, Daniel Pitts wrote:
>> On 8/29/12 4:05 PM, markspace wrote:
>>> On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>>>>
>>>> The point is that hashing "Object" isn't entirely sensible in most
>>>> situations.
>>>
>>> Why not?  I do it all the time.  Put an object in, get an object out.
>>> Works great.
>> Do you put an "Object" in, or an "object" in? There is a big difference.
>> Also, how often do you put an object in expecting it to be hashed based
>> on value vs based on identity? My point was that most of the time the
>> expectation is that the hash is based on value, not identity.
>
>      Perhaps you're thinking too much about HashMap, and maybe
> not enough about HashSet.
>

HashSet is built on top of HashMap.
It is dummy object in, dummy object out.

The dummy object is declared as follows:

     // Dummy value to associate with an Object in the backing Map
     private static final Object PRESENT = new Object();

I guess the above implementation is a result of code
"reuse", not in the copy/paste sense, but in the OO
sense. There is no effort to reduce the memory footprint
for HashSet.
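
To make the pattern concrete, here is a minimal sketch of a set that
delegates to a map with one shared dummy value. It is a simplified
illustration only, not the JDK source; the class name MapBackedSet is
made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified set; a sketch of the pattern, not the JDK source. */
final class MapBackedSet<E> {
    // One shared dummy value is stored for every element.
    private static final Object PRESENT = new Object();

    // The elements of the set are the keys of this map.
    private final Map<E, Object> map = new HashMap<E, Object>();

    /** Returns true if the element was not already present. */
    boolean add(E e) { return map.put(e, PRESENT) == null; }

    boolean contains(Object o) { return map.containsKey(o); }

    /** Returns true if the element was present and has been removed. */
    boolean remove(Object o) { return map.remove(o) == PRESENT; }

    int size() { return map.size(); }
}
```

Each entry still carries a value reference that the set never needs,
which is the memory cost mentioned above.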

Bye


#18434

From	Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Date	2012-08-30 10:03 -0700
Message-ID	<02N%r.248$G01.210@newsfe15.iad>
In reply to	#18420

On 8/29/12 5:56 PM, Eric Sosman wrote:
> On 8/29/2012 7:23 PM, Daniel Pitts wrote:
>> On 8/29/12 4:05 PM, markspace wrote:
>>> On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>>>>
>>>> The point is that hashing "Object" isn't entirely sensible in most
>>>> situations.
>>>
>>> Why not?  I do it all the time.  Put an object in, get an object out.
>>> Works great.
>> Do you put an "Object" in, or an "object" in? There is a big difference.
>> Also, how often do you put an object in expecting it to be hashed based
>> on value vs based on identity? My point was that most of the time the
>> expectation is that the hash is based on value, not identity.
>
>      Perhaps you're thinking too much about HashMap, and maybe
> not enough about HashSet.
>
How so?

HashSet is again often used for value deduplication, not as often 
identity deduplication. My point remains that for most use-cases, having 
hashCode() and equals() on Object isn't necessary and adds clutter.


#18440

From	Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at>
Date	2012-08-30 18:34 +0000
Message-ID	<slrnk3vci7.u9l.avl@gamma.logic.tuwien.ac.at>
In reply to	#18434

Daniel Pitts <newsgroup.nospam@virtualinfinity.net> wrote:
> My point remains that for most use-cases, having 
> hashCode() and equals() on Object isn't necessary
> and adds clutter.

I don't agree with this particular point, but I do agree
that a (non-generic) Hasher *interface* and a variant of
HashMap accepting such a Hasher and using that on the
keys instead of the keys' own methods, could be useful.

Hasher's hashCode taking Object could throw ClassCastException
for unsupported objects, which the HashMap could specifically
catch to shortcut the search for such an element in the map.
(Each implementation of Hasher would *tell* its own supported
objects, so that wouldn't be a problem with erasure.)
e.g.: a StringHeadHasher that defined equality on the first n
chars of a String would know what is a String or not, even if
the HashMap<String,...> itself doesn't, for erasure-reasons.

null would be a legal key, *iff* the Hasher supports it.

Such a separate Hasher could also be implemented across
subclass-boundaries (actually it would be automatically,
unless it is done for final classes, or does getClass()
inspection on the objects). Cross-implementation equalities
are in principle already known from List and Set (by their
contracts).

If such a thing had existed from the start, then at least there
wouldn't have been a need for a special IdentityHashMap.  I think I
remember coming across other use cases in the past. (Of course,
there was always a workaround, of varying ugliness or
roundaboutness.)

Anyway, nothing of that sort is likely to happen, so it's just
for the sake of discussion and learning new ideas along the way.
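
As a rough sketch of what is being described (nothing like this exists
in the JDK; Hasher, StringHeadHasher and HasherMap are hypothetical
names, and wrapping keys around an ordinary HashMap is just one
possible way to build it):

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical external hashing strategy; not part of the JDK. */
interface Hasher {
    int hashCode(Object o);              // may throw ClassCastException
    boolean equals(Object a, Object b);
}

/** Treats two Strings as equal when their first n chars match. */
final class StringHeadHasher implements Hasher {
    private final int n;
    StringHeadHasher(int n) { this.n = n; }

    private String head(Object o) {
        String s = (String) o;           // ClassCastException for non-Strings
        return s.length() <= n ? s : s.substring(0, n);
    }
    @Override public int hashCode(Object o) { return head(o).hashCode(); }
    @Override public boolean equals(Object a, Object b) {
        return head(a).equals(head(b));
    }
}

/** Hypothetical map that asks a Hasher instead of the keys themselves. */
final class HasherMap<K, V> {
    /** Key wrapper that routes hashCode/equals through the Hasher. */
    private static final class Wrapped {
        final Hasher hasher;
        final Object key;
        final int hash;
        Wrapped(Hasher hasher, Object key) {
            this.hasher = hasher;
            this.key = key;
            this.hash = hasher.hashCode(key);
        }
        @Override public int hashCode() { return hash; }
        @Override public boolean equals(Object other) {
            return other instanceof Wrapped
                    && hasher.equals(key, ((Wrapped) other).key);
        }
    }

    private final Hasher hasher;
    private final Map<Wrapped, V> delegate = new HashMap<Wrapped, V>();

    HasherMap(Hasher hasher) { this.hasher = hasher; }

    V put(K key, V value) { return delegate.put(new Wrapped(hasher, key), value); }

    V get(Object key) {
        try {
            return delegate.get(new Wrapped(hasher, key));
        } catch (ClassCastException e) {
            return null;   // the Hasher rejected the key, so it cannot be in the map
        }
    }
}
```

With new StringHeadHasher(3), the keys "hashCode" and "hashMap" land on
the same entry, and a lookup with an Integer key returns null via the
ClassCastException shortcut. This sketch does not handle null keys.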


#18425

From	Jim Janney <jjanney@shell.xmission.com>
Date	2012-08-30 08:11 -0600
Message-ID	<ydnfw74tqp7.fsf@shell.xmission.com>
In reply to	#18418

markspace <-@.> writes:

> On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>>
>> The point is that hashing "Object" isn't entirely sensible in most
>> situations.
>
> Why not?  I do it all the time.  Put an object in, get an object
> out. Works great.

I've worked on code that failed in mysterious ways because someone used
a HashMap assuming object identity, and then, a few years later, someone
else defined equals on one of the subclasses.  If you need identity
semantics, use java.util.IdentityHashMap: that's what it's for.

In practice, when I use HashMap the keys are almost always Strings, or
else classes designed for use as keys.
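
A small sketch of that difference, assuming a made-up key class Label
whose equals() and hashCode() compare by value (the class and method
names here are hypothetical):

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical key class with value-based equals() and hashCode(). */
final class Label {
    final String text;
    Label(String text) { this.text = text; }

    @Override public boolean equals(Object o) {
        return o instanceof Label && ((Label) o).text.equals(text);
    }
    @Override public int hashCode() { return text.hashCode(); }
}

final class IdentityDemo {
    /** Puts two distinct but equal keys into the map and returns its size. */
    static int sizeAfterTwoEqualKeys(Map<Label, String> map) {
        map.put(new Label("x"), "first");
        map.put(new Label("x"), "second");
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap compares keys with equals(): the second put replaces the first.
        System.out.println(sizeAfterTwoEqualKeys(new HashMap<Label, String>()));         // prints 1
        // IdentityHashMap compares keys with ==: both entries are kept.
        System.out.println(sizeAfterTwoEqualKeys(new IdentityHashMap<Label, String>())); // prints 2
    }
}
```

Code written against HashMap while Label still had the default
equals() would have seen two entries, and silently starts seeing one
after the override is added.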

-- 
Jim Janney


#18435

From	Daniel Pitts <newsgroup.nospam@virtualinfinity.net>
Date	2012-08-30 10:06 -0700
Message-ID	<94N%r.249$G01.27@newsfe15.iad>
In reply to	#18425

On 8/30/12 7:11 AM, Jim Janney wrote:
> markspace <-@.> writes:
>
>> On 8/29/2012 1:28 PM, Daniel Pitts wrote:
>>>
>>> The point is that hashing "Object" isn't entirely sensible in most
>>> situations.
>>
>> Why not?  I do it all the time.  Put an object in, get an object
>> out. Works great.
>
> I've worked on code that failed in mysterious ways because someone used
> a HashMap assuming object identity, and then, a few years later, someone
> else defined equals on one of the subclasses.  If you need identity
> semantics, use java.util.IdentityHashMap: that's what it's for.
>
> In practice, when I use HashMap the keys are almost always Strings, or
> else classes designed for use as keys.
>
This is exactly my point, thanks.  If they are going to be keys in a 
HashMap, or values in a HashSet, they need to be designed appropriately. 
  Providing a "default" implementation that may be overridden in 
subclasses unexpectedly is asking for bugs.


#18454

From	Arne Vajhøj <arne@vajhoej.dk>
Date	2012-08-30 19:16 -0400
Message-ID	<503ff438$0$284$14726298@news.sunsite.dk>
In reply to	#18418

On 8/29/2012 7:05 PM, markspace wrote:
 > On 8/29/2012 1:28 PM, Daniel Pitts wrote:
 >> The point is that hashing "Object" isn't entirely sensible in most
 >> situations.
 >
 > Why not?

That will be code that accepts all types of objects, where some
are distinct for each instance and others implement some type
of value equality.

That code smells.

 >          I do it all the time.  Put an object in, get an object out.
 > Works great.

The discussion is about keys not values.

Arne


#18361

From	Lew <lewbloch@gmail.com>
Date	2012-08-28 13:58 -0700
Message-ID	<3cc7cc45-ae25-4d4e-936b-85dde7a17079@googlegroups.com>
In reply to	#18348

markspace wrote:
> Patricia Shanahan wrote:
>> markspace wrote:
>> ...
>>> For example, in C, one can always hash based on memory address.  In Java
>>> we don't have that option, so hashcode() takes the place of the
>>> intrinsic property of an address.
>> ...
> 
>> In Java we have System.identityHashCode() which provides an address-like
>> hash code for any object.
> 
> I think System.identityHashCode() is the (same as the) default 
> implementation for Object::hashCode(), yes?

Why wonder when it's in the Javadocs?

Hm?

It is, in fact, required to be.

> So there's a small bit of evidence in support of the idea that 
> Object::hashCode() is meant to mimic the idea of just hashing on address.


#18360

From	Lew <lewbloch@gmail.com>
Date	2012-08-28 13:56 -0700
Message-ID	<8441a30f-65f3-4b94-82cf-d6f4ad859846@googlegroups.com>
In reply to	#18345

Patricia Shanahan wrote:
> markspace wrote:
> 
> ...
> 
>> For example, in C, one can always hash based on memory address.  In Java
>> we don't have that option, so hashcode() takes the place of the
>> intrinsic property of an address.
> 
> ...

> In Java we have System.identityHashCode() which provides an address-like
> hash code for any object.

Which is simply a wrapper method for the 'Object#hashCode()' method.

"Returns the same hash code for the given object as would be returned by the 
default method hashCode(), whether or not the given object's class overrides 
hashCode()."

-- 
Lew


#18363

From	Patricia Shanahan <pats@acm.org>
Date	2012-08-28 14:07 -0700
Message-ID	<fPydnYloke2DrqDNnZ2dnUVZ_gednZ2d@earthlink.com>
In reply to	#18360

On 8/28/2012 1:56 PM, Lew wrote:
> Patricia Shanahan wrote:
>> markspace wrote:
>>
>> ...
>>
>>> For example, in C, one can always hash based on memory address.  In Java
>>> we don't have that option, so hashcode() takes the place of the
>>> intrinsic property of an address.
>>
>> ...
>
>> In Java we have System.identityHashCode() which provides an address-like
>> hash code for any object.
>
> Which is simply a wrapper method for the 'Object#hashCode()' method.

Although that is the way round it is documented, they are both native
methods, and either could be a wrapper for the other, or they could both
be wrappers for a common native function.

Patricia


#18364

From	Lew <lewbloch@gmail.com>
Date	2012-08-28 14:38 -0700
Message-ID	<8e9daeec-8e40-48f2-8d48-88df8c39d671@googlegroups.com>
In reply to	#18363

Patricia Shanahan wrote:
> Lew wrote:
>> Patricia Shanahan wrote:
>>> markspace wrote:
>>> ...
>>>> For example, in C, one can always hash based on memory address.  In Java
>>>> we don't have that option, so hashcode() takes the place of the
>>>> intrinsic property of an address.
>>> ...
>>> In Java we have System.identityHashCode() which provides an address-like
>>> hash code for any object.
> 
>> Which is simply a wrapper method for the 'Object#hashCode()' method.
> 
> Although that is the way round it is documented, they are both native
> methods, and either could be a wrapper for the other, or they could both
> be wrappers for a common native function.

Point taken. I should have said, "Which simply behaves indistinguishably from 
the 'Object#hashCode()' method for its argument."

It being a literal wrapper is not important, only that it produces the same result 
as 'Object#hashCode()' per its documentation.

Which latter in its Javadocs tells us that it's "typically implemented by converting 
the internal address of the object into an integer".

All of which supports markspace's notion that the hash code is intended as a 
sort-of address for programming purposes.

N.b., the promise of 'hashCode()' explicitly disallows using the result of 
'System.identityHashCode()' as a guaranteed-unique object identifier.
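
A short sketch of the documented relationship, using a made-up class
Point that overrides hashCode() (Point and IdentityHashDemo are
hypothetical names for illustration):

```java
/** Hypothetical class that overrides hashCode() with a value-based one. */
final class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
    }
    @Override public int hashCode() { return 31 * x + y; }
}

final class IdentityHashDemo {
    public static void main(String[] args) {
        Object plain = new Object();
        // No override: both calls are documented to return the same value.
        System.out.println(plain.hashCode() == System.identityHashCode(plain)); // prints true

        Point p = new Point(1, 2);
        // With an override, hashCode() is value-based, while identityHashCode()
        // still returns what Object's default would have returned for p.
        System.out.println(p.hashCode());                // prints 33
        System.out.println(System.identityHashCode(p));  // value varies from run to run
    }
}
```

Nothing stops two distinct objects from sharing an identity hash code,
which is why it cannot serve as a unique identifier.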

-- 
Lew


#18340

From	Arne Vajhøj <arne@vajhoej.dk>
Date	2012-08-27 21:12 -0400
Message-ID	<503c1ae1$0$286$14726298@news.sunsite.dk>
In reply to	#18336

On 8/27/2012 7:55 PM, Daniel Pitts wrote:
> On 8/27/12 4:04 PM, Arne Vajhøj wrote:
>> On 8/21/2012 6:43 AM, Andreas Leitgeb wrote:
>>> Arne Vajhøj <arne@vajhoej.dk> wrote:
>>>> We are looking at two alternatives:
>>>> A) default functions properly but good performance requires an override
>>>> B) default gives good performance but may need an override to function
>>>>       properly
>>>
>>> C) default hashCode() works perfectly well with default equals(), and
>>> only
>>>    those with a specific requirement for equality, who thus need to
>>> override
>>>    .equals(), anyway, also need to override hashCode() appropriately for
>>>    their specific equality-relation.
>>
>> That is just B in another wording.
>
> However you word it.
>
> The truth is that equals/hashCode should probably be overridden
> together, or together remain default.

It should.

It must with the hashCode of today.
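
For illustration, a minimal sketch of overriding the two together so
that both use exactly the same fields (Person is a hypothetical class;
Objects.hash and Objects.equals are from java.util.Objects, Java 7+):

```java
import java.util.Objects;

/** Hypothetical value class: equals() and hashCode() use the same fields. */
final class Person {
    private final String name;
    private final int birthYear;

    Person(String name, int birthYear) {
        this.name = name;
        this.birthYear = birthYear;
    }

    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;
        Person other = (Person) o;
        return birthYear == other.birthYear && Objects.equals(name, other.name);
    }

    // Equal objects must produce equal hash codes, so hash the same fields.
    @Override public int hashCode() { return Objects.hash(name, birthYear); }
}
```

If only equals() were overridden, two equal Person objects would
almost always get different default hash codes, and a HashSet lookup
with an equal but distinct instance would usually miss.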

Arne


#17675

From	Eric Sosman <esosman@ieee-dot-org.invalid>
Date	2012-08-11 07:58 -0400
Message-ID	<k05hd9$uhc$1@dont-email.me>
In reply to	#17660

On 8/10/2012 6:22 PM, bob smith wrote:
[... many blank lines removed for legibility's sake ...]
> On Friday, August 10, 2012 11:34:28 AM UTC-5, Eric Sosman wrote:
>> On 8/10/2012 11:47 AM, bob smith wrote:
>>
>>> Is it always technically correct to override the hashCode function like so:
>>>
>>> 	@Override
>>> 	public int hashCode() {
>>> 		return 1;
>>> 	}
>>>
>>> Would it be potentially better if that was Object's implementation?
>>
>>       Define "better."
>
> Better in the sense that you would never HAVE to override hashCode.
>
> Now, there are cases where you HAVE to override it, or your code is very broken.

     I cannot think of a case where you HAVE to override hashCode(),
except as a consequence of other choices that you didn't HAVE to
make.  You don't HAVE to invent classes where distinct instances
are considered equal, and even if you do you don't HAVE to put those
instances in HashMaps or HashSets or whatever.

     But that's a bit specious: All it says is that you don't HAVE
to override hashCode() because you don't HAVE to use things that
call it.  It's like "You don't HAVE to pay taxes, because you don't
HAVE to live outside prison."  So, let's take it as a given that
you will often need to write classes that override equals() and
hashCode() -- I imagine you understand that they go together.

     Okay: Then returning a constant 1 (or 42 or 0 or whatever)
would in fact satisfy the letter of the law regarding hashCode():
Whenever x.equals(y) is true, x.hashCode() == y.hashCode().  In
your example this would be trivially true because x,y,z,... all
have the same hashCode() value, whether they're equal or not --
You have lived up to the letter of the law.

     Of course, such a hashCode() would make all those hash-based
containers pretty much useless: They would work in the sense that
they would get the Right Answer, but they'd be abominably slow,
with expected performance of O(N) instead of O(1).  See
<http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
for a survey of some denial-of-service attacks that work by driving
hash tables from O(1) to O(N), resulting in catastrophic failure
of the attacked system.

     In other words, the letter of the law on hashCode() is a bare
minimum that guarantees correct functioning, but it is not enough
to guarantee usability.  Why isn't the law more specific?  Because
nobody knows how to write "hashCode() must be correct *and* usable"
in terms that would cover all the classes all the Java programmers
have dreamed up and will dream up.  Your hashCode() meets the bare
minimum requirement, but is not "usable."  The actual hashCode()
provided by Object also meets the bare minimum requirement, and *is*
usable as it stands, until (and unless; you don't HAVE to) you
choose to implement other equals() semantics, and a hashCode() to
match them.
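
     A small sketch of the "correct but not usable" case, assuming
a made-up key class (ConstantHashKey and ConstantHashDemo are
hypothetical names):

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical key whose hashCode() is the constant from the original question. */
final class ConstantHashKey {
    final int id;
    ConstantHashKey(int id) { this.id = id; }

    @Override public boolean equals(Object o) {
        return o instanceof ConstantHashKey && ((ConstantHashKey) o).id == id;
    }
    @Override public int hashCode() { return 1; }   // legal, but every key collides
}

final class ConstantHashDemo {
    /** Fills a map with n keys, then looks each one up via an equal but distinct key. */
    static boolean allFound(int n) {
        Map<ConstantHashKey, Integer> map = new HashMap<ConstantHashKey, Integer>();
        for (int i = 0; i < n; i++) {
            map.put(new ConstantHashKey(i), i);
        }
        for (int i = 0; i < n; i++) {
            Integer v = map.get(new ConstantHashKey(i));
            if (v == null || v.intValue() != i) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Every lookup gets the right answer; all keys just share one bucket.
        System.out.println(allFound(2000));   // prints true
    }
}
```

     Every answer is right, but each lookup has to wade through the
other colliding keys, which is where the O(N) behaviour comes from.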


-- 
Eric Sosman
esosman@ieee-dot-org.invalid


#17716

From	Lew <noone@lewscanon.com>
Date	2012-08-11 16:29 -0700
Message-ID	<k06psm$h3a$1@news.albasani.net>
In reply to	#17675

Eric Sosman wrote:
>      Okay: Then returning a constant 1 (or 42 or 0 or whatever)
> would in fact satisfy the letter of the law regarding hashCode():

Not if you consider all aspects of what the Javadocs promise.

See my post upthread.

> Whenever x.equals(y) is true, x.hashCode() == y.hashCode().  In
> your example this would be trivially true because x,y,z,... all
> have the same hashCode() value, whether they're equal or not --
> You have lived up to the letter of the law.

No, because the law requires that the method support 'HashMap', which in turn 
calls for "properly" hashed objects.

>      Of course, such a hashCode() would make all those hash-based
> containers pretty much useless: They would work in the sense that
> they would get the Right Answer, but they'd be abominably slow,

Indeed.

> with expected performance of O(N) instead of O(1).  See
> <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
> for a survey of some denial-of-service attacks that work by driving
> hash tables from O(1) to O(N), resulting in catastrophic failure
> of the attacked system.
>
>      In other words, the letter of the law on hashCode() is a bare
> minimum that guarantees correct functioning, but it is not enough
> to guarantee usability.  Why isn't the law more specific?  Because

Actually, if you consider all that the Javadocs tell you, this "letter of the 
law" to which you refer is like saying the sequence "ABC" constitutes all of 
"the ABCs".

> nobody knows how to write "hashCode() must be correct *and* usable"
> in terms that would cover all the classes all the Java programmers
> have dreamed up and will dream up.  Your hashCode() meets the bare
> minimum requirement, but is not "usable."  The actual hashCode()
> provided by Object also meets the bare minimum requirement, and *is*
> usable as it stands, until (and unless; you don't HAVE to) you
> choose to implement other equals() semantics, and a hashCode() to
> match them.

As Arne states, "correct" means "fulfills the specification". The 
specification for Java API methods is the standard Javadocs, which do impose 
performance considerations on 'hashCode()'.

One understands that the spec isn't always fully enforceable by the compiler. 
[1] It is correct that the compiler will allow 'return 1;'. It is not correct 
that that fulfills the specification.

[1] Doesn't one?

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg


#17721

From	Arne Vajhøj <arne@vajhoej.dk>
Date	2012-08-11 22:16 -0400
Message-ID	<50271219$0$292$14726298@news.sunsite.dk>
In reply to	#17716

On 8/11/2012 7:29 PM, Lew wrote:
> Eric Sosman wrote:
>>      Okay: Then returning a constant 1 (or 42 or 0 or whatever)
>> would in fact satisfy the letter of the law regarding hashCode():
>
> Not if you consider all aspects of what the Javadocs promise.
>
> See my post upthread.
>
>> Whenever x.equals(y) is true, x.hashCode() == y.hashCode().  In
>> your example this would be trivially true because x,y,z,... all
>> have the same hashCode() value, whether they're equal or not --
>> You have lived up to the letter of the law.
>
> No, because the law requires that the method support 'HashMap', which in
> turn calls for "properly" hashed objects.
>
>>      Of course, such a hashCode() would make all those hash-based
>> containers pretty much useless: They would work in the sense that
>> they would get the Right Answer, but they'd be abominably slow,
>
> Indeed.
>
>> with expected performance of O(N) instead of O(1).  See
>> <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
>> for a survey of some denial-of-service attacks that work by driving
>> hash tables from O(1) to O(N), resulting in catastrophic failure
>> of the attacked system.
>>
>>      In other words, the letter of the law on hashCode() is a bare
>> minimum that guarantees correct functioning, but it is not enough
>> to guarantee usability.  Why isn't the law more specific?  Because
>
> Actually, if you consider all that the Javadocs tell you, this "letter
> of the law" to which you refer is like saying the sequence "ABC"
> constitutes all of "the ABCs".
>
>> nobody knows how to write "hashCode() must be correct *and* usable"
>> in terms that would cover all the classes all the Java programmers
>> have dreamed up and will dream up.  Your hashCode() meets the bare
>> minimum requirement, but is not "usable."  The actual hashCode()
>> provided by Object also meets the bare minimum requirement, and *is*
>> usable as it stands, until (and unless; you don't HAVE to) you
>> choose to implement other equals() semantics, and a hashCode() to
>> match them.
>
> As Arne states, "correct" means "fulfills the specification". The
> specification for Java API methods is the standard Javadocs, which do
> impose performance considerations on 'hashCode()'.
>
> One understands that the spec isn't always fully enforceable by the
> compiler. [1] It is correct that the compiler will allow 'return 1;'. It
> is not correct that that fulfills the specification.

It fulfills the spec.

It does not fulfill your bizarre interpretation of "support".

Arne


#17732

From	Patricia Shanahan <pats@acm.org>
Date	2012-08-12 03:46 -0700
Message-ID	<icCdnU2SCaLnFLrNnZ2dnUVZ_h-dnZ2d@earthlink.com>
In reply to	#17660

On 8/10/2012 3:22 PM, bob smith wrote:
...
> Better in the sense that you would never HAVE to override hashCode.
>
> Now, there are cases where you HAVE to override it, or your code is very broken.
>

I've decided to go back to this message, because I feel the key issue is
the circumstances under which hashCode would need to be overridden if
Object's version returned a constant, compared to the current situation.

If Object's hashCode returned a constant, in practice anyone using an
object as a key in a hash structure would want it overridden with one
that has at least some chance of using multiple buckets. Without that
property, a HashMap is an over-complicated, space-wasting cousin of a
linked list.
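
A minimal sketch of that point, assuming a made-up key class (ConstantKey
and ConstantHashDemo are illustrative names, not anything from this
thread). The map still gives the right answers; it just cannot spread the
keys over more than one bucket, so each lookup has to search among all
the colliding entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class: equals() compares ids, hashCode() is a constant.
final class ConstantKey {
    private final int id;

    ConstantKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConstantKey && ((ConstantKey) other).id == id;
    }

    @Override
    public int hashCode() {
        // Equal objects do get equal hash codes, but so do unequal ones:
        // every key ends up in the same bucket.
        return 1;
    }
}

final class ConstantHashDemo {
    // Fills a HashMap with n distinct keys that all share one hash code.
    static Map<ConstantKey, String> build(int n) {
        Map<ConstantKey, String> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new ConstantKey(i), "value-" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<ConstantKey, String> map = build(1000);
        System.out.println(map.size());                   // 1000
        System.out.println(map.get(new ConstantKey(42))); // value-42
    }
}
```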

The problem with this is that the programmer who knows that Widget
instances are going to be used as HashMap keys does not necessarily
control the Widget implementation. The programmer writing Widget has no
idea whether it will ever be used as a HashMap key, and therefore no way
of knowing whether it is safe, assuming Widget inherits the Object
equals, to also inherit the Object hashCode.

Now compare to the current situation. The programmer implementing Widget
decides whether to inherit a superclass equals or to write a
Widget-specific equals. That programmer can assume the superclass has a
hashCode that would be effective for a HashMap key, and only has to
override hashCode if they are overriding equals.

In practice, it is a long time since I've written a hashCode manually.
Generally, when I decide to override equals I tell Eclipse to generate
an equals/hashCode pair based on the fields that control whether two
instances are equal. Overriding hashCode is no additional work given
that I would be telling Eclipse to generate an equals based on those
fields anyway.
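
For illustration, a hand-written sketch of such a pair. This is not
Eclipse's literal output, and the fields name and size are invented for
the example; the only point is that both methods are derived from the
same fields.

```java
import java.util.Objects;

// Hypothetical Widget whose equality is defined by two fields.
final class Widget {
    private final String name;
    private final int size;

    Widget(String name, int size) {
        this.name = name;
        this.size = size;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Widget)) {
            return false;
        }
        Widget other = (Widget) obj;
        return size == other.size && Objects.equals(name, other.name);
    }

    @Override
    public int hashCode() {
        // Built from exactly the fields that equals() compares, so equal
        // Widgets always produce equal hash codes.
        return Objects.hash(name, size);
    }
}
```

With that in place, two separately constructed Widgets with the same
name and size are found as the same key in a HashMap or HashSet.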

To me, the current situation seems "better".

Patricia


#17734

From	Joshua Cranmer <Pidgeot18@verizon.invalid>
Date	2012-08-12 12:23 -0400
Message-ID	<k08la3$gr7$1@dont-email.me>
In reply to	#17660

On 8/10/2012 6:22 PM, bob smith wrote:
> Better in the sense that you would never HAVE to override hashCode.
>
> Now, there are cases where you HAVE to override it, or your code is very broken.

Returning a constant hash code is correct in the same sense that 
answering "yes" to the question "Can you tell me the correct way to do 
this?" would be--syntactically and semantically correct, but completely 
contrary to the actual intent of the question.

The point of the hash code is to provide a cheap way to quickly 
distinguish inputs (in the sense that Pr(a.hashCode() == b.hashCode() 
and !a.equals(b)) should be as small as possible [1]). A constant-value 
hash completely negates the purpose of the hash code, and this renders 
the hashCode again completely unusable for anything that actually wants 
to use it.

In the default case, a.hashCode() == b.hashCode() only when a == b (this 
definitely holds true with 32-bit machines and I'm pretty sure it still 
holds true with 64-bit machines, but I'd have to reverify the JVM source 
code to be certain). It is thus correct so long as identity equals is 
correct. It is also potentially correct in a limited set of cases where 
a.equals(b) and a != b. In all of these cases, it would not only be 
correct but also extremely useful, having pretty strong guarantees about 
the distribution of hash values.
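
A small sketch of the default behaviour, using a made-up class Plain
that overrides neither method. It only checks things the Javadoc states:
the inherited hashCode() is the value System.identityHashCode() reports,
and the inherited equals() is reference identity. The Javadoc promises
distinct hash codes for distinct objects only "as much as is reasonably
practical", so the sketch does not assert that two different objects can
never collide.

```java
// Hypothetical class that inherits equals() and hashCode() from Object.
final class Plain {
}

final class IdentityHashDemo {
    // True when the object's hashCode() is the inherited identity hash.
    static boolean usesIdentityHash(Object o) {
        return o.hashCode() == System.identityHashCode(o);
    }

    public static void main(String[] args) {
        Plain a = new Plain();
        Plain b = new Plain();
        System.out.println(usesIdentityHash(a));          // true
        System.out.println(a.equals(a));                  // true
        System.out.println(a.equals(b));                  // false: different objects
        System.out.println(a.hashCode() == a.hashCode()); // true: stable per object
    }
}
```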

[1] Actually, for good performance, hash codes should go one step 
further and make slightly stronger guarantees about independence with 
respect to the size of the hash table. But I digress.

-- 
Beware of bugs in the above code; I have only proved it correct, not 
tried it. -- Donald E. Knuth


#17735

From	Patricia Shanahan <pats@acm.org>
Date	2012-08-12 09:40 -0700
Message-ID	<9K6dneYcKfvgQbrNnZ2dnUVZ_j-dnZ2d@earthlink.com>
In reply to	#17734

On 8/12/2012 9:23 AM, Joshua Cranmer wrote:
> On 8/10/2012 6:22 PM, bob smith wrote:
>> Better in the sense that you would never HAVE to override hashCode.
>>
>> Now, there are cases where you HAVE to override it, or your code is
>> very broken.
>
> Returning a constant hash code is correct in the same sense that
> answering "yes" to the question "Can you tell me the correct way to do
> this?" would be--syntactically and semantically correct, but completely
> contrary to the actual intent of the question.
>
> The point of the hash code is to provide a cheap way to quickly
> distinguish inputs (in the sense that Pr(a.hashCode() == b.hashCode()
> and !a.equals(b)) should be as small as possible [1]). A constant-value
> hash completely negates the purpose of the hash code, and this renders
> the hashCode again completely unusable for anything that actually wants
> to use it.
>
> In the default case, a.hashCode() == b.hashCode() only when a == b (this
> definitely holds true with 32-bit machines and I'm pretty sure it still
> holds true with 64-bit machines, but I'd have to reverify the JVM source
> code to be certain). It is thus correct so long as identity equals is
> correct. It is also potentially correct in a limited set of cases where
> a.equals(b) and a != b. In all of these cases, it would not only be
> correct but also extremely useful, having pretty strong guarantees about
> the distribution of hash values.
>
> [1] Actually, for good performance, hash codes should go one step
> further and make slightly stronger guarantees about independence with
> respect to the size of the hash table. But I digress.
>

I think there are two reasonably usable ways of handling this issue. One
is the current arrangement, in which every class has a hashCode that is
expected to be usable for selecting a hash table bucket.

Keeping hashCode as an Object method but making it useless for bucket
selection unless overridden would not be a good alternative.

A more reasonable alternative would be to have hashCode as the only
member of a HashKey interface that would be implemented by every class
whose objects are intended to be suitable for use as hash keys. Those
objects that have a hashCode would still have to have a usable one, but
some classes would not implement HashKey and not have a hashCode at all.
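
A rough sketch of what that design might look like. Everything here is
hypothetical: in real Java every class already inherits hashCode() from
Object, so the interface method is given a different name (hashKey) to
make the opt-in visible, and Point and TinyHashSet are invented purely
for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical opt-in interface: only classes meant to be hash keys implement it.
interface HashKey {
    int hashKey();
}

// Hypothetical key class that opts in.
final class Point implements HashKey {
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Point && ((Point) o).x == x && ((Point) o).y == y;
    }

    @Override
    public int hashKey() {
        return 31 * x + y;
    }

    @Override
    public int hashCode() {
        return hashKey(); // keeps the real Object contract consistent as well
    }
}

// Toy hash set that only accepts HashKey elements; the compiler rejects the rest.
final class TinyHashSet<K extends HashKey> {
    private final List<List<K>> buckets = new ArrayList<>();

    TinyHashSet(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<K>());
        }
    }

    private List<K> bucketFor(K key) {
        // Clear the sign bit so the index is never negative.
        return buckets.get((key.hashKey() & 0x7fffffff) % buckets.size());
    }

    boolean add(K key) {
        List<K> bucket = bucketFor(key);
        if (bucket.contains(key)) {
            return false;
        }
        bucket.add(key);
        return true;
    }

    boolean contains(K key) {
        return bucketFor(key).contains(key);
    }
}
```

Under this arrangement a declaration such as TinyHashSet<StringBuilder>
would not compile, which is the property the interface is meant to buy.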

Patricia

