Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #17577 > unrolled thread

hashCode

Started bybob smith <bob@coolfone.comze.com>
First post2012-08-10 08:47 -0700
Last post2012-08-12 17:11 -0400
Articles 20 on this page of 106 — 20 participants

Back to article view | Back to comp.lang.java.programmer


Contents

  hashCode bob smith <bob@coolfone.comze.com> - 2012-08-10 08:47 -0700
    Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 12:13 -0400
      Re: hashCode markspace <-@.> - 2012-08-10 10:13 -0700
        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 13:38 -0400
          Re: hashCode rossum <rossum48@coldmail.com> - 2012-08-11 10:36 +0100
    Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-10 12:34 -0400
      Re: hashCode bob smith <bob@coolfone.comze.com> - 2012-08-10 15:22 -0700
        Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-10 15:32 -0700
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:30 -0400
            Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:24 -0700
              Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:15 -0400
                Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:29 -0400
                  Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 22:43 -0400
                    Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:54 -0400
                      Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 21:46 -0700
                        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 16:53 -0400
                        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:00 -0400
        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:27 -0400
          Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-12 17:06 +0200
            Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 16:59 -0400
              Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-13 19:17 +0200
                Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-19 19:42 -0400
                  Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-21 10:30 +0000
                Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-19 19:47 -0400
                  Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-21 10:43 +0000
                    Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 19:04 -0400
                      Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-27 16:55 -0700
                        Re: hashCode markspace <-@.> - 2012-08-27 17:03 -0700
                          Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-27 17:49 -0700
                          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 21:37 -0400
                          Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-27 18:58 -0700
                            Re: hashCode markspace <-@.> - 2012-08-27 21:19 -0700
                              Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-28 01:06 -0700
                                Re: hashCode markspace <-@.> - 2012-08-28 09:19 -0700
                                  Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-28 16:33 -0700
                                    Re: hashCode markspace <-@.> - 2012-08-28 17:02 -0700
                                      Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 11:06 -0700
                                        Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-29 14:49 -0400
                                          Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 13:40 -0700
                                            Re: hashCode Gene Wirchenko <genew@ocis.net> - 2012-08-29 18:02 -0700
                                          Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-08-31 00:52 +0200
                                            Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-30 21:43 -0400
                                              Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-30 21:52 -0400
                                              Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-08-31 04:18 +0200
                                                Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:08 -0600
                                                  Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-31 11:38 -0400
                                                    Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-31 17:55 +0200
                                                    Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:56 -0600
                                                      Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-31 14:32 -0400
                                                        Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 14:38 -0600
                                                      Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-31 15:33 -0700
                                                        Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 16:41 -0600
                                                          Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-31 16:26 -0700
                                                            Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-09-02 11:54 -0600
                                                              Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-09-03 00:47 +0200
                                                                Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-09-03 21:44 -0600
                                              Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:08 -0600
                                                Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-31 17:58 +0200
                                        Re: hashCode markspace <-@.> - 2012-08-29 11:51 -0700
                                          Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 13:28 -0700
                                            Re: hashCode markspace <-@.> - 2012-08-29 16:05 -0700
                                              Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 16:23 -0700
                                                Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-29 20:56 -0400
                                                  Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-30 11:19 +0200
                                                  Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-30 10:03 -0700
                                                    Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-30 18:34 +0000
                                              Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-30 08:11 -0600
                                                Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-30 10:06 -0700
                                              Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-30 19:16 -0400
                              Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 13:58 -0700
                            Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 13:56 -0700
                              Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-28 14:07 -0700
                                Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 14:38 -0700
                        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 21:12 -0400
        Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 07:58 -0400
          Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:29 -0700
            Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:16 -0400
        Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 03:46 -0700
        Re: hashCode Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-08-12 12:23 -0400
          Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 09:40 -0700
            Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-12 13:59 -0400
              Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 11:17 -0700
            Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:02 -0400
          Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 19:03 +0200
    Re: hashCode Roedy Green <see_website@mindprod.com.invalid> - 2012-08-10 12:17 -0700
      Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-10 12:45 -0700
        Re: hashCode Roedy Green <see_website@mindprod.com.invalid> - 2012-08-11 04:54 -0700
          Re: hashCode Joerg Meier <joergmmeier@arcor.de> - 2012-08-11 18:25 +0200
            Re: hashCode Mike Winter <usenet@michael-winter.me.invalid> - 2012-08-11 18:53 +0100
          Re: hashCode Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2012-08-11 09:56 -0700
          Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 18:58 +0200
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:40 -0400
        Re: hashCode rossum <rossum48@coldmail.com> - 2012-08-11 18:47 +0100
      Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:25 -0400
        Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 08:00 -0400
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 09:49 -0400
    Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 15:33 +0200
      Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 15:34 +0200
      Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:34 -0700
        Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:37 -0700
        Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:19 -0400
          Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 21:48 -0700
      Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 12:08 +0200
        Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 12:18 +0200
        Re: hashCode Lew <noone@lewscanon.com> - 2012-08-12 11:27 -0700
          Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:11 -0400

Page 5 of 6 — ← Prev page 1 2 3 4 [5] 6  Next page →


#17737

FromEric Sosman <esosman@ieee-dot-org.invalid>
Date2012-08-12 13:59 -0400
Message-ID<k08qtk$j7i$1@dont-email.me>
In reply to#17735
On 8/12/2012 12:40 PM, Patricia Shanahan wrote:
>[...]
> I think there are two reasonably usable ways of handling this issue. One
> is the current arrangement, in which every class has a hashCode that is
> expected to be usable for selecting a hash table bucket.
>
> Keeping hashCode as an Object method but making it useless for bucket
> selection unless overridden would not be a good alternative.
>
> A more reasonable alternative would be to have hashCode as the only
> member of a HashKey interface that would be implemented by every class
> whose objects are intended to be suitable for use as has keys. Those
> objects that have a hashCode would still have to have a usable one, but
> some classes would not implement HashKey and not have a hashCode at all.

     Ugh.  So if J. Random Programmer is too lazy or unimaginative to
write hashCode(), that means I can't use his class as a HashMap key,
or even put instances in a HashSet?  Ugh, again.

     (And, no: I don't think a HashCalculator interface along the lines
of Comparable would save the day.)

-- 
Eric Sosman
esosman@ieee-dot-org.invalid

[toc] | [prev] | [next] | [standalone]


#17738

FromPatricia Shanahan <pats@acm.org>
Date2012-08-12 11:17 -0700
Message-ID<t6qdnShJ3pDJbrrNnZ2dnUVZ_rOdnZ2d@earthlink.com>
In reply to#17737
On 8/12/2012 10:59 AM, Eric Sosman wrote:
> On 8/12/2012 12:40 PM, Patricia Shanahan wrote:
>> [...]
>> I think there are two reasonably usable ways of handling this issue. One
>> is the current arrangement, in which every class has a hashCode that is
>> expected to be usable for selecting a hash table bucket.
>>
>> Keeping hashCode as an Object method but making it useless for bucket
>> selection unless overridden would not be a good alternative.
>>
>> A more reasonable alternative would be to have hashCode as the only
>> member of a HashKey interface that would be implemented by every class
>> whose objects are intended to be suitable for use as has keys. Those
>> objects that have a hashCode would still have to have a usable one, but
>> some classes would not implement HashKey and not have a hashCode at all.
>
>      Ugh.  So if J. Random Programmer is too lazy or unimaginative to
> write hashCode(), that means I can't use his class as a HashMap key,
> or even put instances in a HashSet?  Ugh, again.
>
>      (And, no: I don't think a HashCalculator interface along the lines
> of Comparable would save the day.)
>

I'm not saying that it would be better than the current situation, just
better than having hashCode implementations that appear to be there, but
in practice must not be used for hash bucket selection.

Patricia

[toc] | [prev] | [next] | [standalone]


#17767

FromArne Vajhøj <arne@vajhoej.dk>
Date2012-08-12 17:02 -0400
Message-ID<502819ed$0$290$14726298@news.sunsite.dk>
In reply to#17735
On 8/12/2012 12:40 PM, Patricia Shanahan wrote:
> Keeping hashCode as an Object method but making it useless for bucket
> selection unless overridden would not be a good alternative.
>
> A more reasonable alternative would be to have hashCode as the only
> member of a HashKey interface that would be implemented by every class
> whose objects are intended to be suitable for use as has keys. Those
> objects that have a hashCode would still have to have a usable one, but
> some classes would not implement HashKey and not have a hashCode at all.

That would avoid using the hash for classes where it does not make
much sense.

I like it.

Given that there is a gazillion lines of code out there that
stores Object, then it can not be changed.

Arne

[toc] | [prev] | [next] | [standalone]


#17736

FromJan Burse <janburse@fastmail.fm>
Date2012-08-12 19:03 +0200
Message-ID<k08nk7$4rs$1@news.albasani.net>
In reply to#17734
Joshua Cranmer schrieb:
>
> In the default case, a.hashCode() == b.hashCode() only when a == b (this
> definitely holds true with 32-bit machines and I'm pretty sure it still
> holds true with 64-bit machines, but I'd have to reverify the JVM source
> code to be certain).

Not at all. Even not for 32-bit machines. Not only because
the hashCode() usually uses less than 32 bits. But also for
other reasons, namely the internal algorithm how the
hash is produced (although the below bug doesn't reveal
much internals):

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6321873

But in the bottom of the above bug you can find code that
searches for a clash. It can take quite a while, but
it is not excluded. This is what a sample run produced
according to the bug:

62028120: java.lang.Object@9cab16 - java.lang.Object@9cab16

So the 62028120-th new Object produced the same hashcode.

So

[toc] | [prev] | [next] | [standalone]


#17651

FromRoedy Green <see_website@mindprod.com.invalid>
Date2012-08-10 12:17 -0700
Message-ID<qgna285gdrh2ckdr3f7ct2sp9bneaamo94@4ax.com>
In reply to#17577
On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
<bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
who said :

>	@Override
>	public int hashCode() {
>		return 1;
>	}

that's about the worst possible hashCode function.

See http://mindprod.com/jgloss/hashcode.html for tips on how to write
good ones.
-- 
Roedy Green Canadian Mind Products http://mindprod.com
A new scientific truth does not triumph by convincing its opponents and making them see the light,
but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
~ Max Planck 1858-04-23 1947-10-04 

[toc] | [prev] | [next] | [standalone]


#17654

FromLew <lewbloch@gmail.com>
Date2012-08-10 12:45 -0700
Message-ID<feb852c5-c9aa-439a-94eb-339c5e213ccf@googlegroups.com>
In reply to#17651
Roedy Green wrote:
> bob smith wrote, quoted or indirectly quoted someone
> who said :
> 
>>	@Override
> >	public int hashCode() {
> >		return 1;
> >	}
> 
> that's about the worst possible hashCode function.

Normally that's correct, but it's conceivable that one might do it for 
some hackish reason. In most situations where one might do such 
an override as this, one would do better not to override hashCode().

> See http://mindprod.com/jgloss/hashcode.html for tips on how to write
> good ones.

The default of assembling it via the mix-in of attribute hash codes 
using the Knuth constants is usually good enough.

h(object) = Sum(i ∈ 0.. n-1) of ( attribute[i] * pow(31, n - 1 - i) );

or 

public static int calculateHash(Foo arg) {
   int h = 0;

   for ( each attribute of Foo that contributes to 'equals()' )
   {
        h = 31 * h + attribute.hashCode();
   }
   return h;
}

http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Java_hashCode()
http://en.wikipedia.org/wiki/Java_hashCode()#Java

-- 
Lew

[toc] | [prev] | [next] | [standalone]


#17674

FromRoedy Green <see_website@mindprod.com.invalid>
Date2012-08-11 04:54 -0700
Message-ID<hnhc28hld8lc3ln3qe3cg044br2t6r03gr@4ax.com>
In reply to#17654
On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
wrote, quoted or indirectly quoted someone who said :

>        h =3D 31 * h + attribute.hashCode();
>   }
In my essay I recommend XOR which is an inherentely  faster operation
than multiply.  I wonder which actually works out better. If you had a
large number of fields, the multiply effect could fall off the left
hand end.  It is the algorithm used for String which could have  very
long strings, so Sun must have thought of that.
-- 
Roedy Green Canadian Mind Products http://mindprod.com
A new scientific truth does not triumph by convincing its opponents and making them see the light,
but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
~ Max Planck 1858-04-23 1947-10-04 

[toc] | [prev] | [next] | [standalone]


#17680

FromJoerg Meier <joergmmeier@arcor.de>
Date2012-08-11 18:25 +0200
Message-ID<1ps87vxmp0dl$.1mhg63qewbd8q.dlg@40tude.net>
In reply to#17674
On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:

> On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
> wrote, quoted or indirectly quoted someone who said :
>>        h =3D 31 * h + attribute.hashCode();
>>   }
> In my essay I recommend XOR which is an inherentely  faster operation
> than multiply.

Hasn't that been wrong since about the invention of the 80386 processor
family ? Pretty sure by now MUL and XOR both take one cycle and that's it.


Liebe Gruesse,
		Joerg

-- 
Ich lese meine Emails nicht, replies to Email bleiben also leider
ungelesen.

[toc] | [prev] | [next] | [standalone]


#17684

FromMike Winter <usenet@michael-winter.me.invalid>
Date2012-08-11 18:53 +0100
Message-ID<r_wVr.1360100$jg3.1010793@fx23.am4>
In reply to#17680
On 11/08/2012 17:25, Joerg Meier wrote:
> On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:
>[...]
>> In my essay I recommend XOR which is an inherentely  faster operation
>> than multiply.
>
> Hasn't that been wrong since about the invention of the 80386 processor
> family ?

Not that far back: the Pentium required 9-11 cycles to complete a MUL 
instruction compared to 1-3 for XOR (and the like), depending on operand 
locations and widths.

> Pretty sure by now MUL and XOR both take one cycle and that's it.

More-or-less, but the former is still slower for wider operands. 
However, your point is well-taken: it needn't be as much a concern in 
most cases.

-- 
Mike Winter
Replace ".invalid" with ".uk" to reply by e-mail.

[toc] | [prev] | [next] | [standalone]


#17681

FromPeter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com>
Date2012-08-11 09:56 -0700
Message-ID<77f5tv9zfx4u.zdz8a98gwa3$.dlg@40tude.net>
In reply to#17674
On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:

> On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
> wrote, quoted or indirectly quoted someone who said :
> 
>>        h =3D 31 * h + attribute.hashCode();
>>   }
> In my essay I recommend XOR which is an inherentely  faster operation
> than multiply.  I wonder which actually works out better. If you had a
> large number of fields, the multiply effect could fall off the left
> hand end.  It is the algorithm used for String which could have  very
> long strings, so Sun must have thought of that.

Lew's example is better than XOR. I think you should fix your essay.

There are of course even better approaches if one knows something specific
about the input data. But the "multiply previous by prime, add next value"
is a reasonably standard, "pretty good" composition function.

XOR a well-known problems in the context of hash codes: in many situations,
large numbers of combinations of values wind up mapping to the same result.
Consider a pair of integers: any time those values are identical, the
result of XOR is 0. On the other hand, with the multiply-and-add approach,
it's much less common for "typical" integer values to combine to a hash
value of 0.

(Integers are an important scenario, because a 32-bit integer naturally is
its own hash value...there's no reason to try to mix the bits any further
as is done with other data types, so they don't).

Now (unless I've done my math wrong) if your component hash codes are
actually well-distributed across the entire range of a 32-bit integer (i.e.
are completely uncorrelated to each other and have entirely random
distribution), the XOR should work fine.  It's just that it's very
susceptible to creating hash value collisions when the input values aren't
themselves well-distributed and so as a general technique it's not nearly
as good as multiply-and-add.

As far as performance goes, even if the multiply-and-add costs more (and as
Joerg points out, this may only be a 1-cycle vs 2-cycle difference anyway
on that specific operation), your code should not be spending a lot of time
computing hash values in the first place.  If that's a significant
bottleneck, there are better ways to improve performance than worrying
about XOR vs multiply-and-add, especially since a poor hash function can
hurt performance much worse than using the wrong math operation would.

Pete

[toc] | [prev] | [next] | [standalone]


#17682

FromJan Burse <janburse@fastmail.fm>
Date2012-08-11 18:58 +0200
Message-ID<k062us$4ei$1@news.albasani.net>
In reply to#17674
Roedy Green schrieb:
> If you had a
> large number of fields, the multiply effect could fall off the left
> hand end.

Actually this does not happen, since you multiply with 31,
which is 1+2+4+8+16. So that:

     a*31+b = a*16+a*8+a*4+a*2+a+b

So for a HashMap that uses an index = hash & (2^n - 1) (which is
the same as hash mod 2^n), the impact of a will be still seen,
even when it occurs at the very left hand side.

There is some Microsoft C# HashMap implementation which does
not use mod 2^n, but instead some primes. In case the
implementation choses 31 as the designated prime, all
information but for the first field will be lost. But since
mod 2^32 is also applied, this might not be completely true.

For 2^n I don't know exactly how the impact could be
described. I guess in a HashMap with index = hash mod 2^1 the
hash amounts to a parity bit, since the sum in a+b acts like
an xor on the first right hand bit. But 2^n with n>1 the
31 multiplication is a little more crude.

[toc] | [prev] | [next] | [standalone]


#17724

FromArne Vajhøj <arne@vajhoej.dk>
Date2012-08-11 22:40 -0400
Message-ID<502717b0$0$289$14726298@news.sunsite.dk>
In reply to#17674
On 8/11/2012 7:54 AM, Roedy Green wrote:
> On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
> wrote, quoted or indirectly quoted someone who said :
>
>>         h =3D 31 * h + attribute.hashCode();
>>    }
> In my essay I recommend XOR which is an inherentely  faster operation
> than multiply.  I wonder which actually works out better.

Multiply.

XOR has several problems:
- many small values give small result
- same values in different fields give same result
- two identical values give result zero
+ all those I did not think of.

>                                                     If you had a
> large number of fields, the multiply effect could fall off the left
> hand end.  It is the algorithm used for String which could have  very
> long strings, so Sun must have thought of that.

The multiply effect does not fall off the left with a value like 31
(it would with 32).

Arne



[toc] | [prev] | [next] | [standalone]


#17683

Fromrossum <rossum48@coldmail.com>
Date2012-08-11 18:47 +0100
Message-ID<le6d28p18pa5dkols3nli8f2m8q19mhgr1@4ax.com>
In reply to#17654
On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
wrote:

>public static int calculateHash(Foo arg) {
>   int h = 0;
>
>   for ( each attribute of Foo that contributes to 'equals()' )
>   {
>        h = 31 * h + attribute.hashCode();
>   }
>   return h;
>}
Bloch starts with:

  int h = 17;

He says that works beter in cases where the first one or more
attribute.hashCode() values are zero, and hence will not register.

He suggessts any constant non-zero value.

rossum

[toc] | [prev] | [next] | [standalone]


#17668

FromArne Vajhøj <arne@vajhoej.dk>
Date2012-08-10 19:25 -0400
Message-ID<50259874$0$287$14726298@news.sunsite.dk>
In reply to#17651
On 8/10/2012 3:17 PM, Roedy Green wrote:
> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
> who said :
>
>> 	@Override
>> 	public int hashCode() {
>> 		return 1;
>> 	}
>
> that's about the worst possible hashCode function.

Yes, but the posted asked "Is it always technically correct to ..."
not whether is was "best possible".

Arne

[toc] | [prev] | [next] | [standalone]


#17676

FromEric Sosman <esosman@ieee-dot-org.invalid>
Date2012-08-11 08:00 -0400
Message-ID<k05hg6$uhc$2@dont-email.me>
In reply to#17668
On 8/10/2012 7:25 PM, Arne Vajhøj wrote:
> On 8/10/2012 3:17 PM, Roedy Green wrote:
>> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
>> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
>> who said :
>>
>>>     @Override
>>>     public int hashCode() {
>>>         return 1;
>>>     }
>>
>> that's about the worst possible hashCode function.
>
> Yes, but the posted asked "Is it always technically correct to ..."
> not whether is was "best possible".

     He also asked whether it would "be potentially better."

-- 
Eric Sosman
esosman@ieee-dot-org.invalid

[toc] | [prev] | [next] | [standalone]


#17679

FromArne Vajhøj <arne@vajhoej.dk>
Date2012-08-11 09:49 -0400
Message-ID<502662db$0$294$14726298@news.sunsite.dk>
In reply to#17676
On 8/11/2012 8:00 AM, Eric Sosman wrote:
> On 8/10/2012 7:25 PM, Arne Vajhøj wrote:
>> On 8/10/2012 3:17 PM, Roedy Green wrote:
>>> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
>>> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
>>> who said :
>>>
>>>>     @Override
>>>>     public int hashCode() {
>>>>         return 1;
>>>>     }
>>>
>>> that's about the worst possible hashCode function.
>>
>> Yes, but the posted asked "Is it always technically correct to ..."
>> not whether is was "best possible".
>
>      He also asked whether it would "be potentially better."

"better to use Object hashCode" which again should bring the
correctness question before the performance question.

Arne

[toc] | [prev] | [next] | [standalone]


#17677

FromJan Burse <janburse@fastmail.fm>
Date2012-08-11 15:33 +0200
Message-ID<k05muh$cq6$1@news.albasani.net>
In reply to#17577
bob smith schrieb:
> Is it always technically correct to override the hashCode function like so:
>
> 	@Override
> 	public int hashCode() {
> 		return 1;
> 	}
>
> Would it be potentially better if that was Object's implementation?
>

Maybe it would make sense to spell out what the contract
for hashCode() is. Well the contract is simply, the
following invariant should hold:

    /* invariant that should hold */
    if a.equals(b) then a.hashCode()==b.hashCode()

It should be noted that this does not imply:

    /* not implied and thus not required by the invariant */
    if a.hashCode()==b.hashCode() then a.equals(b)

It is also quite unlikely that a hashCode() would satisfy
the later, although the closer it comes to the later, the
better it works for HashMap, etc..

The default objects implementation of hashCode() matches
the default objects impementation of equals(). The default
objcts implementation of equals() is ==. And the default
objects implementation of hashCode() is
System.identityHashCode().

The System identity hash code is stored in the object
and generated by the system. It does not change during
GC although the internal object address might change
during GC. It is only 32bit although internal object
addresses might by 64bit with a corresponding JVM.

Returning a constant, any constant c not only 1, would be
technically correct correct for the default implementation
of the class object. Since it trivially satisfies the
invariant:

     if a.equals(b) then c==c

is trivially true, since c==c is true. But it is not
better. Since you would get very degenerated HashMaps,
etc..

You need to override the hashhCode() when there is danger
that the invariant is not anymore satisified. This is
not the case when equals() is not overridden. So overriding
hashCode() just for fun when equals() is not overriden,
usually doesn't make sense. It will probably only slow
down the hashCode() calculation. So the following:

     hashCode() = sum attr_i * c^i

Is not necessary. But it would be a possible way to go
when equals() were overriden in the following way:

    equals(other) = and_i attr_i.equals(other.attr_i)

The above happens when you turn your object into a container
of other objects irrespective of the own object identity.
But beware if the container contains itself somewhere. This
is why we find in the code for Hashtable the following
complication:

     public synchronized int hashCode() {
     /*
      * This code detects the recursion caused by computing the hash code
      * of a self-referential hash table and prevents the stack overflow
      * that would otherwise result.  This allows certain 1.1-era
      * applets with self-referential hash tables to work.  This code
      * abuses the loadFactor field to do double-duty as a hashCode
      * in progress flag, so as not to worsen the space performance.
      * A negative load factor indicates that hash code computation is
      * in progress.
      */

Interestingly it will return a constant for the object when
it detects a loop. Maybe one could do better... Dunno

Bye

[toc] | [prev] | [next] | [standalone]


#17678

FromJan Burse <janburse@fastmail.fm>
Date2012-08-11 15:34 +0200
Message-ID<k05n0o$cq6$2@news.albasani.net>
In reply to#17677
Jan Burse schrieb:
> during GC. It is only 32bit although internal object
> addresses might by 64bit with a corresponding JVM.

Typically even less bits, since the same space
is used for some object flags.

[toc] | [prev] | [next] | [standalone]


#17717

FromLew <noone@lewscanon.com>
Date2012-08-11 16:34 -0700
Message-ID<k06q6s$h3a$2@news.albasani.net>
In reply to#17677
Jan Burse wrote:
> Maybe it would make sense to spell out what the contract
> for hashCode() is. Well the contract is simply, the
> following invariant should hold:
>
>     /* invariant that should hold */
>     if a.equals(b) then a.hashCode()==b.hashCode()

True, but if you read the specification for 'hashCode()' fully, that is not 
the entire contract, only the compiler-enforceable part of it.

The entire specification requires that as much as feasible, the 'Object' 
implementation distinguish distinct instances, and that the method generally 
support 'HashMap', which promises O(1) 'get()' and 'put()' with a "proper" 
(i.e., compliant) 'hashCode()'.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

[toc] | [prev] | [next] | [standalone]


#17718

FromLew <noone@lewscanon.com>
Date2012-08-11 16:37 -0700
Message-ID<k06qas$h3a$3@news.albasani.net>
In reply to#17717
Lew wrote:
> Jan Burse wrote:
>> Maybe it would make sense to spell out what the contract
>> for hashCode() is. Well the contract is simply, the
>> following invariant should hold:
>>
>>     /* invariant that should hold */
>>     if a.equals(b) then a.hashCode()==b.hashCode()
>
> True, but if you read the specification for 'hashCode()' fully, that is not
> the entire contract, only the compiler-enforceable part of it.

Oooops!

I made a mistake.

Not even that is compiler-enforceable.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

[toc] | [prev] | [next] | [standalone]


Page 5 of 6 — ← Prev page 1 2 3 4 [5] 6  Next page →

Back to top | Article view | comp.lang.java.programmer


csiph-web