Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.java.programmer > #17577 > unrolled thread
| Started by | bob smith <bob@coolfone.comze.com> |
|---|---|
| First post | 2012-08-10 08:47 -0700 |
| Last post | 2012-08-12 17:11 -0400 |
| Articles | 20 on this page of 106 — 20 participants |
Back to article view | Back to comp.lang.java.programmer
hashCode bob smith <bob@coolfone.comze.com> - 2012-08-10 08:47 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 12:13 -0400
Re: hashCode markspace <-@.> - 2012-08-10 10:13 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 13:38 -0400
Re: hashCode rossum <rossum48@coldmail.com> - 2012-08-11 10:36 +0100
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-10 12:34 -0400
Re: hashCode bob smith <bob@coolfone.comze.com> - 2012-08-10 15:22 -0700
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-10 15:32 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:30 -0400
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:24 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:15 -0400
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:29 -0400
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 22:43 -0400
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:54 -0400
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 21:46 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 16:53 -0400
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:00 -0400
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:27 -0400
Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-12 17:06 +0200
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 16:59 -0400
Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-13 19:17 +0200
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-19 19:42 -0400
Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-21 10:30 +0000
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-19 19:47 -0400
Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-21 10:43 +0000
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 19:04 -0400
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-27 16:55 -0700
Re: hashCode markspace <-@.> - 2012-08-27 17:03 -0700
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-27 17:49 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 21:37 -0400
Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-27 18:58 -0700
Re: hashCode markspace <-@.> - 2012-08-27 21:19 -0700
Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-28 01:06 -0700
Re: hashCode markspace <-@.> - 2012-08-28 09:19 -0700
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-28 16:33 -0700
Re: hashCode markspace <-@.> - 2012-08-28 17:02 -0700
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 11:06 -0700
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-29 14:49 -0400
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 13:40 -0700
Re: hashCode Gene Wirchenko <genew@ocis.net> - 2012-08-29 18:02 -0700
Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-08-31 00:52 +0200
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-30 21:43 -0400
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-30 21:52 -0400
Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-08-31 04:18 +0200
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:08 -0600
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-31 11:38 -0400
Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-31 17:55 +0200
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:56 -0600
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-31 14:32 -0400
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 14:38 -0600
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-31 15:33 -0700
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 16:41 -0600
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-31 16:26 -0700
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-09-02 11:54 -0600
Re: hashCode Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2012-09-03 00:47 +0200
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-09-03 21:44 -0600
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-31 09:08 -0600
Re: hashCode Robert Klemme <shortcutter@googlemail.com> - 2012-08-31 17:58 +0200
Re: hashCode markspace <-@.> - 2012-08-29 11:51 -0700
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 13:28 -0700
Re: hashCode markspace <-@.> - 2012-08-29 16:05 -0700
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-29 16:23 -0700
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-29 20:56 -0400
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-30 11:19 +0200
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-30 10:03 -0700
Re: hashCode Andreas Leitgeb <avl@gamma.logic.tuwien.ac.at> - 2012-08-30 18:34 +0000
Re: hashCode Jim Janney <jjanney@shell.xmission.com> - 2012-08-30 08:11 -0600
Re: hashCode Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-08-30 10:06 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-30 19:16 -0400
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 13:58 -0700
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 13:56 -0700
Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-28 14:07 -0700
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-28 14:38 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-27 21:12 -0400
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 07:58 -0400
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:29 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:16 -0400
Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 03:46 -0700
Re: hashCode Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-08-12 12:23 -0400
Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 09:40 -0700
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-12 13:59 -0400
Re: hashCode Patricia Shanahan <pats@acm.org> - 2012-08-12 11:17 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:02 -0400
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 19:03 +0200
Re: hashCode Roedy Green <see_website@mindprod.com.invalid> - 2012-08-10 12:17 -0700
Re: hashCode Lew <lewbloch@gmail.com> - 2012-08-10 12:45 -0700
Re: hashCode Roedy Green <see_website@mindprod.com.invalid> - 2012-08-11 04:54 -0700
Re: hashCode Joerg Meier <joergmmeier@arcor.de> - 2012-08-11 18:25 +0200
Re: hashCode Mike Winter <usenet@michael-winter.me.invalid> - 2012-08-11 18:53 +0100
Re: hashCode Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2012-08-11 09:56 -0700
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 18:58 +0200
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:40 -0400
Re: hashCode rossum <rossum48@coldmail.com> - 2012-08-11 18:47 +0100
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-10 19:25 -0400
Re: hashCode Eric Sosman <esosman@ieee-dot-org.invalid> - 2012-08-11 08:00 -0400
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 09:49 -0400
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 15:33 +0200
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-11 15:34 +0200
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:34 -0700
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 16:37 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-11 22:19 -0400
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-11 21:48 -0700
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 12:08 +0200
Re: hashCode Jan Burse <janburse@fastmail.fm> - 2012-08-12 12:18 +0200
Re: hashCode Lew <noone@lewscanon.com> - 2012-08-12 11:27 -0700
Re: hashCode Arne Vajhøj <arne@vajhoej.dk> - 2012-08-12 17:11 -0400
Page 5 of 6 — ← Prev page 1 2 3 4 [5] 6 Next page →
| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Date | 2012-08-12 13:59 -0400 |
| Message-ID | <k08qtk$j7i$1@dont-email.me> |
| In reply to | #17735 |
On 8/12/2012 12:40 PM, Patricia Shanahan wrote:
>[...]
> I think there are two reasonably usable ways of handling this issue. One
> is the current arrangement, in which every class has a hashCode that is
> expected to be usable for selecting a hash table bucket.
>
> Keeping hashCode as an Object method but making it useless for bucket
> selection unless overridden would not be a good alternative.
>
> A more reasonable alternative would be to have hashCode as the only
> member of a HashKey interface that would be implemented by every class
> whose objects are intended to be suitable for use as has keys. Those
> objects that have a hashCode would still have to have a usable one, but
> some classes would not implement HashKey and not have a hashCode at all.
Ugh. So if J. Random Programmer is too lazy or unimaginative to
write hashCode(), that means I can't use his class as a HashMap key,
or even put instances in a HashSet? Ugh, again.
(And, no: I don't think a HashCalculator interface along the lines
of Comparable would save the day.)
--
Eric Sosman
esosman@ieee-dot-org.invalid
[toc] | [prev] | [next] | [standalone]
| From | Patricia Shanahan <pats@acm.org> |
|---|---|
| Date | 2012-08-12 11:17 -0700 |
| Message-ID | <t6qdnShJ3pDJbrrNnZ2dnUVZ_rOdnZ2d@earthlink.com> |
| In reply to | #17737 |
On 8/12/2012 10:59 AM, Eric Sosman wrote: > On 8/12/2012 12:40 PM, Patricia Shanahan wrote: >> [...] >> I think there are two reasonably usable ways of handling this issue. One >> is the current arrangement, in which every class has a hashCode that is >> expected to be usable for selecting a hash table bucket. >> >> Keeping hashCode as an Object method but making it useless for bucket >> selection unless overridden would not be a good alternative. >> >> A more reasonable alternative would be to have hashCode as the only >> member of a HashKey interface that would be implemented by every class >> whose objects are intended to be suitable for use as has keys. Those >> objects that have a hashCode would still have to have a usable one, but >> some classes would not implement HashKey and not have a hashCode at all. > > Ugh. So if J. Random Programmer is too lazy or unimaginative to > write hashCode(), that means I can't use his class as a HashMap key, > or even put instances in a HashSet? Ugh, again. > > (And, no: I don't think a HashCalculator interface along the lines > of Comparable would save the day.) > I'm not saying that it would be better than the current situation, just better than having hashCode implementations that appear to be there, but in practice must not be used for hash bucket selection. Patricia
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-08-12 17:02 -0400 |
| Message-ID | <502819ed$0$290$14726298@news.sunsite.dk> |
| In reply to | #17735 |
On 8/12/2012 12:40 PM, Patricia Shanahan wrote: > Keeping hashCode as an Object method but making it useless for bucket > selection unless overridden would not be a good alternative. > > A more reasonable alternative would be to have hashCode as the only > member of a HashKey interface that would be implemented by every class > whose objects are intended to be suitable for use as has keys. Those > objects that have a hashCode would still have to have a usable one, but > some classes would not implement HashKey and not have a hashCode at all. That would avoid using the hash for classes where it does not make much sense. I like it. Given that there is a gazillion lines of code out there that stores Object, then it can not be changed. Arne
[toc] | [prev] | [next] | [standalone]
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2012-08-12 19:03 +0200 |
| Message-ID | <k08nk7$4rs$1@news.albasani.net> |
| In reply to | #17734 |
Joshua Cranmer schrieb: > > In the default case, a.hashCode() == b.hashCode() only when a == b (this > definitely holds true with 32-bit machines and I'm pretty sure it still > holds true with 64-bit machines, but I'd have to reverify the JVM source > code to be certain). Not at all. Even not for 32-bit machines. Not only because the hashCode() usually uses less than 32 bits. But also for other reasons, namely the internal algorithm how the hash is produced (although the below bug doesn't reveal much internals): http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6321873 But in the bottom of the above bug you can find code that searches for a clash. It can take quite a while, but it is not excluded. This is what a sample run produced according to the bug: 62028120: java.lang.Object@9cab16 - java.lang.Object@9cab16 So the 62028120-th new Object produced the same hashcode. So
[toc] | [prev] | [next] | [standalone]
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2012-08-10 12:17 -0700 |
| Message-ID | <qgna285gdrh2ckdr3f7ct2sp9bneaamo94@4ax.com> |
| In reply to | #17577 |
On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
<bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
who said :
> @Override
> public int hashCode() {
> return 1;
> }
that's about the worst possible hashCode function.
See http://mindprod.com/jgloss/hashcode.html for tips on how to write
good ones.
--
Roedy Green Canadian Mind Products http://mindprod.com
A new scientific truth does not triumph by convincing its opponents and making them see the light,
but rather because its opponents eventually die, and a new generation grows up that is familiar with it.
~ Max Planck 1858-04-23 1947-10-04
[toc] | [prev] | [next] | [standalone]
| From | Lew <lewbloch@gmail.com> |
|---|---|
| Date | 2012-08-10 12:45 -0700 |
| Message-ID | <feb852c5-c9aa-439a-94eb-339c5e213ccf@googlegroups.com> |
| In reply to | #17651 |
Roedy Green wrote:
> bob smith wrote, quoted or indirectly quoted someone
> who said :
>
>> @Override
> > public int hashCode() {
> > return 1;
> > }
>
> that's about the worst possible hashCode function.
Normally that's correct, but it's conceivable that one might do it for
some hackish reason. In most situations where one might do such
an override as this, one would do better not to override hashCode().
> See http://mindprod.com/jgloss/hashcode.html for tips on how to write
> good ones.
The default of assembling it via the mix-in of attribute hash codes
using the Knuth constants is usually good enough.
h(object) = Sum(i ∈ 0.. n-1) of ( attribute[i] * pow(31, n - 1 - i) );
or
public static int calculateHash(Foo arg) {
int h = 0;
for ( each attribute of Foo that contributes to 'equals()' )
{
h = 31 * h + attribute.hashCode();
}
return h;
}
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Java_hashCode()
http://en.wikipedia.org/wiki/Java_hashCode()#Java
--
Lew
[toc] | [prev] | [next] | [standalone]
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2012-08-11 04:54 -0700 |
| Message-ID | <hnhc28hld8lc3ln3qe3cg044br2t6r03gr@4ax.com> |
| In reply to | #17654 |
On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> wrote, quoted or indirectly quoted someone who said : > h =3D 31 * h + attribute.hashCode(); > } In my essay I recommend XOR which is an inherentely faster operation than multiply. I wonder which actually works out better. If you had a large number of fields, the multiply effect could fall off the left hand end. It is the algorithm used for String which could have very long strings, so Sun must have thought of that. -- Roedy Green Canadian Mind Products http://mindprod.com A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it. ~ Max Planck 1858-04-23 1947-10-04
[toc] | [prev] | [next] | [standalone]
| From | Joerg Meier <joergmmeier@arcor.de> |
|---|---|
| Date | 2012-08-11 18:25 +0200 |
| Message-ID | <1ps87vxmp0dl$.1mhg63qewbd8q.dlg@40tude.net> |
| In reply to | #17674 |
On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote: > On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> > wrote, quoted or indirectly quoted someone who said : >> h =3D 31 * h + attribute.hashCode(); >> } > In my essay I recommend XOR which is an inherentely faster operation > than multiply. Hasn't that been wrong since about the invention of the 80386 processor family ? Pretty sure by now MUL and XOR both take one cycle and that's it. Liebe Gruesse, Joerg -- Ich lese meine Emails nicht, replies to Email bleiben also leider ungelesen.
[toc] | [prev] | [next] | [standalone]
| From | Mike Winter <usenet@michael-winter.me.invalid> |
|---|---|
| Date | 2012-08-11 18:53 +0100 |
| Message-ID | <r_wVr.1360100$jg3.1010793@fx23.am4> |
| In reply to | #17680 |
On 11/08/2012 17:25, Joerg Meier wrote: > On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote: >[...] >> In my essay I recommend XOR which is an inherentely faster operation >> than multiply. > > Hasn't that been wrong since about the invention of the 80386 processor > family ? Not that far back: the Pentium required 9-11 cycles to complete a MUL instruction compared to 1-3 for XOR (and the like), depending on operand locations and widths. > Pretty sure by now MUL and XOR both take one cycle and that's it. More-or-less, but the former is still slower for wider operands. However, your point is well-taken: it needn't be as much a concern in most cases. -- Mike Winter Replace ".invalid" with ".uk" to reply by e-mail.
[toc] | [prev] | [next] | [standalone]
| From | Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> |
|---|---|
| Date | 2012-08-11 09:56 -0700 |
| Message-ID | <77f5tv9zfx4u.zdz8a98gwa3$.dlg@40tude.net> |
| In reply to | #17674 |
On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote: > On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> > wrote, quoted or indirectly quoted someone who said : > >> h =3D 31 * h + attribute.hashCode(); >> } > In my essay I recommend XOR which is an inherentely faster operation > than multiply. I wonder which actually works out better. If you had a > large number of fields, the multiply effect could fall off the left > hand end. It is the algorithm used for String which could have very > long strings, so Sun must have thought of that. Lew's example is better than XOR. I think you should fix your essay. There are of course even better approaches if one knows something specific about the input data. But the "multiply previous by prime, add next value" is a reasonably standard, "pretty good" composition function. XOR a well-known problems in the context of hash codes: in many situations, large numbers of combinations of values wind up mapping to the same result. Consider a pair of integers: any time those values are identical, the result of XOR is 0. On the other hand, with the multiply-and-add approach, it's much less common for "typical" integer values to combine to a hash value of 0. (Integers are an important scenario, because a 32-bit integer naturally is its own hash value...there's no reason to try to mix the bits any further as is done with other data types, so they don't). Now (unless I've done my math wrong) if your component hash codes are actually well-distributed across the entire range of a 32-bit integer (i.e. are completely uncorrelated to each other and have entirely random distribution), the XOR should work fine. It's just that it's very susceptible to creating hash value collisions when the input values aren't themselves well-distributed and so as a general technique it's not nearly as good as multiply-and-add. As far as performance goes, even if the multiply-and-add costs more (and as Joerg points out, this may only be a 1-cycle vs 2-cycle difference anyway on that specific operation), your code should not be spending a lot of time computing hash values in the first place. If that's a significant bottleneck, there are better ways to improve performance than worrying about XOR vs multiply-and-add, especially since a poor hash function can hurt performance much worse than using the wrong math operation would. Pete
[toc] | [prev] | [next] | [standalone]
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2012-08-11 18:58 +0200 |
| Message-ID | <k062us$4ei$1@news.albasani.net> |
| In reply to | #17674 |
Roedy Green schrieb:
> If you had a
> large number of fields, the multiply effect could fall off the left
> hand end.
Actually this does not happen, since you multiply with 31,
which is 1+2+4+8+16. So that:
a*31+b = a*16+a*8+a*4+a*2+a+b
So for a HashMap that uses an index = hash & (2^n - 1) (which is
the same as hash mod 2^n), the impact of a will be still seen,
even when it occurs at the very left hand side.
There is some Microsoft C# HashMap implementation which does
not use mod 2^n, but instead some primes. In case the
implementation choses 31 as the designated prime, all
information but for the first field will be lost. But since
mod 2^32 is also applied, this might not be completely true.
For 2^n I don't know exactly how the impact could be
described. I guess in a HashMap with index = hash mod 2^1 the
hash amounts to a parity bit, since the sum in a+b acts like
an xor on the first right hand bit. But 2^n with n>1 the
31 multiplication is a little more crude.
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-08-11 22:40 -0400 |
| Message-ID | <502717b0$0$289$14726298@news.sunsite.dk> |
| In reply to | #17674 |
On 8/11/2012 7:54 AM, Roedy Green wrote: > On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> > wrote, quoted or indirectly quoted someone who said : > >> h =3D 31 * h + attribute.hashCode(); >> } > In my essay I recommend XOR which is an inherentely faster operation > than multiply. I wonder which actually works out better. Multiply. XOR has several problems: - many small values give small result - same values in different fields give same result - two identical values give result zero + all those I did not think of. > If you had a > large number of fields, the multiply effect could fall off the left > hand end. It is the algorithm used for String which could have very > long strings, so Sun must have thought of that. The multiply effect does not fall off the left with a value like 31 (it would with 32). Arne
[toc] | [prev] | [next] | [standalone]
| From | rossum <rossum48@coldmail.com> |
|---|---|
| Date | 2012-08-11 18:47 +0100 |
| Message-ID | <le6d28p18pa5dkols3nli8f2m8q19mhgr1@4ax.com> |
| In reply to | #17654 |
On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
wrote:
>public static int calculateHash(Foo arg) {
> int h = 0;
>
> for ( each attribute of Foo that contributes to 'equals()' )
> {
> h = 31 * h + attribute.hashCode();
> }
> return h;
>}
Bloch starts with:
int h = 17;
He says that works beter in cases where the first one or more
attribute.hashCode() values are zero, and hence will not register.
He suggessts any constant non-zero value.
rossum
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-08-10 19:25 -0400 |
| Message-ID | <50259874$0$287$14726298@news.sunsite.dk> |
| In reply to | #17651 |
On 8/10/2012 3:17 PM, Roedy Green wrote:
> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
> who said :
>
>> @Override
>> public int hashCode() {
>> return 1;
>> }
>
> that's about the worst possible hashCode function.
Yes, but the posted asked "Is it always technically correct to ..."
not whether is was "best possible".
Arne
[toc] | [prev] | [next] | [standalone]
| From | Eric Sosman <esosman@ieee-dot-org.invalid> |
|---|---|
| Date | 2012-08-11 08:00 -0400 |
| Message-ID | <k05hg6$uhc$2@dont-email.me> |
| In reply to | #17668 |
On 8/10/2012 7:25 PM, Arne Vajhøj wrote:
> On 8/10/2012 3:17 PM, Roedy Green wrote:
>> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
>> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
>> who said :
>>
>>> @Override
>>> public int hashCode() {
>>> return 1;
>>> }
>>
>> that's about the worst possible hashCode function.
>
> Yes, but the posted asked "Is it always technically correct to ..."
> not whether is was "best possible".
He also asked whether it would "be potentially better."
--
Eric Sosman
esosman@ieee-dot-org.invalid
[toc] | [prev] | [next] | [standalone]
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2012-08-11 09:49 -0400 |
| Message-ID | <502662db$0$294$14726298@news.sunsite.dk> |
| In reply to | #17676 |
On 8/11/2012 8:00 AM, Eric Sosman wrote:
> On 8/10/2012 7:25 PM, Arne Vajhøj wrote:
>> On 8/10/2012 3:17 PM, Roedy Green wrote:
>>> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
>>> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
>>> who said :
>>>
>>>> @Override
>>>> public int hashCode() {
>>>> return 1;
>>>> }
>>>
>>> that's about the worst possible hashCode function.
>>
>> Yes, but the posted asked "Is it always technically correct to ..."
>> not whether is was "best possible".
>
> He also asked whether it would "be potentially better."
"better to use Object hashCode" which again should bring the
correctness question before the performance question.
Arne
[toc] | [prev] | [next] | [standalone]
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2012-08-11 15:33 +0200 |
| Message-ID | <k05muh$cq6$1@news.albasani.net> |
| In reply to | #17577 |
bob smith schrieb:
> Is it always technically correct to override the hashCode function like so:
>
> @Override
> public int hashCode() {
> return 1;
> }
>
> Would it be potentially better if that was Object's implementation?
>
Maybe it would make sense to spell out what the contract
for hashCode() is. Well the contract is simply, the
following invariant should hold:
/* invariant that should hold */
if a.equals(b) then a.hashCode()==b.hashCode()
It should be noted that this does not imply:
/* not implied and thus not required by the invariant */
if a.hashCode()==b.hashCode() then a.equals(b)
It is also quite unlikely that a hashCode() would satisfy
the later, although the closer it comes to the later, the
better it works for HashMap, etc..
The default objects implementation of hashCode() matches
the default objects impementation of equals(). The default
objcts implementation of equals() is ==. And the default
objects implementation of hashCode() is
System.identityHashCode().
The System identity hash code is stored in the object
and generated by the system. It does not change during
GC although the internal object address might change
during GC. It is only 32bit although internal object
addresses might by 64bit with a corresponding JVM.
Returning a constant, any constant c not only 1, would be
technically correct correct for the default implementation
of the class object. Since it trivially satisfies the
invariant:
if a.equals(b) then c==c
is trivially true, since c==c is true. But it is not
better. Since you would get very degenerated HashMaps,
etc..
You need to override the hashhCode() when there is danger
that the invariant is not anymore satisified. This is
not the case when equals() is not overridden. So overriding
hashCode() just for fun when equals() is not overriden,
usually doesn't make sense. It will probably only slow
down the hashCode() calculation. So the following:
hashCode() = sum attr_i * c^i
Is not necessary. But it would be a possible way to go
when equals() were overriden in the following way:
equals(other) = and_i attr_i.equals(other.attr_i)
The above happens when you turn your object into a container
of other objects irrespective of the own object identity.
But beware if the container contains itself somewhere. This
is why we find in the code for Hashtable the following
complication:
public synchronized int hashCode() {
/*
* This code detects the recursion caused by computing the hash code
* of a self-referential hash table and prevents the stack overflow
* that would otherwise result. This allows certain 1.1-era
* applets with self-referential hash tables to work. This code
* abuses the loadFactor field to do double-duty as a hashCode
* in progress flag, so as not to worsen the space performance.
* A negative load factor indicates that hash code computation is
* in progress.
*/
Interestingly it will return a constant for the object when
it detects a loop. Maybe one could do better... Dunno
Bye
[toc] | [prev] | [next] | [standalone]
| From | Jan Burse <janburse@fastmail.fm> |
|---|---|
| Date | 2012-08-11 15:34 +0200 |
| Message-ID | <k05n0o$cq6$2@news.albasani.net> |
| In reply to | #17677 |
Jan Burse schrieb: > during GC. It is only 32bit although internal object > addresses might by 64bit with a corresponding JVM. Typically even less bits, since the same space is used for some object flags.
[toc] | [prev] | [next] | [standalone]
| From | Lew <noone@lewscanon.com> |
|---|---|
| Date | 2012-08-11 16:34 -0700 |
| Message-ID | <k06q6s$h3a$2@news.albasani.net> |
| In reply to | #17677 |
Jan Burse wrote: > Maybe it would make sense to spell out what the contract > for hashCode() is. Well the contract is simply, the > following invariant should hold: > > /* invariant that should hold */ > if a.equals(b) then a.hashCode()==b.hashCode() True, but if you read the specification for 'hashCode()' fully, that is not the entire contract, only the compiler-enforceable part of it. The entire specification requires that as much as feasible, the 'Object' implementation distinguish distinct instances, and that the method generally support 'HashMap', which promises O(1) 'get()' and 'put()' with a "proper" (i.e., compliant) 'hashCode()'. -- Lew Honi soit qui mal y pense. http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
[toc] | [prev] | [next] | [standalone]
| From | Lew <noone@lewscanon.com> |
|---|---|
| Date | 2012-08-11 16:37 -0700 |
| Message-ID | <k06qas$h3a$3@news.albasani.net> |
| In reply to | #17717 |
Lew wrote: > Jan Burse wrote: >> Maybe it would make sense to spell out what the contract >> for hashCode() is. Well the contract is simply, the >> following invariant should hold: >> >> /* invariant that should hold */ >> if a.equals(b) then a.hashCode()==b.hashCode() > > True, but if you read the specification for 'hashCode()' fully, that is not > the entire contract, only the compiler-enforceable part of it. Oooops! I made a mistake. Not even that is compiler-enforceable. -- Lew Honi soit qui mal y pense. http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg
[toc] | [prev] | [next] | [standalone]
Page 5 of 6 — ← Prev page 1 2 3 4 [5] 6 Next page →
Back to top | Article view | comp.lang.java.programmer
csiph-web