Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.java.programmer > #17644 > unrolled thread
| Started by | "bob smith" <bob.smith@1:261/38.remove-t9h-this> |
|---|---|
| First post | 2012-08-10 18:39 +0000 |
| Last post | 2012-08-13 18:36 +0000 |
| Articles | 20 on this page of 52 — 30 participants |
Back to article view | Back to comp.lang.java.programmer
hashCode "bob smith" <bob.smith@1:261/38.remove-t9h-this> - 2012-08-10 18:39 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-t9h-this> - 2012-08-10 18:39 +0000
Re: hashCode "markspace" <markspace@1:261/38.remove-t9h-this> - 2012-08-10 18:39 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-t9h-this> - 2012-08-10 18:39 +0000
Re: hashCode "rossum" <rossum@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Eric Sosman" <eric.sosman@1:261/38.remove-t9h-this> - 2012-08-10 18:39 +0000
Re: hashCode "bob smith" <bob.smith@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Eric Sosman" <eric.sosman@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Eric Sosman" <eric.sosman@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Robert Klemme" <robert.klemme@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-nlb-this> - 2012-08-13 18:36 +0000
Re: hashCode "Robert Klemme" <robert.klemme@1:261/38.remove-nlb-this> - 2012-08-13 18:36 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-fzq-this> - 2012-08-20 18:58 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-fzq-this> - 2012-08-20 18:58 +0000
Re: hashCode "Patricia Shanahan" <patricia.shanahan@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Joshua Cranmer" <joshua.cranmer@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Patricia Shanahan" <patricia.shanahan@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Eric Sosman" <eric.sosman@1:261/38.remove-6fd-this> - 2012-08-12 20:00 +0000
Re: hashCode "Patricia Shanahan" <patricia.shanahan@1:261/38.remove-6fd-this> - 2012-08-12 20:00 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-nlb-this> - 2012-08-13 18:36 +0000
Re: hashCode "Jan Burse" <jan.burse@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Roedy Green" <roedy.green@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Roedy Green" <roedy.green@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Joerg Meier" <joerg.meier@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Mike Winter" <mike.winter@1:261/38.remove-ya-this> - 2012-08-11 19:19 +0000
Re: hashCode "Peter Duniho" <peter.duniho@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Jan Burse" <jan.burse@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "rossum" <rossum@1:261/38.remove-ya-this> - 2012-08-11 19:19 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Eric Sosman" <eric.sosman@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Jan Burse" <jan.burse@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Jan Burse" <jan.burse@1:261/38.remove-pjg-this> - 2012-08-11 18:17 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Jan Burse" <jan.burse@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Jan Burse" <jan.burse@1:261/38.remove-m2z-this> - 2012-08-12 18:58 +0000
Re: hashCode "Lew" <lew@1:261/38.remove-6fd-this> - 2012-08-12 20:00 +0000
Re: hashCode "Arne Vajhøj" <������
høj@1:261/38.remove-nlb-this> - 2012-08-13 18:36 +0000
Page 2 of 3 — ← Prev page 1 [2] 3 Next page →
| From | "Robert Klemme" <robert.klemme@1:261/38.remove-nlb-this> |
|---|---|
| Date | 2012-08-13 18:36 +0000 |
| Message-ID | <502943B8.56784.calajapr@time.synchro.net> |
| In reply to | #17807 |
To: Arne Vajhøj
From: Robert Klemme <shortcutter@googlemail.com>
On 12.08.2012 22:59, Arne Vajhoj wrote:
> On 8/12/2012 11:06 AM, Robert Klemme wrote:
>> On 11.08.2012 01:27, Arne Vajhoj wrote:
>>> On 8/10/2012 6:22 PM, bob smith wrote:
>>>> On Friday, August 10, 2012 11:34:28 AM UTC-5, Eric Sosman wrote:
>>>>> On 8/10/2012 11:47 AM, bob smith wrote:
>>>>>> Is it always technically correct to override the hashCode function
>>>>>> like so:
>>>>>> @Override
>>>>>> public int hashCode() {
>>>>>> return 1;
>>>>>> }
>>>>>> Would it be potentially better if that was Object's implementation?
>>>>>
>>>>> Define "better."
>>>>
>>>> Better in the sense that you would never HAVE to override hashCode.
>>>>
>>>> Now, there are cases where you HAVE to override it, or your code is
>>>> very broken.
>>>
>>> It is not broken.
>>>
>>> It will perform poorly in many cases.
>>
>> Well, I would go as far as to say that it will perform poorly in all
>> cases where hashCode() is actually needed
>
> More or less.
>
>> - and that makes it broken.
>
> This thread started about whether it is correct. The term correct
> has a very specific meaning in programming and that always return 1
> code is correct.
>
> Now you talk about broken.
Actually it wasn't me who brought up the term. :-)
> If broken means not good, then you are right.
> If broken means not correct, then you are wrong. I suspect
> that broken is not very well defined.
Right. :-)
>> The whole idea of hashing is based on the fact that the hash code
>> _somehow_ represents the item to be hashed. If all items have the same
>> constant hash code there is no point in hashing at all. So while it
>> does work, it does not work as intended.
>
> It disable the entire hashing functionality and a HashMap get
> characteristics of ArrayList.
An ArrayList allows multiple occurrences of the same instance - and does not
store key value pairs. That won't be the case with HashMap as equals() (if
properly implemented) will prevent that (albeit slowly, or more correct: slower
than with a proper implementation of hashCode()). Also, a HashMap will
degenerate more to a LinkedList via the chaining of a bucket's entries.
> But the code should actually work.
Yes, sort of - depending on whether O(1) is a requirement or not. Still, the
idea to implement hashCode() in Object in a way to return a constant to avoid
necessity of overriding it is crazy - especially since you then cannot
efficiently use Object as hash key - which you can today.
Kind regards
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-fzq-this> |
|---|---|
| Date | 2012-08-20 18:58 +0000 |
| Message-ID | <50327C3B.57156.calajapr@time.synchro.net> |
| In reply to | #17836 |
To: Robert Klemme From: Arne Vajhoj <arne@vajhoej.dk> On 8/13/2012 1:17 PM, Robert Klemme wrote: > On 12.08.2012 22:59, Arne Vajhoj wrote: >> On 8/12/2012 11:06 AM, Robert Klemme wrote: >>> The whole idea of hashing is based on the fact that the hash code >>> _somehow_ represents the item to be hashed. If all items have the same >>> constant hash code there is no point in hashing at all. So while it >>> does work, it does not work as intended. >> >> It disable the entire hashing functionality and a HashMap get >> characteristics of ArrayList. > > An ArrayList allows multiple occurrences of the same instance - and does > not store key value pairs. That won't be the case with HashMap as > equals() (if properly implemented) will prevent that (albeit slowly, or > more correct: slower than with a proper implementation of hashCode()). > Also, a HashMap will degenerate more to a LinkedList via the chaining of > a bucket's entries. I guess my statement was a bit misleading. ... get O(1) characteristics for getting data similar to various List implementation. Arne --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-fzq-this> |
|---|---|
| Date | 2012-08-20 18:58 +0000 |
| Message-ID | <50327C3B.57157.calajapr@time.synchro.net> |
| In reply to | #17836 |
To: Robert Klemme From: Arne Vajhoj <arne@vajhoej.dk> On 8/13/2012 1:17 PM, Robert Klemme wrote: > On 12.08.2012 22:59, Arne Vajhoj wrote: >> But the code should actually work. > > Yes, sort of - depending on whether O(1) is a requirement or not. That is a rare functional requirement. (common non-functional) > Still, > the idea to implement hashCode() in Object in a way to return a constant > to avoid necessity of overriding it is crazy - especially since you then > cannot efficiently use Object as hash key - which you can today. Actually the more I think about the question the more logical it sounds! We are looking at two alternatives: A) default functions properly but good performance requires an override B) default gives good performance but may need an override to function properly #A actually sounds more the Java way to me. Arne --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Patricia Shanahan" <patricia.shanahan@1:261/38.remove-m2z-this> |
|---|---|
| Date | 2012-08-12 18:58 +0000 |
| Message-ID | <5027F2CB.56701.calajapr@time.synchro.net> |
| In reply to | #17691 |
To: bob smith From: Patricia Shanahan <pats@acm.org> On 8/10/2012 3:22 PM, bob smith wrote: ... > Better in the sense that you would never HAVE to override hashCode. > > Now, there are cases where you HAVE to override it, or your code is very broken. > I've decided to go back to this message, because I feel the key issue is the circumstances under which hashCode would need to be overridden if Object's version returned a constant, compared to the current situation. If Object's hashCode returned a constant, in practice anyone using an object as a key in a hash structure would want it overridden with one that has at least some chance of using multiple buckets. Without that property, a HashMap is an over-complicated, space-wasting cousin of a linked list. The problem with this is that the programmer who knows that Widget instances are going to be used as HashMap keys does not necessarily control the Widget implementation. The programmer writing Widget has no idea whether it will ever be used as a HashMap key, and therefore no way of knowing whether it is safe, assuming Widget inherits the Object equals, to also inherit the Object hashCode. Now compare to the current situation. The programmer implementing Widget decides whether to inherit a superclass equals or to write a Widget-specific equals. That programmer can assume the superclass has a hashCode that would be effective for a HashMap key, and only has to override hashCode if they are overriding equals. In practice, it is a long time since I've written a hashCode manually. Generally, when I decide to override equals I tell Eclipse to generate an equals/hashCode pair based on the fields that control whether two instances are equal. Overriding hashCode is no additional work given that I would be telling Eclipse to generate an equals based on those fields anyway. To me, the current situation seems "better". Patricia --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Joshua Cranmer" <joshua.cranmer@1:261/38.remove-m2z-this> |
|---|---|
| Date | 2012-08-12 18:58 +0000 |
| Message-ID | <5027F2CB.56703.calajapr@time.synchro.net> |
| In reply to | #17691 |
To: bob smith From: Joshua Cranmer <Pidgeot18@verizon.invalid> On 8/10/2012 6:22 PM, bob smith wrote: > Better in the sense that you would never HAVE to override hashCode. > > Now, there are cases where you HAVE to override it, or your code is very broken. Returning a constant hash code is correct in the same sense that answering "yes" to the question "Can you tell me the correct way to do this?" would be--syntactically and semantically correct, but completely contrary to the actual intent of the question. The point of the hash code is to provide a cheap way to quickly distinguish inputs (in the sense that Pr(a.hashCode() == b.hashCode() and !a.equals(b)) should be as small as possible [1]). A constant-value hash completely negates the purpose of the hash code, and this renders the hashCode again completely unusable for anything that actually wants to use it. In the default case, a.hashCode() == b.hashCode() only when a == b (this definitely holds true with 32-bit machines and I'm pretty sure it still holds true with 64-bit machines, but I'd have to reverify the JVM source code to be certain). It is thus correct so long as identity equals is correct. It is also potentially correct in a limited set of cases where a.equals(b) and a != b. In all of these cases, it would not only be correct but also extremely useful, having pretty strong guarantees about the distribution of hash values. [1] Actually, for good performance, hash codes should go one step further and make slightly stronger guarantees about independence with respect to the size of the hash table. But I digress. -- Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Patricia Shanahan" <patricia.shanahan@1:261/38.remove-m2z-this> |
|---|---|
| Date | 2012-08-12 18:58 +0000 |
| Message-ID | <5027F2CB.56704.calajapr@time.synchro.net> |
| In reply to | #17758 |
To: Joshua Cranmer From: Patricia Shanahan <pats@acm.org> On 8/12/2012 9:23 AM, Joshua Cranmer wrote: > On 8/10/2012 6:22 PM, bob smith wrote: >> Better in the sense that you would never HAVE to override hashCode. >> >> Now, there are cases where you HAVE to override it, or your code is >> very broken. > > Returning a constant hash code is correct in the same sense that > answering "yes" to the question "Can you tell me the correct way to do > this?" would be--syntactically and semantically correct, but completely > contrary to the actual intent of the question. > > The point of the hash code is to provide a cheap way to quickly > distinguish inputs (in the sense that Pr(a.hashCode() == b.hashCode() > and !a.equals(b)) should be as small as possible [1]). A constant-value > hash completely negates the purpose of the hash code, and this renders > the hashCode again completely unusable for anything that actually wants > to use it. > > In the default case, a.hashCode() == b.hashCode() only when a == b (this > definitely holds true with 32-bit machines and I'm pretty sure it still > holds true with 64-bit machines, but I'd have to reverify the JVM source > code to be certain). It is thus correct so long as identity equals is > correct. It is also potentially correct in a limited set of cases where > a.equals(b) and a != b. In all of these cases, it would not only be > correct but also extremely useful, having pretty strong guarantees about > the distribution of hash values. > > [1] Actually, for good performance, hash codes should go one step > further and make slightly stronger guarantees about independence with > respect to the size of the hash table. But I digress. > I think there are two reasonably usable ways of handling this issue. One is the current arrangement, in which every class has a hashCode that is expected to be usable for selecting a hash table bucket. Keeping hashCode as an Object method but making it useless for bucket selection unless overridden would not be a good alternative. A more reasonable alternative would be to have hashCode as the only member of a HashKey interface that would be implemented by every class whose objects are intended to be suitable for use as has keys. Those objects that have a hashCode would still have to have a usable one, but some classes would not implement HashKey and not have a hashCode at all. Patricia --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Eric Sosman" <eric.sosman@1:261/38.remove-6fd-this> |
|---|---|
| Date | 2012-08-12 20:00 +0000 |
| Message-ID | <5027FE12.56709.calajapr@time.synchro.net> |
| In reply to | #17759 |
To: Patricia Shanahan
From: Eric Sosman <esosman@ieee-dot-org.invalid>
On 8/12/2012 12:40 PM, Patricia Shanahan wrote:
>[...]
> I think there are two reasonably usable ways of handling this issue. One
> is the current arrangement, in which every class has a hashCode that is
> expected to be usable for selecting a hash table bucket.
>
> Keeping hashCode as an Object method but making it useless for bucket
> selection unless overridden would not be a good alternative.
>
> A more reasonable alternative would be to have hashCode as the only
> member of a HashKey interface that would be implemented by every class
> whose objects are intended to be suitable for use as has keys. Those
> objects that have a hashCode would still have to have a usable one, but
> some classes would not implement HashKey and not have a hashCode at all.
Ugh. So if J. Random Programmer is too lazy or unimaginative to
write hashCode(), that means I can't use his class as a HashMap key, or even
put instances in a HashSet? Ugh, again.
(And, no: I don't think a HashCalculator interface along the lines
of Comparable would save the day.)
--
Eric Sosman
esosman@ieee-dot-org.invalid
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Patricia Shanahan" <patricia.shanahan@1:261/38.remove-6fd-this> |
|---|---|
| Date | 2012-08-12 20:00 +0000 |
| Message-ID | <5027FE12.56710.calajapr@time.synchro.net> |
| In reply to | #17761 |
To: Eric Sosman From: Patricia Shanahan <pats@acm.org> On 8/12/2012 10:59 AM, Eric Sosman wrote: > On 8/12/2012 12:40 PM, Patricia Shanahan wrote: >> [...] >> I think there are two reasonably usable ways of handling this issue. One >> is the current arrangement, in which every class has a hashCode that is >> expected to be usable for selecting a hash table bucket. >> >> Keeping hashCode as an Object method but making it useless for bucket >> selection unless overridden would not be a good alternative. >> >> A more reasonable alternative would be to have hashCode as the only >> member of a HashKey interface that would be implemented by every class >> whose objects are intended to be suitable for use as has keys. Those >> objects that have a hashCode would still have to have a usable one, but >> some classes would not implement HashKey and not have a hashCode at all. > > Ugh. So if J. Random Programmer is too lazy or unimaginative to > write hashCode(), that means I can't use his class as a HashMap key, > or even put instances in a HashSet? Ugh, again. > > (And, no: I don't think a HashCalculator interface along the lines > of Comparable would save the day.) > I'm not saying that it would be better than the current situation, just better than having hashCode implementations that appear to be there, but in practice must not be used for hash bucket selection. Patricia --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Arne Vajhøj" <������ høj@1:261/38.remove-nlb-this> |
|---|---|
| Date | 2012-08-13 18:36 +0000 |
| Message-ID | <502943B4.56759.calajapr@time.synchro.net> |
| In reply to | #17759 |
To: Patricia Shanahan From: =?UTF-8?B?QXJuZSBWYWpow7hq?= <arne@vajhoej.dk> On 8/12/2012 12:40 PM, Patricia Shanahan wrote: > Keeping hashCode as an Object method but making it useless for bucket > selection unless overridden would not be a good alternative. > > A more reasonable alternative would be to have hashCode as the only > member of a HashKey interface that would be implemented by every class > whose objects are intended to be suitable for use as has keys. Those > objects that have a hashCode would still have to have a usable one, but > some classes would not implement HashKey and not have a hashCode at all. That would avoid using the hash for classes where it does not make much sense. I like it. Given that there is a gazillion lines of code out there that stores Object, then it can not be changed. Arne --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Jan Burse" <jan.burse@1:261/38.remove-m2z-this> |
|---|---|
| Date | 2012-08-12 18:58 +0000 |
| Message-ID | <5027F2CB.56705.calajapr@time.synchro.net> |
| In reply to | #17758 |
To: Joshua Cranmer From: Jan Burse <janburse@fastmail.fm> Joshua Cranmer schrieb: > > In the default case, a.hashCode() == b.hashCode() only when a == b (this > definitely holds true with 32-bit machines and I'm pretty sure it still > holds true with 64-bit machines, but I'd have to reverify the JVM source > code to be certain). Not at all. Even not for 32-bit machines. Not only because the hashCode() usually uses less than 32 bits. But also for other reasons, namely the internal algorithm how the hash is produced (although the below bug doesn't reveal much internals): http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6321873 But in the bottom of the above bug you can find code that searches for a clash. It can take quite a while, but it is not excluded. This is what a sample run produced according to the bug: 62028120: java.lang.Object@9cab16 - java.lang.Object@9cab16 So the 62028120-th new Object produced the same hashcode. So --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Roedy Green" <roedy.green@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FCD.56632.calajapr@time.synchro.net> |
| In reply to | #17644 |
To: bob smith
From: Roedy Green <see_website@mindprod.com.invalid>
On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
<bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
who said :
> @Override
> public int hashCode() {
> return 1;
> }
that's about the worst possible hashCode function.
See http://mindprod.com/jgloss/hashcode.html for tips on how to write good
ones.
--
Roedy Green Canadian Mind Products http://mindprod.com A new scientific truth
does not triumph by convincing its opponents and making them see the light,
but rather because its opponents eventually die, and a new generation grows up
that is familiar with it.
~ Max Planck 1858-04-23 1947-10-04
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Lew" <lew@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FCD.56633.calajapr@time.synchro.net> |
| In reply to | #17685 |
To: Roedy Green
From: Lew <lewbloch@gmail.com>
Roedy Green wrote:
> bob smith wrote, quoted or indirectly quoted someone
> who said :
>
>> @Override
> > public int hashCode() {
> > return 1;
> > }
>
> that's about the worst possible hashCode function.
Normally that's correct, but it's conceivable that one might do it for some
hackish reason. In most situations where one might do such an override as this,
one would do better not to override hashCode().
> See http://mindprod.com/jgloss/hashcode.html for tips on how to write
> good ones.
The default of assembling it via the mix-in of attribute hash codes using the
Knuth constants is usually good enough.
h(object) = Sum(i rnn 0.. n-1) of ( attribute[i] * pow(31, n - 1 - i) );
or
public static int calculateHash(Foo arg) {
int h = 0;
for ( each attribute of Foo that contributes to 'equals()' )
{
h = 31 * h + attribute.hashCode();
}
return h;
}
http://en.wikipedia.org/wiki/Hash_function
http://en.wikipedia.org/wiki/Java_hashCode()
http://en.wikipedia.org/wiki/Java_hashCode()#Java
--
Lew
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Roedy Green" <roedy.green@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FD1.56651.calajapr@time.synchro.net> |
| In reply to | #17686 |
To: Lew From: Roedy Green <see_website@mindprod.com.invalid> On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> wrote, quoted or indirectly quoted someone who said : > h =3D 31 * h + attribute.hashCode(); > } In my essay I recommend XOR which is an inherentely faster operation than multiply. I wonder which actually works out better. If you had a large number of fields, the multiply effect could fall off the left hand end. It is the algorithm used for String which could have very long strings, so Sun must have thought of that. -- Roedy Green Canadian Mind Products http://mindprod.com A new scientific truth does not triumph by convincing its opponents and making them see the light, but rather because its opponents eventually die, and a new generation grows up that is familiar with it. ~ Max Planck 1858-04-23 1947-10-04 --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Joerg Meier" <joerg.meier@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FD2.56657.calajapr@time.synchro.net> |
| In reply to | #17703 |
To: Roedy Green
From: Joerg Meier <joergmmeier@arcor.de>
On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote:
> On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com>
> wrote, quoted or indirectly quoted someone who said :
>> h =3D 31 * h + attribute.hashCode();
>> }
> In my essay I recommend XOR which is an inherentely faster operation
> than multiply.
Hasn't that been wrong since about the invention of the 80386 processor family
? Pretty sure by now MUL and XOR both take one cycle and that's it.
Liebe Gruesse,
Joerg
--
Ich lese meine Emails nicht, replies to Email bleiben also leider ungelesen.
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Mike Winter" <mike.winter@1:261/38.remove-ya-this> |
|---|---|
| Date | 2012-08-11 19:19 +0000 |
| Message-ID | <5026AEE3.56663.calajapr@time.synchro.net> |
| In reply to | #17709 |
To: Joerg Meier From: Mike Winter <usenet@michael-winter.me.invalid> On 11/08/2012 17:25, Joerg Meier wrote: > On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote: >[...] >> In my essay I recommend XOR which is an inherentely faster operation >> than multiply. > > Hasn't that been wrong since about the invention of the 80386 processor > family ? Not that far back: the Pentium required 9-11 cycles to complete a MUL instruction compared to 1-3 for XOR (and the like), depending on operand locations and widths. > Pretty sure by now MUL and XOR both take one cycle and that's it. More-or-less, but the former is still slower for wider operands. However, your point is well-taken: it needn't be as much a concern in most cases. -- Mike Winter Replace ".invalid" with ".uk" to reply by e-mail. --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Peter Duniho" <peter.duniho@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FD3.56658.calajapr@time.synchro.net> |
| In reply to | #17703 |
To: Roedy Green From: Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> On Sat, 11 Aug 2012 04:54:09 -0700, Roedy Green wrote: > On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> > wrote, quoted or indirectly quoted someone who said : > >> h =3D 31 * h + attribute.hashCode(); >> } > In my essay I recommend XOR which is an inherentely faster operation > than multiply. I wonder which actually works out better. If you had a > large number of fields, the multiply effect could fall off the left > hand end. It is the algorithm used for String which could have very > long strings, so Sun must have thought of that. Lew's example is better than XOR. I think you should fix your essay. There are of course even better approaches if one knows something specific about the input data. But the "multiply previous by prime, add next value" is a reasonably standard, "pretty good" composition function. XOR a well-known problems in the context of hash codes: in many situations, large numbers of combinations of values wind up mapping to the same result. Consider a pair of integers: any time those values are identical, the result of XOR is 0. On the other hand, with the multiply-and-add approach, it's much less common for "typical" integer values to combine to a hash value of 0. (Integers are an important scenario, because a 32-bit integer naturally is its own hash value...there's no reason to try to mix the bits any further as is done with other data types, so they don't). Now (unless I've done my math wrong) if your component hash codes are actually well-distributed across the entire range of a 32-bit integer (i.e. are completely uncorrelated to each other and have entirely random distribution), the XOR should work fine. It's just that it's very susceptible to creating hash value collisions when the input values aren't themselves well-distributed and so as a general technique it's not nearly as good as multiply-and-add. As far as performance goes, even if the multiply-and-add costs more (and as Joerg points out, this may only be a 1-cycle vs 2-cycle difference anyway on that specific operation), your code should not be spending a lot of time computing hash values in the first place. If that's a significant bottleneck, there are better ways to improve performance than worrying about XOR vs multiply-and-add, especially since a poor hash function can hurt performance much worse than using the wrong math operation would. Pete --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Jan Burse" <jan.burse@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FD3.56659.calajapr@time.synchro.net> |
| In reply to | #17703 |
To: Roedy Green
From: Jan Burse <janburse@fastmail.fm>
Roedy Green schrieb:
> If you had a
> large number of fields, the multiply effect could fall off the left
> hand end.
Actually this does not happen, since you multiply with 31, which is 1+2+4+8+16.
So that:
a*31+b = a*16+a*8+a*4+a*2+a+b
So for a HashMap that uses an index = hash & (2^n - 1) (which is the same as
hash mod 2^n), the impact of a will be still seen, even when it occurs at the
very left hand side.
There is some Microsoft C# HashMap implementation which does not use mod 2^n,
but instead some primes. In case the implementation choses 31 as the designated
prime, all information but for the first field will be lost. But since mod 2^32
is also applied, this might not be completely true.
For 2^n I don't know exactly how the impact could be described. I guess in a
HashMap with index = hash mod 2^1 the hash amounts to a parity bit, since the
sum in a+b acts like an xor on the first right hand bit. But 2^n with n>1 the
31 multiplication is a little more crude.
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-m2z-this> |
|---|---|
| Date | 2012-08-12 18:58 +0000 |
| Message-ID | <5027F2C9.56694.calajapr@time.synchro.net> |
| In reply to | #17703 |
To: Roedy Green From: Arne Vajhoj <arne@vajhoej.dk> On 8/11/2012 7:54 AM, Roedy Green wrote: > On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> > wrote, quoted or indirectly quoted someone who said : > >> h =3D 31 * h + attribute.hashCode(); >> } > In my essay I recommend XOR which is an inherentely faster operation > than multiply. I wonder which actually works out better. Multiply. XOR has several problems: - many small values give small result - same values in different fields give same result - two identical values give result zero + all those I did not think of. > If you had a > large number of fields, the multiply effect could fall off the left > hand end. It is the algorithm used for String which could have very > long strings, so Sun must have thought of that. The multiply effect does not fall off the left with a value like 31 (it would with 32). Arne --- BBBS/Li6 v4.10 Dada-1 * Origin: Prism bbs (1:261/38) --- Synchronet 3.16a-Win32 NewsLink 1.98 Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "rossum" <rossum@1:261/38.remove-ya-this> |
|---|---|
| Date | 2012-08-11 19:19 +0000 |
| Message-ID | <5026AEE2.56662.calajapr@time.synchro.net> |
| In reply to | #17686 |
To: Lew
From: rossum <rossum48@coldmail.com>
On Fri, 10 Aug 2012 12:45:07 -0700 (PDT), Lew <lewbloch@gmail.com> wrote:
>public static int calculateHash(Foo arg) {
> int h = 0;
>
> for ( each attribute of Foo that contributes to 'equals()' )
> {
> h = 31 * h + attribute.hashCode();
> }
> return h;
>}
Bloch starts with:
int h = 17;
He says that works beter in cases where the first one or more
attribute.hashCode() values are zero, and hence will not register.
He suggessts any constant non-zero value.
rossum
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
| From | "Arne Vajhøj" <arne.vajhøj@1:261/38.remove-pjg-this> |
|---|---|
| Date | 2012-08-11 18:17 +0000 |
| Message-ID | <50269FD0.56645.calajapr@time.synchro.net> |
| In reply to | #17685 |
To: Roedy Green
From: Arne Vajhoj <arne@vajhoej.dk>
On 8/10/2012 3:17 PM, Roedy Green wrote:
> On Fri, 10 Aug 2012 08:47:12 -0700 (PDT), bob smith
> <bob@coolfone.comze.com> wrote, quoted or indirectly quoted someone
> who said :
>
>> @Override
>> public int hashCode() {
>> return 1;
>> }
>
> that's about the worst possible hashCode function.
Yes, but the posted asked "Is it always technically correct to ..." not whether
is was "best possible".
Arne
--- BBBS/Li6 v4.10 Dada-1
* Origin: Prism bbs (1:261/38)
--- Synchronet 3.16a-Win32 NewsLink 1.98
Time Warp of the Future BBS - telnet://time.synchro.net:24
[toc] | [prev] | [next] | [standalone]
Page 2 of 3 — ← Prev page 1 [2] 3 Next page →
Back to top | Article view | comp.lang.java.programmer
csiph-web