Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #12477

Re: Additional logging questions

From Lew <noone@lewscanon.com>
Newsgroups comp.lang.java.programmer
Subject Re: Additional logging questions
Date 2012-02-27 23:51 -0800
Organization albasani.net
Message-ID <jii126$df5$1@news.albasani.net> (permalink)
References (3 earlier) <jif6ua$3cm$1@news.albasani.net> <XnsA0069221E7830jpnasty@94.75.214.39> <jigr8v$jhh$1@news.albasani.net> <XnsA006D3D7FBB0Bjpnasty@94.75.214.39> <4f4c348e$0$292$14726298@news.sunsite.dk>

Show all headers | View raw


On 02/27/2012 05:57 PM, Arne Vajhøj wrote:
> On 2/27/2012 8:48 PM, Novice wrote:
>> Lew<noone@lewscanon.com> wrote in news:jigr8v$jhh$1@news.albasani.net:
>>> But wait! 'hashCode()', which is a shortcut for 'equals()' (well,
>>> really for "not equals()"), doesn't know anything about those fields
>>> yet! It will think objects are not equal when they really are.
>>>
>>> So you have to override 'hashCode()', too, using the same fields, so
>>> that its result are consistent with 'equals()':
>>>
>>> @Override public int hashCode()
>>> {
>>> return owner.hashCode() * 31 + color.hashCode();
>>> }
>>>
>> Just curious: why multiply the owner.hashCode() by 31? There's no
>> significance to the 31 is there? Why not just add the two hashCodes
>> together without multiplying one of them first? That would be just as
>> good, right?
>
> Actually not.
>
> Some numbers are better than others (no multiply means 1).
>
> You should go for a prime.
>
> 31 is often used.
>
> But there are other good numbers.

Donald Knuth, _The Art of Computer Programming_, suggested 31 for modulo 32 
hashes because it produces good distribution on the average for random inputs.

There are better hashes, but for most purposes the one illustrated suffices.

More generally (pseudocode):

@Override int hashCode()
{
  int hash = 17; // or 0 or some other decent starter
  for (Object field : instanceFields)
  {
    hash = hash * 31;
    hash += field.hashCode();
  }
  return hash;
}

I skipped over niceties like whether to use 0 for null or empty element hashes.

The Javadocs explain some of this, but the purpose of 'hashCode()' mostly is 
to quickly tell if instances are not equal. Equal instances must have the same 
hash - that's a hard rule. But the converse cannot apply - many instances will 
have the same hash even if they're not equal.

The trick is not to have many all at once during the run of your application.

Then probability is your friend. With 2^^32 possible hashcodes, let's say for 
a map of 'String' to 'Foo', you have a lot of room for many strings before you 
get too many collisions. That means you need a hash that is about equally 
likely to pick any number as any other, so that for any given set of actual 
values at a time you probably will have different hashes for different objects.

In practice you get a few collisions, but as long as "a few" is within 
acceptable limits you're good to go. Otherwise you have to play advanced 
reindeer games with hash algorithms.

Why all this hash fooferol? Because once you have a hash value lookups are so 
much faster on 'int' comparisons than between arbitrary 'String's of varying 
length. So let's say you're trying to find an element in a 'Set' - you look 
for the hashcode first, and if you find zero or one element you're done. 
Otherwise you distinguish between the colliding elements with 'equals()', but 
hopefully only for two or at worst three elements.

Hashes are also useful in cryptography.

It's an endless topic, full of fun. You need never be bored of a winter's eve. 
I'd actually start with Wikipedia on this one.

Aside: I seem to recall an 'equals()' method builder in Apache Commons Lang 
somewhere. It's got some sort of indicative name like 'EqualsBuilder'.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Additional logging questions Novice <novice@example..com> - 2012-02-27 02:02 +0000
  Re: Additional logging questions Arne Vajhøj <arne@vajhoej.dk> - 2012-02-26 21:08 -0500
    Re: Additional logging questions Novice <novice@example..com> - 2012-02-27 04:12 +0000
      Re: Additional logging questions Lew <noone@lewscanon.com> - 2012-02-26 22:13 -0800
        Re: Additional logging questions Novice <novice@example..com> - 2012-02-27 19:14 +0000
          Re: Additional logging questions Lew <noone@lewscanon.com> - 2012-02-27 12:32 -0800
          Re: Additional logging questions Lew <noone@lewscanon.com> - 2012-02-27 13:06 -0800
            Re: Additional logging questions Novice <novice@example..com> - 2012-02-28 01:48 +0000
              Re: Additional logging questions Arne Vajhøj <arne@vajhoej.dk> - 2012-02-27 20:57 -0500
                Re: Additional logging questions Lew <noone@lewscanon.com> - 2012-02-27 23:51 -0800
          Re: Additional logging questions Arne Vajhøj <arne@vajhoej.dk> - 2012-02-27 20:50 -0500
            Re: Additional logging questions Lew <noone@lewscanon.com> - 2012-02-27 23:57 -0800
        Re: Additional logging questions Arne Vajhøj <arne@vajhoej.dk> - 2012-02-27 20:45 -0500
          Re: Additional logging questions Leif Roar Moldskred <leifm@dimnakorr.com> - 2012-02-28 01:52 -0600
            Re: Additional logging questions Arne Vajhøj <arne@vajhoej.dk> - 2012-02-28 17:20 -0500
            Re: Additional logging questions Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2012-02-28 20:26 -0400
      Re: Additional logging questions Arne Vajhøj <arne@vajhoej.dk> - 2012-02-27 20:34 -0500
  Re: Additional logging questions Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-02-27 12:36 -0800
    Re: Additional logging questions Novice <novice@example..com> - 2012-02-28 01:43 +0000

csiph-web