Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.lang.c > #386066
| From | Tim Rentsch <tr.17687@z991.linuxsc.com> |
|---|---|
| Newsgroups | comp.lang.c |
| Subject | Re: Good hash for pointers |
| Date | 2024-06-17 00:56 -0700 |
| Organization | A noiseless patient Spider |
| Message-ID | <86zfrkj93b.fsf@linuxsc.com> (permalink) |
| References | (11 earlier) <20240605005916.00001b33@yahoo.com> <86a5jzmle1.fsf@linuxsc.com> <20240605195905.00002484@yahoo.com> <86y17ilm4k.fsf@linuxsc.com> <20240606110009.00001096@yahoo.com> |
Michael S <already5chosen@yahoo.com> writes:
> On Wed, 05 Jun 2024 21:40:27 -0700
> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>
>> Michael S <already5chosen@yahoo.com> writes:
>>
>>> On Wed, 05 Jun 2024 08:58:46 -0700
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
>>>
>>>> I did get your own hash function out and put it into my little
>>>> test rig. There may be summary comments below.
>>>
>>> <snip>
>>>
>>> As you probably already paid attention, I like bottom lines.
>>> What is a bottom line here?
>>
>> The bottom line is both the multiply-by-a-large-prime hashes
>> should be avoided.
>
> I am not convinced.
Let me amend my earlier statement. A hash function that simply
multiplies its input by an integer constant should be avoided
if the goal is to produce a hash function of decent quaility
rather than one of mediocre quality.
> Of course, it is no good for Malcolm, because it can't be coded
> in what Malcolm erroneously call "ANSI C".
I don't know why you say that. C was an ANSI standard before it
was an ISO standard. Or is it that you think that the language
Malcolm is intent on using does not conform to C90/C89/ANSI C?
>>> Did you you encounter cases in which almost-bonita's-but-not-quite
>>> hash function performed badly?
>>
>> Yes. A clear example is using directly mapped hash table that
>> has a power of two elements, with inputs all from malloc( 48 )
>> calls. All keys map to one of only 1024 entries, in a hash
>> table that has 65536 slots.
>
> I don't see it.
> $ ./a 0x10000 0xC000 48
> ref: 49152 keys, 65536 slots, 34590 occupied 11295 collisions.
> Worst collision: 7
> uut: 49152 keys, 65536 slots, 35791 occupied 11872 collisions.
> Worst collision: 3
You have run a poor test. The point of quality evaluation
testing is to look for potential problems, not to disguise
them.
> It sounds like in your tests you don't use 'linear scaling to
> table size' for translation of hash value to table index.
> IMHO, you should.
Like I said in another response, the question of how hash tables
should make use of hash values is a separate topic. The point
of the tests I ran was to evaluate the quality of hash functions,
not to look at how hash tables should be coded.
Here are some statistics I gathered from a more comprehensive
set of tests for several hash functions. The tests look at
aggregates of measured values for all output bits (considered
eight bits at a time) as a function of all input bits (considered
sixteen bits at a time. The first line in each pair summarizes
the low values in each set, and the second line summarizes the
high values in each set, in each case showing the range, average,
and standard deviation. The first set, hash 0, uses the aes
code you posted. I think it's easy to see the quality get
worse going from top to bottom.
hash 0
186 .. 201 avg: 195.39 dev: 3.71
312 .. 339 avg: 322.31 dev: 5.85
hash 1
186 .. 203 avg: 195.48 dev: 4.01
310 .. 344 avg: 322.45 dev: 6.31
hash 2
160 .. 202 avg: 193.56 dev: 7.70
311 .. 348 avg: 322.22 dev: 7.91
hash 3
143 .. 202 avg: 185.52 dev: 10.15
311 .. 387 avg: 329.52 dev: 14.10
hash 4
0 .. 212 avg: 9.94 dev: 45.16
270 .. 65536 avg: 44546.56 dev: 28809.47
Back to comp.lang.c | Previous | Next — Previous in thread | Next in thread | Find similar
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-05 08:58 -0700
AES problem (was: Good hash for pointers) Michael S <already5chosen@yahoo.com> - 2024-06-05 19:51 +0300
Re: AES problem (was: Good hash for pointers) Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-05 21:24 -0700
Re: Good hash for pointers Michael S <already5chosen@yahoo.com> - 2024-06-05 19:59 +0300
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-05 21:40 -0700
Re: Good hash for pointers Michael S <already5chosen@yahoo.com> - 2024-06-06 11:00 +0300
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-17 00:56 -0700
Re: Good hash for pointers Michael S <already5chosen@yahoo.com> - 2024-06-17 12:39 +0300
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-18 10:47 -0700
Re: Good hash for pointers Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-18 15:17 -0700
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-18 16:17 -0700
Re: Good hash for pointers James Kuyper <jameskuyper@alumni.caltech.edu> - 2024-06-18 19:23 -0400
Re: Good hash for pointers Keith Thompson <Keith.S.Thompson+u@gmail.com> - 2024-06-18 17:17 -0700
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-23 11:23 -0700
Re: Good hash for pointers Michael S <already5chosen@yahoo.com> - 2024-06-06 13:35 +0300
Re: Good hash for pointers Tim Rentsch <tr.17687@z991.linuxsc.com> - 2024-06-16 04:38 -0700
Re: Good hash for pointers "Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> - 2024-06-16 12:34 -0700
csiph-web