Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #19978

Re: SortedMap question?

From Eric Sosman <esosman@comcast-dot-net.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: SortedMap question?
Date 2012-11-25 22:58 -0500
Organization A noiseless patient Spider
Message-ID <k8updu$f62$1@dont-email.me> (permalink)
References (1 earlier) <q4qoyde15lpk$.13amx8nbfrj28.dlg@40tude.net> <k8t8dp$tv5$1@dont-email.me> <k8tljj$6jr$1@localhost.localdomain> <k8ucsh$o2v$1@dont-email.me> <50b2b971$0$294$14726298@news.sunsite.dk>

Show all headers | View raw


On 11/25/2012 7:36 PM, Arne Vajhøj wrote:
> On 11/25/2012 7:25 PM, Knute Johnson wrote:
>> On 11/25/2012 09:47, Martin Gregorie wrote:
>>> Or, if the key comparison is cheap, i.e. short strings or integer
>>> comparisons, and the key population is very large TreeMap can become
>>> attractive: with 8 million nodes you only need 23 comparisions to find a
>>> miss. Growing the tree isn't too expensive either: rebalancing a
>>> balanced
>>> red-black tree, which is necessary from time to time as it grows, is
>>> only
>>> a matter of adjusting pointers.
>>
>> That's what I thought, that the searching for a match would be very
>> quick once the data was in the TreeMap.  The test code I wrote may not
>> have been any good.  I created a map with Strings for the keys and
>> values.  I could get about 2 million entries before I started running
>> into virtual memory issues slowing things down.  I searched for
>> non-existent elements.  Using both a HashMap and a TreeMap, the TreeMap
>> was significantly slower than the HashMap.  I even tried a
>> ConcurrentHashMap and multiple threads to do the search to see if that
>> would speed things up.  While that technique was better than TreeMap it
>> was still slower than a plain HashMap.
>>
>> So for the actual case that I am working on, I have a collection of
>> about 5000 elements and am using an Integer as the key.  Data is rarely
>> changed but often accessed.  There should never be a case where the
>> searched for key will not exist.  What would you use, a HashMap or a
>> TreeMap?
>
> If it is important then you should measure!

     Aaay-men!

> But I would still expect HashMap to be faster than TreeMap.

     Yes.  A HashMap with 5000 keys and the default 0.75 load
factor will want 5000/0.75 ~= 6667 buckets, which it will round
up to 8192.  With normal luck the 5000 keys will land in ~3743
buckets, about two-thirds of which will hold just one key.  So
a typical search involves a few arithmetic operations to locate
the bucket, then about 5000/3743 ~= 1.3 Integer#equals() calls.

     Searching a TreeMap of 5000 keys, on the other hand, will
use about lg(5000) ~= 11.3 Integer#compareTo() calls.  Uses of
equals() and compareTo() have different costs (equals() needs to
work with Objects and check their classes before comparing, but
compareTo() needs to produce a three-way answer rather than a
simpler boolean ...), but any differences are almost certainly
swamped by the disparity in call counts.

     If the key values are not too widely scattered, other
possibilities exist.  A BitSet can answer "present or absent"
queries very quickly and without taking much space, or a simple
array can serve as a Map stand-in.  But follow Arne's advice
when making your choice: Measure It!

-- 
Eric Sosman
esosman@comcast-dot-net.invalid

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

SortedMap question? Knute Johnson <nospam@rabbitbrush.frazmtn.com> - 2012-11-24 15:23 -0800
  Re: SortedMap question? Joerg Meier <joergmmeier@arcor.de> - 2012-11-25 00:57 +0100
    Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 08:49 -0500
      Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 08:59 -0500
        Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-25 14:02 +0000
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:20 -0500
        Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:24 -0500
          Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 21:25 -0500
            Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 22:24 -0500
            Re: SortedMap question? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-11-26 15:45 -0800
              Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-26 20:29 -0500
    Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 09:02 -0500
      Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-25 17:47 +0000
        Re: SortedMap question? Knute Johnson <nospam@rabbitbrush.frazmtn.com> - 2012-11-25 16:25 -0800
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 19:36 -0500
            Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 22:58 -0500
          Re: SortedMap question? Lew <lewbloch@gmail.com> - 2012-11-25 16:53 -0800
          Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-26 02:20 +0000
      Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:20 -0500
        Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 14:44 -0500
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 16:18 -0500
            Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 17:35 -0500
              Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 19:58 -0500
              Re: SortedMap question? Jim Janney <jjanney@shell.xmission.com> - 2012-11-26 12:45 -0700
                Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-26 14:26 -0600
        Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 14:16 -0600
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 16:21 -0500
            Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 17:07 -0600
              Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 20:05 -0500
          Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-26 01:34 +0000
            Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 20:36 -0600
              Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-12-15 12:27 +0000
  Re: SortedMap question? Robert Klemme <shortcutter@googlemail.com> - 2012-11-25 01:36 +0100
  Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-24 20:00 -0500
  Re: SortedMap question? markspace <-@.> - 2012-11-24 17:34 -0800
    Re: SortedMap question? Robert Klemme <shortcutter@googlemail.com> - 2012-11-25 15:28 +0100
      Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-25 17:49 +0000
  Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-25 14:00 +0000

csiph-web