Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #19937

Re: SortedMap question?

Path csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!eternal-september.org!feeder.eternal-september.org!mx04.eternal-september.org!.POSTED!not-for-mail
From Eric Sosman <esosman@comcast-dot-net.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: SortedMap question?
Date Sun, 25 Nov 2012 09:02:29 -0500
Organization A noiseless patient Spider
Lines 54
Message-ID <k8t8dp$tv5$1@dont-email.me> (permalink)
References <k8rkud$i3o$1@dont-email.me> <q4qoyde15lpk$.13amx8nbfrj28.dlg@40tude.net>
Mime-Version 1.0
Content-Type text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding 7bit
Injection-Date Sun, 25 Nov 2012 14:02:33 +0000 (UTC)
Injection-Info mx04.eternal-september.org; posting-host="ffb8f7085759b339c1002252b48331a4"; logging-data="30693"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+PBuAC01XpxLXeRp15dvvw"
User-Agent Mozilla/5.0 (Windows NT 5.1; rv:17.0) Gecko/17.0 Thunderbird/17.0
In-Reply-To <q4qoyde15lpk$.13amx8nbfrj28.dlg@40tude.net>
Cancel-Lock sha1:Rrn/Tq8r6yVsu5skbQZ7wpEImuY=
Xref csiph.com comp.lang.java.programmer:19937

Show key headers only | View raw


On 11/24/2012 6:57 PM, Joerg Meier wrote:
> On Sat, 24 Nov 2012 15:23:55 -0800, Knute Johnson wrote:
>
>> I have been operating under a mis-perception that a SortedMap would be
>> more quickly searched than an unsorted Map.  More specifically that the
>> cost would come when adding an element to the SortedMap.  I wrote a
>> little test program to measure the amount of time it took to look for
>> non-existent data in a Map and a SortedMap and it took somewhere between
>> twice and three times as long to look for the data in the SortedMap.
>
> Considering Map is an Interface, I question your claim that you measured
> its get(x) performance :p
>
> Assuming you used HashMap, as far as I recall, its get(x) performance is
> already at O(1) unless you mess up your hashcodes. Doesn't really go lower.

     "HashMap is O(1)" is sort of true, but sweeps a lot of dirt
under the rug.  Most obviously, there's the possibility of a bad
or unlucky hash distribution -- by no means a remote possibility,
as <http://www.cs.rice.edu/~scrosby/hash/CrosbyWallach_UsenixSec2003/>
demonstrates.  Even if the hash distribution is acceptably uniform,
there's also the matter of building the map in the first place: As
the insertions accumulate, HashMap may need to expand and re-insert
everything it already contains.  With N=n1+n2+...+nk mappings, there
could be n1+(n1+n2)+...+(n1+n2+...+nk) = k*n1+(k-1)*n2+...+1*nk
insertions altogether (a good up-front estimate of N can help here).

     Consider also the operation cost.  Searching in a TreeMap of
N mappings takes about lg(N) comparisons -- but with String keys
the comparisons will be of varying difficulties.  Comparisons near
the root will probably examine only the first few characters to
determine the result; only as the search approaches the leaves will
the Strings' tail characters come into play.  On a successful search
HashMap must iterate over the key String's entire length twice: Once
to compute the hashCode(), and again to verify that a match has in
fact been found (comparisons against non-matching keys in the same
bucket will be quick, like those near the root of a TreeMap).

     Finally, O() itself hides a lot of detail -- that is, after all,
its purpose.  Given the choice between O(N*N) Algorithm X and O(1)
Algorithm Y, which would you choose?  Might you change your mind
if you learned that X takes N*N nanoseconds while Y takes a constant
eighteen hours?  Might you change your mind yet again if you learned
that N was ten million?  O() alone is not enough to decide a question
of speed.

     Still and all: HashMap makes a good "default" choice, and is
"likely" to outperform TreeMap over a "reasonable range" of inputs.
Turn to TreeMap when the order of the keys is significant -- for
example, when  you need "closest match" for an absent key.

-- 
Eric Sosman
esosman@comcast-dot-net.invalid

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

SortedMap question? Knute Johnson <nospam@rabbitbrush.frazmtn.com> - 2012-11-24 15:23 -0800
  Re: SortedMap question? Joerg Meier <joergmmeier@arcor.de> - 2012-11-25 00:57 +0100
    Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 08:49 -0500
      Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 08:59 -0500
        Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-25 14:02 +0000
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:20 -0500
        Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:24 -0500
          Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 21:25 -0500
            Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 22:24 -0500
            Re: SortedMap question? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-11-26 15:45 -0800
              Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-26 20:29 -0500
    Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 09:02 -0500
      Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-25 17:47 +0000
        Re: SortedMap question? Knute Johnson <nospam@rabbitbrush.frazmtn.com> - 2012-11-25 16:25 -0800
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 19:36 -0500
            Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 22:58 -0500
          Re: SortedMap question? Lew <lewbloch@gmail.com> - 2012-11-25 16:53 -0800
          Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-26 02:20 +0000
      Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:20 -0500
        Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 14:44 -0500
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 16:18 -0500
            Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 17:35 -0500
              Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 19:58 -0500
              Re: SortedMap question? Jim Janney <jjanney@shell.xmission.com> - 2012-11-26 12:45 -0700
                Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-26 14:26 -0600
        Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 14:16 -0600
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 16:21 -0500
            Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 17:07 -0600
              Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 20:05 -0500
          Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-26 01:34 +0000
            Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 20:36 -0600
              Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-12-15 12:27 +0000
  Re: SortedMap question? Robert Klemme <shortcutter@googlemail.com> - 2012-11-25 01:36 +0100
  Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-24 20:00 -0500
  Re: SortedMap question? markspace <-@.> - 2012-11-24 17:34 -0800
    Re: SortedMap question? Robert Klemme <shortcutter@googlemail.com> - 2012-11-25 15:28 +0100
      Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-25 17:49 +0000
  Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-25 14:00 +0000

csiph-web