Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #19944

Re: SortedMap question?

Path csiph.com!usenet.pasdenom.info!weretis.net!feeder4.news.weretis.net!usenet.ukfsn.org!not-for-mail
From Martin Gregorie <martin@address-in-sig.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: SortedMap question?
Date Sun, 25 Nov 2012 17:47:31 +0000 (UTC)
Organization UK Free Software Network
Lines 48
Message-ID <k8tljj$6jr$1@localhost.localdomain> (permalink)
References <k8rkud$i3o$1@dont-email.me> <q4qoyde15lpk$.13amx8nbfrj28.dlg@40tude.net> <k8t8dp$tv5$1@dont-email.me>
NNTP-Posting-Host 84.45.235.129
Mime-Version 1.0
Content-Type text/plain; charset=UTF-8
Content-Transfer-Encoding 8bit
X-Trace localhost.localdomain 1353865651 6779 84.45.235.129 (25 Nov 2012 17:47:31 GMT)
X-Complaints-To usenet@localhost.localdomain
NNTP-Posting-Date Sun, 25 Nov 2012 17:47:31 +0000 (UTC)
User-Agent Pan/0.139 (Sexual Chocolate; GIT bf56508 git://git.gnome.org/pan2)
Xref csiph.com comp.lang.java.programmer:19944

Show key headers only | View raw


On Sun, 25 Nov 2012 09:02:29 -0500, Eric Sosman wrote:

> 
>      Still and all: HashMap makes a good "default" choice, and is
> "likely" to outperform TreeMap over a "reasonable range" of inputs. Turn
> to TreeMap when the order of the keys is significant -- for example,
> when  you need "closest match" for an absent key.
>
Or, if the key comparison is cheap, i.e. short strings or integer 
comparisons, and the key population is very large TreeMap can become 
attractive: with 8 million nodes you only need 23 comparisions to find a 
miss. Growing the tree isn't too expensive either: rebalancing a balanced 
red-black tree, which is necessary from time to time as it grows, is only 
a matter of adjusting pointers. 

There is one possible demerit that I found in what was the standard C 
implementation a while back: adding a node was very expensive because it 
required three mallocs: there were separate bits of heap storage for the 
pointers (left, right and parent), the key storage and the data storage.

On the box we were running at the time (a Dec Alphaserver running DEC 
Unix) the best I could get in a situation where each key was looked up 
and then added if it wasn't in the tree was around 700 lookups a second 
on a small dev box: not nearly enough. First I threw out the library 
code, combined pointers, key and data into one struct and rewrote the red-
black tree handler using Sedgewick. This pushed the rate up to 1500 and 
proved that malloc was the bottleneck, but it was still too slow for 
projected data volumes, so I moved memory management into user land by 
only mallocing for several MB at a time and allocating bits of these 
large chunks to the nodes. This rocked: the lookup rate immediately 
jumped to about 35,000 a second and, as a bonus, because all my pointers 
were relative to the big memory chunks and their places in a chunk table, 
it was possible to bring the live system's start-up time down to 2-3 
minutes by saving and reloading the chunks as files: rebuilding the red-
black tree from a database had been taking 40 minutes.

Yes, I know: all this is C and the JVM should be much better at heap 
management because 'top' shows that it only grabs and releases memory in 
large chunks which minimises the malloc/sbrk overheads. However, it does 
show that serious runtime optimisation may require you to know what's 
under the hood of library code and to realise that this code may have 
sacrificed speed on the altar of flexibility.
 

-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Next in thread | Find similar | Unroll thread


Thread

SortedMap question? Knute Johnson <nospam@rabbitbrush.frazmtn.com> - 2012-11-24 15:23 -0800
  Re: SortedMap question? Joerg Meier <joergmmeier@arcor.de> - 2012-11-25 00:57 +0100
    Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 08:49 -0500
      Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 08:59 -0500
        Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-25 14:02 +0000
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:20 -0500
        Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:24 -0500
          Re: SortedMap question? David Lamb <dalamb@cs.queensu.ca> - 2012-11-25 21:25 -0500
            Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 22:24 -0500
            Re: SortedMap question? Daniel Pitts <newsgroup.nospam@virtualinfinity.net> - 2012-11-26 15:45 -0800
              Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-26 20:29 -0500
    Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 09:02 -0500
      Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-25 17:47 +0000
        Re: SortedMap question? Knute Johnson <nospam@rabbitbrush.frazmtn.com> - 2012-11-25 16:25 -0800
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 19:36 -0500
            Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 22:58 -0500
          Re: SortedMap question? Lew <lewbloch@gmail.com> - 2012-11-25 16:53 -0800
          Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-26 02:20 +0000
      Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 14:20 -0500
        Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 14:44 -0500
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 16:18 -0500
            Re: SortedMap question? Eric Sosman <esosman@comcast-dot-net.invalid> - 2012-11-25 17:35 -0500
              Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 19:58 -0500
              Re: SortedMap question? Jim Janney <jjanney@shell.xmission.com> - 2012-11-26 12:45 -0700
                Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-26 14:26 -0600
        Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 14:16 -0600
          Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 16:21 -0500
            Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 17:07 -0600
              Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-25 20:05 -0500
          Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-26 01:34 +0000
            Re: SortedMap question? Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-11-25 20:36 -0600
              Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-12-15 12:27 +0000
  Re: SortedMap question? Robert Klemme <shortcutter@googlemail.com> - 2012-11-25 01:36 +0100
  Re: SortedMap question? Arne Vajhøj <arne@vajhoej.dk> - 2012-11-24 20:00 -0500
  Re: SortedMap question? markspace <-@.> - 2012-11-24 17:34 -0800
    Re: SortedMap question? Robert Klemme <shortcutter@googlemail.com> - 2012-11-25 15:28 +0100
      Re: SortedMap question? Martin Gregorie <martin@address-in-sig.invalid> - 2012-11-25 17:49 +0000
  Re: SortedMap question? "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-11-25 14:00 +0000

csiph-web