Re: optimsed HashMap

From Roedy Green <see_website@mindprod.com.invalid>
Newsgroups comp.lang.java.programmer
Subject Re: optimsed HashMap
Date 2012-11-23 22:42 -0800
Organization Canadian Mind Products
Message-ID <8ip0b8p7blu31eub502so8cus1h9so3m9s@4ax.com>
References <8i70b8d0pm6ibk03ti4t2pv60jd0bctlcs@4ax.com> <k8p85p$hqr$1@dont-email.me>


On Fri, 23 Nov 2012 17:33:43 -0800, markspace <-@.> wrote, quoted or
indirectly quoted someone who said :

>I'm not sure what you are trying to say there.  You want the case where 
>you do not find something in a hash map to be optimized?  "Optimized" how?

>What do you mean "add to the list of words" and "freeze"?

The following is not the real problem, but it might more simply
illustrate what I am asking.

Think of an ordinary HashMap<String,String>

What it does is translate a few English words of French derivation,
putting the French accents on them, e.g. naive -> na&iuml;ve,
Napoleon -> Napol&eacute;on.

Let us say you have 100 such words you want to transform. (In my
actual problem I have about 1500 words).

You go through the files for a website, looking up each word of text
(avoiding HTML markup) in the HashMap. If you find it, you replace it.
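
A minimal sketch of that scan-and-replace loop, to make the problem
concrete. The class name AccentFixer, the two-entry word list and the
definition of a "word" as a run of ASCII letters are assumptions for
illustration only, and the step of skipping HTML markup is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AccentFixer {
    // Hypothetical word list; the real one would hold every replacement.
    private static final Map<String, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put("naive", "na&iuml;ve");
        REPLACEMENTS.put("Napoleon", "Napol&eacute;on");
    }

    // Assumption: a word is a maximal run of ASCII letters.
    private static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    // Returns text with every listed word replaced; other text is copied as is.
    static String fix(String text) {
        Matcher m = WORD.matcher(text);
        StringBuilder out = new StringBuilder(text.length() + 16);
        int last = 0; // end of the region already copied to out
        while (m.find()) {
            String replacement = REPLACEMENTS.get(m.group());
            if (replacement != null) { // the rare case: the word is in the list
                out.append(text, last, m.start()).append(replacement);
                last = m.end();
            }
        }
        out.append(text, last, text.length());
        return out.toString();
    }
}
```

Every word costs one REPLACEMENTS.get call, and almost all of those
calls return null, which is the case I would like to make cheap.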

Most of the time the word you look up is not in the list.

This is a time-consuming process.  I would like to speed it up.

My lookup has two properties that might be exploited in some variant
HashMap.

1. Nearly always the lookup fails. The code should be optimised for
this case.  If it has some fast way of knowing the element is not
there, it should do that first.

2. The list of words to look up does not change after initial
preparation. I can afford to do some special calculation to prime the
lookup. For example, I once heard of some tweaking to avoid long
collision chains for a C implementation of HashMap.
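
One way both properties might be exploited, sketched under
assumptions: freeze the map once, and at that point precompute a cheap
filter (key length range and the set of first characters) that rejects
most absent words before any hash code is computed. FrozenLookup and
its members are hypothetical names, not an existing library class, and
whether this beats a plain HashMap.get would have to be measured.

```java
import java.util.HashMap;
import java.util.Map;

final class FrozenLookup {
    private final Map<String, String> map;
    private final int minLen;   // shortest key length
    private final int maxLen;   // longest key length
    // asciiFirst[c] is true if some key starts with the ASCII char c.
    private final boolean[] asciiFirst = new boolean[128];
    // true if some key starts with a char outside ASCII.
    private final boolean nonAsciiFirst;

    // One-time preparation: copy the entries and build the filter.
    FrozenLookup(Map<String, String> source) {
        map = new HashMap<>(source);
        int min = Integer.MAX_VALUE;
        int max = 0;
        boolean other = false;
        for (String key : map.keySet()) {
            if (key.isEmpty()) {
                throw new IllegalArgumentException("empty key");
            }
            min = Math.min(min, key.length());
            max = Math.max(max, key.length());
            char c = key.charAt(0);
            if (c < 128) {
                asciiFirst[c] = true;
            } else {
                other = true;
            }
        }
        minLen = min;
        maxLen = max;
        nonAsciiFirst = other;
    }

    // Returns the replacement for word, or null if word is not a key.
    String get(String word) {
        int len = word.length();
        if (len < minLen || len > maxLen) {
            return null; // rejected without hashing
        }
        char c = word.charAt(0);
        if (c < 128 ? !asciiFirst[c] : !nonAsciiFirst) {
            return null; // no key starts with this char
        }
        return map.get(word); // survivors pay for the normal lookup
    }
}
```

How much the filter helps depends on the word list: with 1500 keys
most first letters will be in use, so the length test and a finer
filter (say, first two characters) may matter more.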

My question had two purposes.  To see if there was something available
off the shelf, and to stimulate thought on some new algorithm that
could have wider application than just my problem.

Another way of looking at the problem is it would be nice to have a
HashSet implementation that was considerably faster than a HashMap.
IIRC, currently HashSet is implemented as a HashMap.
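
To give an idea of what such a set could look like, here is a rough
sketch of a frozen string set using open addressing with linear
probing instead of HashMap's chained buckets. FrozenStringSet is a
hypothetical name, the 4x table sizing is an arbitrary choice, null
elements are not supported, and I make no claim that it outperforms
HashSet without benchmarking.

```java
import java.util.Collection;

final class FrozenStringSet {
    private final String[] slots; // null marks an empty slot
    private final int mask;       // capacity - 1, capacity is a power of two

    FrozenStringSet(Collection<String> words) {
        // At least 4 slots per word keeps the table sparse,
        // so a failed lookup usually stops at the first slot it probes.
        int capacity = 4;
        while (capacity < words.size() * 4) {
            capacity <<= 1;
        }
        slots = new String[capacity];
        mask = capacity - 1;
        for (String w : words) {
            int i = spread(w.hashCode()) & mask;
            // Probe forward until a free slot or an existing copy is found.
            while (slots[i] != null && !slots[i].equals(w)) {
                i = (i + 1) & mask;
            }
            slots[i] = w;
        }
    }

    // Mixes high bits into low bits, since only the low bits pick the slot.
    private static int spread(int h) {
        return h ^ (h >>> 16);
    }

    boolean contains(String word) {
        int i = spread(word.hashCode()) & mask;
        String s;
        while ((s = slots[i]) != null) {
            if (s.equals(word)) {
                return true;
            }
            i = (i + 1) & mask;
        }
        return false; // reached an empty slot: the word is absent
    }
}
```

The table is never more than a quarter full, so the probe loop always
reaches an empty slot and terminates.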

Such an algorithm could be used to fix your most common spelling
mistakes, to add links to magic words, to add markup to magic words,
to find and report the presence of certain words, or in my case find
acronyms and replace them with a macro for that acronym that displays
the meaning of the acronym the first time it is used on a page.
-- 
Roedy Green Canadian Mind Products http://mindprod.com
Students who hire or con others to do their homework are as foolish 
as couch potatoes who hire others to go to the gym for them. 

