From: Robert Klemme
Newsgroups: comp.lang.java.programmer
Subject: Re: optimsed HashMap
Date: Mon, 26 Nov 2012 23:32:57 +0100

On 11/26/2012 10:03 PM, Daniele Futtorovic wrote:
> On 24/11/2012 07:42, Roedy Green allegedly wrote:
>> You go through the files for a website looking at each word of text
>> (avoiding HTML markup) in the HashMap. If you find it you replace it.
>>
>> Most of the time the word you look up is not in the list.
>>
>> This is a time-consuming process. I would like to speed it up.
>
> You might want to intern() the input to avoid having to recompute the
> hash every time (if applicable). Other than that, you'll either be
> wanting a better hashing algorithm, to avoid collisions, or indeed
> something altogether fancier (but riskier in terms of RoI).

How would interning help? The input is read only once anyway, and if
you mean to intern individual words of the input, then how does the JVM
do the interning? My guess would be that some form of hashing would be
used there as well - plus that internal data structure must be thread
safe...

Kind regards

robert
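
To make the question concrete, here is a minimal sketch of the kind of lookup being discussed. The class name, method names and table entries are all made up for illustration; nothing here is from Roedy's actual code. It shows two things: words parsed from input are distinct String objects until interned, and the replacement lookup works the same either way. As far as I know, the usual OpenJDK String caches its hash code per instance after the first hashCode() call, and intern() has to find the canonical copy in a JVM-internal table, so for a word that is seen only once, interning would add a lookup rather than save one.

```java
import java.util.HashMap;
import java.util.Map;

public class InternSketch {
    // Hypothetical replacement table; the entries are invented for this example.
    static final Map<String, String> REPLACEMENTS = new HashMap<>();
    static {
        REPLACEMENTS.put("colour", "color");
        REPLACEMENTS.put("centre", "center");
    }

    // Look up one word; a word that is not in the table is returned unchanged.
    static String replace(String word) {
        String replacement = REPLACEMENTS.get(word);
        return replacement != null ? replacement : word;
    }

    // Apply replace() to each space-separated word of a line.
    static String replaceLine(String line) {
        StringBuilder sb = new StringBuilder();
        for (String word : line.split(" ")) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(replace(word));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two distinct objects with equal content, as if parsed from a file.
        String a = new String("colour");
        String b = new String("colour");

        System.out.println(a == b);                   // false: different objects
        System.out.println(a.intern() == b.intern()); // true: same canonical instance

        // The HashMap lookup relies on equals()/hashCode(), not on identity,
        // so it gives the same answer with or without interning.
        System.out.println(replaceLine("the colour of the centre"));
        // prints "the color of the center"
    }
}
```

Whether interning pays off would have to be measured; the sketch only illustrates why it is not obviously a win when most words are looked up once and are not in the table.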