Re: HashSet keeps all nonidentical equal objects in memory

Path csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!news.glorb.com!postnews.google.com!x10g2000vbl.googlegroups.com!not-for-mail
From Robert Klemme <shortcutter@googlemail.com>
Newsgroups comp.lang.java.programmer
Subject Re: HashSet keeps all nonidentical equal objects in memory
Date Wed, 20 Jul 2011 08:38:49 -0700 (PDT)
Organization http://groups.google.com
Lines 42
Message-ID <c8b56e6e-b04f-4831-b6ab-712b10402a50@x10g2000vbl.googlegroups.com> (permalink)
References <2f8556b7-4d08-4adb-a455-7997fcff0829@m10g2000yqd.googlegroups.com>
NNTP-Posting-Host 193.0.246.21
Mime-Version 1.0
Content-Type text/plain; charset=ISO-8859-1
X-Trace posting.google.com 1311176443 20732 127.0.0.1 (20 Jul 2011 15:40:43 GMT)
X-Complaints-To groups-abuse@google.com
NNTP-Posting-Date Wed, 20 Jul 2011 15:40:43 +0000 (UTC)
Complaints-To groups-abuse@google.com
Injection-Info x10g2000vbl.googlegroups.com; posting-host=193.0.246.21; posting-account=MGO7qgoAAABvyo26eHVDO00044spH-ws
User-Agent G2/1.0
X-HTTP-Via 1.1 webwasher (Webwasher 6.8.7.9396)
X-Google-Web-Client true
X-Google-Header-Order ASELNKCHRUV
X-HTTP-UserAgent Mozilla/5.0 (Windows NT 5.1; rv:5.0.1) Gecko/20100101 Firefox/5.0.1,gzip(gfe)
Xref x330-a1.tempe.blueboxinc.net comp.lang.java.programmer:6309


On 20 Jul., 11:43, Frederik <landcglo...@gmail.com> wrote:
> I've been doing java programming for over 10 years, but now I've
> encountered a phenomenon that I wasn't aware of at all.

Apparently you didn't - as you found out in the meantime. :-)

> I had an application in which I have a HashSet<String>. I added a lot
> of different String objects to this HashSet, but many of the String
> objects are equal to each other. Now, after a while my application ran
> out of memory, even with -Xmx1500M. This happened when there were only
> about 7000 different Strings in the set! I didn't understand this,
> until I started adding the "intern()" of every String object to the
> set instead of the original String object. Now the program needs
> virtually no memory anymore.
> There is only one explanation: before I used "intern()", ALL the
> different String objects, even the ones that are equal, were kept in
> memory by the HashSet! No matter how strange it sounds. I was
> wondering, does anybody have an explanation as to why this is the case?

No, that conclusion is not warranted by the facts.  You only know that
*something* kept hold of a lot of memory (String instances).  Since we
know neither the full code nor the application architecture, we can
only speculate, but it seems a realistic assumption that those String
instances are kept not only by the HashSet but also somewhere else.
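
To illustrate why the HashSet alone is an unlikely culprit, here is a minimal sketch (not from the original application; the class and method names are made up for illustration). It relies only on the documented behaviour of HashSet.add(), which leaves the set unchanged and returns false when an equal element is already present, so the second, equal String is never stored.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; names are illustrative only.
class HashSetDuplicateDemo {

    /** Adds two equal but non-identical Strings and returns the set's final size. */
    static int sizeAfterAddingEqualStrings() {
        Set<String> set = new HashSet<>();
        String first = new String("repeated");   // a distinct object
        String second = new String("repeated");  // equal to first, but not identical
        set.add(first);
        set.add(second); // returns false: an equal element is already present
        return set.size();
    }

    /** Returns true if the set still holds the first instance after the second add. */
    static boolean keepsFirstInstance() {
        Set<String> set = new HashSet<>();
        String first = new String("repeated");
        String second = new String("repeated");
        set.add(first);
        boolean added = set.add(second);
        String stored = set.iterator().next();
        // Identity comparison on purpose: we want to know WHICH object is stored.
        return !added && stored == first && stored != second;
    }

    public static void main(String[] args) {
        System.out.println("size = " + sizeAfterAddingEqualStrings());
        System.out.println("keeps first instance = " + keepsFirstInstance());
    }
}
```

If the set holds only one instance per distinct value, the remaining equal instances can stay in memory only while something else still references them.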

An easy way to create such a situation is to read repeated content
from some external source (e.g. a file) and, for each record, create an
object which - among other things - holds the String.  Now you have
1,000,000 objects holding on to 1,000,000 String instances, but there
are only 7,000 different character sequences.  In such a situation it
may be better to have a HashMap<String,String> in which you store each
String only once and reuse that first instance.  Basically this is what
happened when you used String.intern(), except that you no longer have
control over this storage.  Depending on the application type, that can
still create a serious memory leak, e.g. in a long-running app which
over time reads multiple files with different sets of repeated strings.
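
The HashMap<String,String> idea could be sketched roughly as follows. This is only an illustration under assumptions: StringPool, canonical(), size() and clear() are hypothetical names, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-controlled pool; names are illustrative only.
class StringPool {
    // Maps each distinct character sequence to the first instance seen.
    private final Map<String, String> pool = new HashMap<>();

    /** Returns the first stored instance equal to s, storing s itself if it is new. */
    String canonical(String s) {
        String existing = pool.get(s);
        if (existing != null) {
            return existing;   // reuse the first instance; s can be garbage collected
        }
        pool.put(s, s);
        return s;
    }

    /** Number of distinct Strings currently held. */
    int size() {
        return pool.size();
    }

    /** Drops all stored instances, e.g. after one input file has been processed. */
    void clear() {
        pool.clear();
    }

    public static void main(String[] args) {
        StringPool pool = new StringPool();
        String a = pool.canonical(new String("record"));
        String b = pool.canonical(new String("record"));
        System.out.println(a == b);       // true: both refer to the same instance
        System.out.println(pool.size());  // 1
    }
}
```

Unlike String.intern(), the application decides how long such a pool lives: it can be cleared or discarded once a file is done, so Strings from earlier inputs do not accumulate.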

Kind regards

robert



Thread

HashSet keeps all nonidentical equal objects in memory Frederik <landcglobal@gmail.com> - 2011-07-20 02:43 -0700
  Re: HashSet keeps all nonidentical equal objects in memory Eric Sosman <esosman@ieee-dot-org.invalid> - 2011-07-20 07:30 -0400
  Re: HashSet keeps all nonidentical equal objects in memory Frederik <landcglobal@gmail.com> - 2011-07-20 04:09 -0700
    Re: HashSet keeps all nonidentical equal objects in memory markspace <-@.> - 2011-07-20 08:22 -0700
  Re: HashSet keeps all nonidentical equal objects in memory Robert Klemme <shortcutter@googlemail.com> - 2011-07-20 08:38 -0700
    Re: HashSet keeps all nonidentical equal objects in memory lewbloch <lewbloch@gmail.com> - 2011-07-20 09:31 -0700
