From: Lew <lewbloch@gmail.com>
Newsgroups: comp.lang.java.programmer
Subject: Re: Bulk Array Element Allocation, is it faster?
Date: Sun, 25 Sep 2011 15:43:16 -0700 (PDT)
Organization: http://groups.google.com
Lines: 99
Message-ID: <25084990.892.1316990596220.JavaMail.geo-discussion-forums@prfb12>
References: <j5lvf0$bhl$1@news.albasani.net> <ke2t77p38ktjf6bi8fvng7mo0a2a0cad8e@4ax.com> <j5mpq6$lra$1@news.albasani.net> <j5n5jo$r2r$1@dont-email.me> <j5nnv6$t23$1@news.albasani.net> <31815149.2253.1316975280430.JavaMail.geo-discussion-forums@prfp13> <j5nv6v$h44$1@news.albasani.net>

On Sunday, September 25, 2011 12:25:18 PM UTC-7, Jan Burse wrote:
> Lew wrote:
> > How would you imagine the JIT would optimize this
> > without taking semantics into account?  Each 'Bla'
> > is required to be independently GC-able, you know.
>
> In my posts I mentioned two possible optimizations.
> Optimization 1: Moving the locking outside of the
> loop, Optimization 2: Allocating at once n*X where

That's not an optimization because things might need to happen in
between loop cycles, so you can't "lock the heap" that long and call it
an "optimization".

Semantically an array of n references is an array of n separate
references.  You can't count on the memory allocations to be contiguous,
shouldn't have to count on that, and you certainly can't count on each
referenced instance becoming eligible for GC at the same time.  GC in
turn moves things around anyway, so any "benefit" of contiguous
allocation will be gone quite quickly, if it ever existed in the first
place.
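
A minimal sketch of that distinction.  The name 'Bla' comes from this
thread; the constructor counter and the 'ReferenceArrayDemo' class are
assumptions added purely for illustration.

```java
final class ReferenceArrayDemo {
    // Hypothetical stand-in for the 'Bla' class discussed in the thread.
    static final class Bla {
        // Counts constructor calls; added only so the demo can observe them.
        static int constructed = 0;

        Bla() {
            constructed++;
        }
    }

    // Allocates the array only: n null references, no 'Bla' instances.
    static Bla[] allocateReferences(int n) {
        return new Bla[n];
    }

    // Fills every slot with its own, separately allocated instance.
    static void fill(Bla[] bla) {
        for (int i = 0; i < bla.length; i++) {
            bla[i] = new Bla();
        }
    }

    public static void main(String[] args) {
        Bla[] bla = allocateReferences(4);
        System.out.println("after new Bla[4]: " + Bla.constructed + " instances");
        fill(bla);
        System.out.println("after the loop:   " + Bla.constructed + " instances");
    }
}
```

The array creation and the n instance creations are separate operations
in the source, and nothing in the array creation says what will later be
stored in the slots.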

And the point you ignore is that the "benefit" of contiguous allocation
is questionable, let alone that of your rather bizarre suggestion to
move the "locking of the heap" outside the loop.  Each 'new' just bumps
up the heap pointer, so it's not much faster to bump it once than n
times, in the grand scheme of things.  Not when your tiny advantage is
quickly eaten up by circumstances immediately afterward anyway.
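
A toy model of pointer bumping, to show how little there is to save.
This is not how any real JVM is implemented; 'ToyBumpAllocator', its
fields, and the 16-byte instance size are all invented for illustration,
and the model ignores synchronization and GC entirely.

```java
final class ToyBumpAllocator {
    private final int capacity; // size of the pretend heap region, in bytes
    private int top = 0;        // next free offset, i.e. "the heap pointer"
    private int bumps = 0;      // how many times the pointer was moved

    ToyBumpAllocator(int capacity) {
        this.capacity = capacity;
    }

    // Reserves 'size' bytes and returns the offset where they start.
    int allocate(int size) {
        if (size < 0 || size > capacity - top) {
            throw new IllegalStateException("toy heap cannot satisfy " + size);
        }
        int start = top;
        top += size; // the entire cost of an allocation in this model
        bumps++;
        return start;
    }

    int top() { return top; }

    int bumps() { return bumps; }

    public static void main(String[] args) {
        int n = 1000;
        int x = 16; // assumed size of one instance, in bytes

        ToyBumpAllocator oneByOne = new ToyBumpAllocator(1 << 20);
        for (int i = 0; i < n; i++) {
            oneByOne.allocate(x);
        }

        ToyBumpAllocator bulk = new ToyBumpAllocator(1 << 20);
        bulk.allocate(n * x);

        // Same final pointer; the difference is n additions versus one.
        System.out.println(oneByOne.top() + " bytes in " + oneByOne.bumps() + " bumps");
        System.out.println(bulk.top() + " bytes in " + bulk.bumps() + " bump");
    }
}
```

In this model the whole saving from a bulk request is n - 1 integer
additions and bounds checks.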


> X is the space for the Bla record, not the space
> for the Bla pointer.
>

But that's the thing that's semantically different!

You can't "optimize" allocation of n references to also create a block
of n instances.  The optimizer cannot know that that is the intended use
of the array.  You can't "optimize" the allocation of n instances
consecutively either, because you have to know a whole lot about what
those n constructors will try to do, including possibly allocating heap
themselves or throwing an exception partway through.  For the few cases
where block allocation *might* provide negligible speedup, it's not
worth the analysis effort to determine that this one time is one of
those rare exceptions when allocation might be sped up enough for anyone
to notice.
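
A minimal sketch of the constructor problem.  This 'Bla' variant, with a
constructor that allocates and that fails for index 3, is a hypothetical
example and not code from this thread.

```java
final class ConstructorSideEffects {
    // Hypothetical 'Bla' whose constructor allocates and may throw.
    static final class Bla {
        final int[] scratch;

        Bla(int index) {
            if (index == 3) {
                throw new IllegalStateException("constructor " + index + " failed");
            }
            scratch = new int[index + 1]; // a nested heap allocation
        }
    }

    // Fills the array until a constructor throws; returns how many succeeded.
    static int fillUntilFailure(Bla[] bla) {
        int built = 0;
        try {
            for (int i = 0; i < bla.length; i++) {
                bla[i] = new Bla(i);
                built++;
            }
        } catch (IllegalStateException e) {
            // Slots from index 'built' onward stay null.
        }
        return built;
    }

    public static void main(String[] args) {
        Bla[] bla = new Bla[8];
        int built = fillUntilFailure(bla);
        System.out.println(built + " of " + bla.length + " instances were built");
    }
}
```

A block reserved up front for all eight instances would have been sized
wrongly twice over here: the constructors need extra heap of their own,
and most of the instances never come into existence.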

As people keep pointing out in this conversation.

> Since after allocation, we have the assignment
> bla[i] = new Bla(), the object cannot escape as
> long as bla does not escape, so there should be
> not much problem with GC, but I am not 100% sure
> what the problems with GC should be.

What do you even mean by this?  What do you mean by the object "escaping"?

I don't know about "problems" with GC, but you cannot be sure that the
individual instances pointed to by the 'bla[i]' pointers will become
unreachable at the same time.  Ergo you cannot know ahead of time when
they will be eligible for GC.  Instances that survive GC will be moved
to contiguous memory blocks, but not their original ones.  Whatever
negligible benefit you might have gotten, but more likely did not, from
contiguous allocation of the 'Bla' instances will be wiped out.
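
A minimal sketch of instances from one array having different lifetimes.
The 'keptElsewhere' list and the 'SeparateLifetimes' class are
hypothetical, standing in for any other part of a program that holds on
to one element.

```java
import java.util.ArrayList;
import java.util.List;

final class SeparateLifetimes {
    static final class Bla { }

    // A longer-lived collection, standing in for some other part of a program.
    static final List<Bla> keptElsewhere = new ArrayList<>();

    // Builds the array, hands one element to the outside, then drops the array.
    static void buildAndDrop(int n) {
        Bla[] bla = new Bla[n];
        for (int i = 0; i < n; i++) {
            bla[i] = new Bla();
        }
        keptElsewhere.add(bla[n / 2]); // this one instance outlives the array
        // When this method returns, 'bla' and the other n - 1 instances are
        // unreachable, while the instance stored above is still reachable.
    }

    public static void main(String[] args) {
        buildAndDrop(10);
        System.out.println("still reachable: " + keptElsewhere.size());
    }
}
```

The sketch deliberately does not call 'System.gc()' or observe a
collection; it only shows that reachability is decided per instance, not
per array.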

> But anyway, let the figures speak, and let's stop
> musing too much. I did some testing and it shows
> that the 64bit can do the "speed up" whereas the
> 32bit cannot do it:
>
> OS    JDK  Arch   Bulk    Lazy    Delta %
> Win   1.6  64bit   8'771   9'805  11.8%
> Win   1.6  32bit  14'587  14'744   1.1%
> Win   1.5  32bit  17'139  17'405   1.6%
> Mac   1.6  64bit  11'003  12'363  12.4%
> Unix  1.6  32bit  26'517  26'858   1.3%
>
> Still this leaves open the question whether the
> 64-bit JIT is clever or the 64-bit JVM is clever
> or the 64-bit CPU is clever.
>
> Definitely it seems that the 32-bit is not clever,
> there we all see a small overhead for the lazy,
> which I interpret as the additional checks, eventually
> done repeatedly, for the lazy.
>
>> The answer is "nothing", because the semantics
>> of the operation are such that the current
>> mechanism is already quite close to optimal.
>> This is what people have been explaining to you,
>> that you dismissed as irrelevant.
>
> Where do you see "nothing" in the above figures?

Everywhere.  You have some questionable method of timing something that
you do not describe, let alone with any precision, using doubtful
methodology and suspect reasoning without reference to experimental
error or confidence analysis.  Your numbers literally tell me nothing.
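
For comparison, a minimal sketch of a measurement that at least reports
its spread.  The code that produced the figures above was never posted,
so the 'bulk' and 'lazy' workloads here are guesses at what was meant,
and every name in 'TimingSketch' is invented for illustration.  It warms
up, repeats each measurement, and prints the mean and the sample
standard deviation; a serious benchmark would need considerably more
care than this.

```java
final class TimingSketch {
    static final class Bla { int value; }

    // Assumed "bulk" variant: fill the whole array up front, then use it.
    static long bulk(int n) {
        Bla[] bla = new Bla[n];
        for (int i = 0; i < n; i++) {
            bla[i] = new Bla();
        }
        long sum = 0;
        for (int i = 0; i < n; i++) {
            bla[i].value = i;
            sum += bla[i].value;
        }
        return sum;
    }

    // Assumed "lazy" variant: create each element on first use.
    static long lazy(int n) {
        Bla[] bla = new Bla[n];
        long sum = 0;
        for (int i = 0; i < n; i++) {
            if (bla[i] == null) {
                bla[i] = new Bla();
            }
            bla[i].value = i;
            sum += bla[i].value;
        }
        return sum;
    }

    static double mean(double[] xs) {
        double total = 0;
        for (double x : xs) total += x;
        return total / xs.length;
    }

    // Sample standard deviation (divides by n - 1).
    static double stdDev(double[] xs) {
        double m = mean(xs);
        double squares = 0;
        for (double x : xs) squares += (x - m) * (x - m);
        return Math.sqrt(squares / (xs.length - 1));
    }

    // Times 'runs' executions of one variant; each entry is in milliseconds.
    static double[] sample(boolean useBulk, int n, int runs) {
        double[] millis = new double[runs];
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            long result = useBulk ? bulk(n) : lazy(n);
            millis[r] = (System.nanoTime() - start) / 1e6;
            if (result == -1) System.out.println(result); // keep 'result' in use
        }
        return millis;
    }

    public static void main(String[] args) {
        int n = 1000000;
        sample(true, n, 20);  // warm-up runs, results discarded
        sample(false, n, 20);
        double[] b = sample(true, n, 30);
        double[] l = sample(false, n, 30);
        System.out.printf("bulk: %.2f ms +/- %.2f%n", mean(b), stdDev(b));
        System.out.printf("lazy: %.2f ms +/- %.2f%n", mean(l), stdDev(l));
    }
}
```

A difference between the two means only says something if it is large
compared with the printed spread.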

-- 
Lew