Path: csiph.com!x330-a1.tempe.blueboxinc.net!usenet.pasdenom.info!gegeweb.42!gegeweb.eu!nntpfeed.proxad.net!proxad.net!feeder1-2.proxad.net!74.125.46.80.MISMATCH!postnews.google.com!glegroupsg2000goo.googlegroups.com!not-for-mail From: Lew Newsgroups: comp.lang.java.programmer Subject: Re: Bulk Array Element Allocation, is it faster? Date: Sun, 25 Sep 2011 15:43:16 -0700 (PDT) Organization: http://groups.google.com Lines: 99 Message-ID: <25084990.892.1316990596220.JavaMail.geo-discussion-forums@prfb12> References: <31815149.2253.1316975280430.JavaMail.geo-discussion-forums@prfp13> Reply-To: comp.lang.java.programmer@googlegroups.com NNTP-Posting-Host: 216.239.45.130 Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable X-Trace: posting.google.com 1316990596 30442 127.0.0.1 (25 Sep 2011 22:43:16 GMT) X-Complaints-To: groups-abuse@google.com NNTP-Posting-Date: Sun, 25 Sep 2011 22:43:16 +0000 (UTC) In-Reply-To: Complaints-To: groups-abuse@google.com Injection-Info: glegroupsg2000goo.googlegroups.com; posting-host=216.239.45.130; posting-account=CP-lKQoAAAAGtB5diOuGlDQk0jIwmH0T User-Agent: G2/1.0 X-Google-Web-Client: true Xref: x330-a1.tempe.blueboxinc.net comp.lang.java.programmer:8316 On Sunday, September 25, 2011 12:25:18 PM UTC-7, Jan Burse wrote: > Lew schrieb: > > How would you imagine the JIT would optimize this > > without taking semantics into account? Each 'Bla' > > is required to be independently GC-able, you know. >=20 > In my posts I mentioned two possible optimizations. > Optimization 1: Moveing the locking outside of the > loop, Optimization 2: Allocating at once n*X where That's not an optimization because things might need to happen in between l= oop cycles, so you can't "lock the heap" that long and call it an "optimiza= tion". Semantically an array of n references is an array of n separate references.= You can't count on the memory allocations to be contiguous, shouldn't hav= e to count on that, and you certainly can't count on each reference becomin= g eligible for GC at the same time. GC in turn moves things around anyway,= so any "benefit" of contiguous allocation will be gone quite quickly, if i= t ever existed in the first place. And the point you ignore is that the "benefit" of contiguous allocation is = questionable, let alone of your rather bizarre suggestion to move the "lock= ing of the heap" outside the loop. Each 'new' just bumps up the heap point= er, so it's not much faster to bump it once than n times, in the grand sche= me of things. Not when your tiny advantage is quickly eaten up by circumst= ances immediately afterward anyway. > X is the space for the Bla record, not the space > for the Bla point. >=20 But that's the thing that's semantically different! You can't "optimize" allocation of n references to also create a block of n= instances. The optimizer cannot know that that is the intended use of the= array. You can't "optimize" the allocation of n instances consecutively e= ither, because you have to know a whole lot about what those n constructors= will try to do, including possibly allocate heap themselves or except out = prematurely. For the few cases where block allocation *might* provide negl= igible speedup, it's not worth the analysis effort to determine that this o= ne time is one of those rare exceptions when allocation might be sped up en= ough for anyone to notice. As people keep pointing out in this conversation. > Since after allocation, we have the assignment > bla[i] =3D new Bla(), the object cannot escape as > long as bla does not escape, so there should be > not much problem with GC, but I am not 100% sure > what the problems with GC should be. What do you even mean by this? What do you mean by the object "escaping"? I don't know about "problems" with GC, but you cannot be sure that the indi= vidual instances pointed to by the 'bla[i]' pointers will be descoped at th= e same time. Ergo you cannot know ahead of time when they will be eligible= for GC. Instances that survive GC will be moved to contiguous memory bloc= ks, but not their original ones. Whatever negligible benefit you might hav= e gotten, but more likely did not, from contiguous allocation of the 'Bla' = instances will be wiped out. > But anyway, let speak the figures, and lets stop > musing too much. I did some testing and it shows > that the 64bit can do the "speed up" whereby the > 32bit cannot do it: >=20 > OS JDK Arch Bulk Lazy Delta % > Win 1.6 64bit 8'771 9'805 11.8% > Win 1.6 32bit 14'587 14'744 1.1% > Win 1.5 32bit 17'139 17'405 1.6% > Mac 1.6 64bit 11'003 12'363 12.4% > Unix 1.6 32bit 26'517 26'858 1.3% >=20 > Still this leaves open the question whether the > 64-bit JIT is clever or the 64-bit JVM is clever > or the 64-bit CPU is clever. >=20 > Definitely it seems that the 32-bit is not clever, > there we all see a small overhead for the lazy, > which I interpret as the additional checks, eventually > done repeatedly, for the lazy. >=20 >> The answer is "nothing", because the semantics >> of the operation are such that the current >> mechanism is already quite close to optimal. >> This is what people have been explaining to you, >> that you dismissed as irrelevant. >=20 > Where do you see "nothing" in the above figures? Everywhere. You have some questionable method of timing something that you= do not describe, let alone with any precision, using doubtful methodology = and suspect reasoning without reference to experimental error or confidence= analysis. Your numbers literally tell me nothing. --=20 Lew