Path: csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!border3.nntp.dca.giganews.com!Xl.tags.giganews.com!border1.nntp.dca.giganews.com!nntp.giganews.com!local2.nntp.dca.giganews.com!news.giganews.com.POSTED!not-for-mail NNTP-Posting-Date: Mon, 06 Feb 2012 20:28:20 -0600 Message-ID: <4F308C3F.3010101@SPAM.comp-arch.net> Date: Mon, 06 Feb 2012 18:28:15 -0800 From: "Andy (Super) Glew" Reply-To: andy@SPAM.comp-arch.net Organization: comp-arch.net User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:9.0) Gecko/20111222 Thunderbird/9.0.1 MIME-Version: 1.0 Newsgroups: comp.arch Subject: Re: Single Thread Performance References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Lines: 126 X-Usenet-Provider: http://www.giganews.com X-Trace: sv3-eROruWiPDbDCs6VEnD4gZnrbCOTjK/RfI+GnvMZpIOsPNthAfnElMV6hyk90CvZlZJPoc1PTPwX5mIK!yKmLRs/2yk3aj0L022wZNkYtotGWRZvuhk+Ds1J5CBgbKKihrBazWHHeFW0uRz4= X-Complaints-To: abuse@giganews.com X-DMCA-Notifications: http://www.giganews.com/info/dmca.html X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly X-Postfilter: 1.3.40 X-Original-Bytes: 6509 Xref: x330-a1.tempe.blueboxinc.net comp.arch:5753 On 2/6/2012 12:42 PM, nmm1@cam.ac.uk wrote: > In article, > Robert Myers wrote: >> On 2/6/2012 8:06 AM, nmm1@cam.ac.uk wrote: >>> In article, >>> Brett Davis wrote: >>>> >>>> Intel has built a 40 billion dollar a year company with 10 billion >>>> in profits a year out of pursuing Single Thread Performance. >>> >>> Well, perhaps. Sort of. >>> >>> Something that you miss is that even Intel has started to realise >>> that it has hit an immovable obstacle, and to pursue multi-thread >>> performance. But the message has to travel from the ground to >>> the brain and then back to where the action is, and Intel is as >>> much a Diplodocus as IBM was in the 1970s. Things don't happen >>> fast. >> >> I'm fairly certain (in fact, absolutely certain) that there are people >> who believe that the obstacle is not immovable. Just how far it can be >> moved and at what cost is another matter. > > There are people who believe that the earth is flat, too. The > fact that the obstacle has an exponential cliff and not a strictly > asymptotic one doesn't mean that it's movable. > > > Regards, > Nick Maclaren. A while back I corresponded witgh Uzi Vishkin, the PRAM guy, about the following: I created a theoretical computation model I call the ILP-PRAM. Basically, a PRAM, with extensions to model what a really aggressive ILP processor could do. I don't know if I have those notes any more - I may have thrown them away, as I did so much, when I changed jobs. (Wanting to avoid any suspicion of NDA violations, I left behind much that was really personal). But basically I assumed that an ILP RAM had features such as a) if a compiler could parallelize, it could parallelize b) even if a compiler could not parallelize, it could parallelize using techniques familiar to those in this group. And then I proved that an ILP-PRAM and a PRAM were equivalent in power. At least to within polynomial. In the theory classes I had just taken, proofs of equivalence typically involved emulation. I suspect that you all can see how an implicitly ILP-PRAM can emulate an explicitly parallel PRAM, with equivalent efficiency, without even having to resort to an OOO tricks: Basically, have the ILP-PRAM execute a program that simulates the PRAM. Something along the lines of for each cycle for each processing element of the PRAM, p simulate_1_cycle(p) Since by definition the ILP-PRAM can execute independent code, such as the loop body of the simulation, in parallel, then the body of the simulation loop executes "efficiently". If you hypothesize that the ILP PRAM is less efficient than the PRAM, the inefficiency can only lie in the control. i.e. the for loops. And classic PRAMs have essentially the same "loops" - the cycle explicly. The for each processor kloop, not really - assuming that the work has already been distributed. But, the ILP-PRAM definition allows the same distribution. I.e. an ILP-PRAM can simulate a PRAM "efficiently". They are models of equivalent computational power. for this definition. --- Now, I admit that this somewhat begs the question: a logically single threaded ILP-RAM is as "efficient" as an explicitly parallel PRAM, if you have an explicitly parallel program. But I think it is interesting. --- Perhaps somebody here can help me out: years ago, around the dawn of the web, I found a paper that talked about incorporating communications cost into theoretical models of parallel computation. As I recall, they proved that with any realistic model of parallel computation, a similar equivalence in efficiency. However, for the life of me I can't find the reference. === Anyway, theoretical equivalence aside: both parallel and serial computation run into the same walls. However, computer architecture is a place where we worry about the constant multipliers for O(n) equivalent stuff. I do not mean to say that explicit parallelism is not a good thing. I just think that it is not as much of a savior as many would think. Parallelism can solve bigger problems, but it won't solve fundamentally hard problems. --- This being said, I keep thinking about "How does the world compute itself?" if not in parallel, with communication costs. I think that, unless quantum computation is fundamental (which may well be - but I think that you can define a quantum-ILP-RAM as well), and barring faster than light communication, that the fact that the world seems to compute itself must mean that the problems that it asking of itself are fundamentally solvable on whatever is the appropriate parallel model of computation and costs. Which is the thing that causes me to wonder against Rob Myer's diatribes about physical modelling. Unless... the universe has hidden dimensions that are used to increase the effective bandwidth of communication between the entities that constitute the world.