From: "Andy (Super) Glew"
Organization: comp-arch.net
Date: Wed, 11 Jan 2012 09:25:29 -0800
Newsgroups: comp.lang.forth,comp.sys.intel,comp.arch
Subject: Re: Can someone explain step by step how one avoid many conditional in forth as described in Moore Fourth essay?
Message-ID: <4F0DC609.9020503@SPAM.comp-arch.net>
References: <19111298.516.1326191150632.JavaMail.geo-discussion-forums@yqbu38>

On 1/11/2012 4:29 AM, Alex McDonald wrote:
> On Jan 11, 12:21 pm, "Rod Pemberton" wrote:
>> "Arnold Doray" wrote in message
>> news:jehfpu$9de$1@dont-email.me...
>> ...
>>
>>>> [Forth code]
>>
>>> CPU pipelining is improved by reducing conditionals. Modern CPUs have
>>> branch prediction, but these aren't always successful, in which case the
>>> pipeline needs to be flushed, lowering the CPU's throughput.
>>
>> Is that still true for multiple cores?
>>
>> I.e., I would think the following is entirely possible and plausible, but I
>> haven't studied a CPU design in decades. The CPU's designers could execute
>> the process in parallel with one core taking one direction for the branch
>> and another core taking the other branch direction. Once the correct
>> branch is determined, the bad execution path is discarded. If the primary
>> core had the good execution path, it just continues execution. If the
>> alternate core had the good execution path, its internal state could be
>> "pushed" to the primary core. If they used static ram for the internal
>> state, then it could be "pushed" asynchronously, i.e., between clocks or
>> sub-clocks. It would require reserving a core for the branch path
>> execution, at least temporarily.
>>
>> Rod Pemberton
>
> Search on "speculative execution". It can also be done at the compiler
> level; that doesn't require processor support. Intel's P6 was their
> first chip to support it iirc.

Taking both paths on a branch is currently called "eager execution".

Branch prediction is one form of "speculative execution".

P6 did speculative, but not eager. As, for that matter, did P5 (Pentium).

I don't know of anyone doing full eager execution, although it has been
studied out the bejeezus. Branch prediction usually beats it.

I believe an IBM chip did eager ifetch - fetching both sides of a
branch - but did not actually execute; it stalled at the decoder or
thereabouts.
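
One rough way to see why prediction tends to beat eager execution is a toy model, not a measurement. Assume `depth` unresolved branches are in flight. Eager execution forks at every branch, so only one of the 2^depth paths at that depth is the right one. A predictor follows a single path, which is entirely right with probability accuracy^depth. The function names and the accuracy figure of 0.95 are illustrative assumptions, and the model ignores flush penalties, resource limits and anything else a real pipeline does.

```python
def useful_fraction_predicted(accuracy: float, depth: int) -> float:
    """Expected share of depth-deep speculative work that is on the correct
    path when a single predicted path is followed (toy model: independent
    branches, each predicted correctly with probability `accuracy`)."""
    return accuracy ** depth


def useful_fraction_eager(depth: int) -> float:
    """Share of depth-deep work that is on the correct path when both sides
    of every branch are followed: one path in 2**depth is the right one."""
    return 0.5 ** depth


if __name__ == "__main__":
    accuracy = 0.95  # assumed predictor accuracy, for illustration only
    for depth in (1, 2, 4, 8):
        predicted = useful_fraction_predicted(accuracy, depth)
        eager = useful_fraction_eager(depth)
        print(f"depth={depth}  predicted={predicted:.4f}  eager={eager:.4f}")
```

In this model the predicted path comes out ahead at every depth whenever the per-branch accuracy is above 0.5, and the gap widens as more branches are in flight.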