From: glen herrmannsfeldt
Newsgroups: comp.compilers
Subject: Re: How to eliminate redundant constant move instructions
Date: Thu, 3 Nov 2011 03:32:52 +0000 (UTC)
Organization: Aioe.org NNTP Server
Message-ID: <11-11-019@comp.compilers>
References: <11-10-019@comp.compilers> <11-11-004@comp.compilers> <11-11-005@comp.compilers> <11-11-014@comp.compilers>
Keywords: optimize

George Neuner wrote:

(snip, then I wrote)
>>That is what register renaming is for. Usually using more than
>>the architecturally specified number of registers, the CPU
>>internally remaps the registers such that it can keep one value
>>in a register while an instruction is being executed out of order.

> Yes. But the compiler can't count on register renaming ... it can see
> only the architectural named registers.

The comment came after a follow-up on out-of-order execution.
Yes, you can't count on out-of-order, or register renaming, but
for out-of-order to work you usually also need register renaming.

What usually happens, though, is that a dynamic programming
algorithm is used to select an optimal instruction sequence
given the appropriate weights (costs) for each instruction sequence.
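
As a rough illustration of that idea (a sketch only, not taken from any particular compiler): the code below picks the cheapest instruction sequence for a small expression tree, working bottom-up so that each subtree's best cover is computed once. The Node type, the three instruction names (li, add, addi) and their costs are all made up for the example, and register assignment is left out entirely.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Node:
    op: str                          # "reg", "const", or "add"
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: object = None             # register name or constant value

# Hypothetical per-instruction costs; a real compiler would take
# these from a machine description.
DEFAULT_COSTS = {"li": 1, "add": 1, "addi": 1}

def select(node, costs=DEFAULT_COSTS):
    """Return (cost, instructions) for the cheapest way to get
    the value of `node` into a register."""
    if node.op == "reg":
        return 0, []                 # already in a register
    if node.op == "const":
        return costs["li"], [f"li {node.value}"]
    if node.op == "add":
        lcost, lcode = select(node.left, costs)
        rcost, rcode = select(node.right, costs)
        # Rule 1: add reg, reg (both operands computed into registers)
        candidates = [(lcost + rcost + costs["add"],
                       lcode + rcode + ["add"])]
        # Rule 2: add reg, imm (only applies when the right operand is a constant)
        if node.right.op == "const":
            candidates.append((lcost + costs["addi"],
                               lcode + [f"addi {node.right.value}"]))
        # min() returns the first minimal candidate, so a tie
        # is settled by rule order, not by any deeper insight.
        return min(candidates, key=lambda c: c[0])
    raise ValueError(f"unknown op: {node.op}")

tree = Node("add", Node("reg", value="a"), Node("const", value=5))
best = select(tree)                                        # addi is cheaper
tied = select(tree, {"li": 1, "add": 1, "addi": 2})        # both rules cost 2
```

With the default costs the add-immediate form wins. With addi priced at 2 the two rules tie, and the selector simply keeps whichever rule was listed first.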
On the other hand, if there is a tie then dynamic programming
will choose one, which may look less than optimal to a human.

There have been stories going back to the first Fortran compiler
of people being surprised to see optimized generated code better
than the compiler writers might have written by hand.

I remember a friend being assigned to write optimal assembly
language code for a Fortran DO loop, and then I ran the loop
through the Fortran H compiler, which generated one fewer
instruction. (It was a very small loop, where it might have been
four instead of five instructions.)

If you are selling programs, you usually don't compile for the
exact processor, but for some compromise. For personal use, you
might know the exact processor, but even so, with modern processors
and overlapped (or out-of-order) execution it is hard to choose
the optimal instruction sequence.

> If the code in question had
> copied Rx -> Ry then renaming would have been possible, but instead the
> code performed a constant load to each register. No possibility of
> rename sharing there.

I know the IBM 360/91 would recognize loads from the same address,
and use register renaming to avoid the second load, or a load
following a store. (That is, if the first load or store was still
in the appropriate buffer.) But then you shouldn't clear
registers by loading zero from memory.

-- glen