From: glen herrmannsfeldt <gah@ugcs.caltech.edu>
Newsgroups: comp.compilers
Subject: Re: How to eliminate redundant constant move instructions
Date: Thu, 3 Nov 2011 03:32:52 +0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 50
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <11-11-019@comp.compilers>
References: <11-10-019@comp.compilers> <11-11-004@comp.compilers> <11-11-005@comp.compilers> <11-11-014@comp.compilers>
Keywords: optimize

George Neuner <gneuner2@comcast.net> wrote:

(snip, then I wrote)
>>That is what register renaming is for.  Usually using more than
>>the architecturally specified number of registers, the CPU
>>internally remaps the registers such that it can keep one value
>>in a register while an instruction is being executed out of order.

> Yes.  But the compiler can't count on register renaming ... it can see
> only the architectural named registers.

The comment came after a follow-up on out-of-order execution.

Yes, you can't count on out-of-order execution or on register
renaming, but for out-of-order execution to work you usually also
need register renaming.
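
As a rough illustration of why the two go together, here is a toy
renamer in Python.  It is only a sketch under stated assumptions: the
function name rename, the (dest, src1, src2) instruction format, and
the unlimited pool of physical registers are all made up for the
example and do not describe any particular CPU.

```python
def rename(instrs, num_arch_regs):
    """Toy register renamer (hypothetical, for illustration only).

    instrs: list of (dest, src1, src2) architectural register numbers.
    Returns the same instructions rewritten to use physical registers,
    with a fresh physical register allocated for every destination.
    Assumes an unlimited supply of physical registers.
    """
    # Initially architectural register i lives in physical register i.
    rename_map = list(range(num_arch_regs))
    next_phys = num_arch_regs
    out = []
    for dest, src1, src2 in instrs:
        # Sources read whichever physical register currently holds the value.
        p1, p2 = rename_map[src1], rename_map[src2]
        # Each write gets a new physical register, so an older value of
        # the same architectural register stays readable.
        rename_map[dest] = next_phys
        out.append((next_phys, p1, p2))
        next_phys += 1
    return out


# r1 = r2 + r3 ; r4 = r1 + r1 ; r1 = r5 + r6   (r1 is written twice)
program = [(1, 2, 3), (4, 1, 1), (1, 5, 6)]
renamed = rename(program, num_arch_regs=8)
```

After renaming, the two writes to r1 go to different physical
registers, so the third instruction no longer has to wait for the
second one to finish reading the old r1.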

What usually happens, though, is that the compiler uses a dynamic
programming algorithm to select an optimal instruction sequence,
given appropriate weights (costs) for each candidate sequence.

On the other hand, if there is a tie then dynamic programming will
choose one, which may look less than optimal to a human.
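
A minimal sketch of the idea, in Python, might look like the
following.  Everything here is an assumption made for the example: the
COST table, the instruction names (li, add, addi, shl, mul), and the
tuple-based expression trees belong to a made-up target, not to any
real compiler or machine.  The selector works bottom-up over an
expression tree and keeps the cheapest cover at each node.

```python
from functools import lru_cache

# Hypothetical costs for a made-up target; a real compiler would take
# these from its machine description.
COST = {"li": 1, "add": 1, "addi": 1, "mul": 3, "shl": 1}


@lru_cache(maxsize=None)
def select(node):
    """Return (cost, instructions) for the cheapest cover of `node`.

    node is ("reg", name), ("const", value), or (op, left, right)
    with op in {"add", "mul"}.
    """
    kind = node[0]
    if kind == "reg":
        return 0, ()                  # value is already in a register
    if kind == "const":
        return COST["li"], ("li",)    # load immediate

    _, left, right = node
    lcost, lcode = select(left)
    rcost, rcode = select(right)

    # Candidate covers for this node, listed in a fixed order.
    candidates = []
    if kind == "add":
        candidates.append((lcost + rcost + COST["add"],
                           lcode + rcode + ("add",)))
        if right[0] == "const":
            # Fold the constant into an add-immediate.
            candidates.append((lcost + COST["addi"], lcode + ("addi",)))
    elif kind == "mul":
        if right == ("const", 2):
            # x*2 as a shift or as x+x: both cost the same here.
            candidates.append((lcost + COST["shl"], lcode + ("shl",)))
            candidates.append((lcost + COST["add"], lcode + ("add",)))
        candidates.append((lcost + rcost + COST["mul"],
                           lcode + rcode + ("mul",)))

    # min() returns the first of equally cheap candidates: the tie-break.
    return min(candidates, key=lambda c: c[0])


# (x * 2) + 5
tree = ("add", ("mul", ("reg", "x"), ("const", 2)), ("const", 5))
total_cost, code = select(tree)
```

For x*2 the shift and the add tie at cost 1, and the shift wins only
because it is listed first.  A human who happens to prefer the add
would see nothing in the cost model to justify the choice.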

There have been stories, going back to the first Fortran compiler, of
people being surprised to see optimized generated code better than
what the compiler writers themselves might have written by hand.

I remember a friend being assigned to write optimal assembly language
code for a Fortran DO loop, and then I ran the loop through the
Fortran H compiler which generated one fewer instruction.  (It was a
very small loop, where it might have been four instead of five
instructions.)

If you are selling programs, you usually don't compile for the exact
processor, but for some compromise.  For personal use, you might know
the exact processor, but even so, with modern processors and
overlapped (or out-of-order) execution it is hard to choose the
optimal instruction sequence.

> If the code in question had copied Rx->Ry then renaming would have
> been possible, but instead the code performed a constant load to
> each register.  No possibility of rename sharing there.

I know the IBM 360/91 would recognize loads from the same address, and
use register renaming to avoid the second load, or a load following a
store.  (That is, if the first load or store was still in the
appropriate buffer.)  But then you shouldn't clear registers by
loading zero from memory.
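
The general idea of matching a load against a pending load or store
can be sketched as below.  This is a toy model in Python, not a
description of the 360/91 hardware: the function run, the op format,
and a buffer that never evicts entries are all assumptions made for
the example.

```python
def run(ops, memory):
    """Toy model of satisfying loads from a pending load/store buffer.

    ops: list of ("load", reg, addr) or ("store", reg, addr).
    memory: dict mapping address -> value.
    Returns (register values, number of actual memory reads).
    Assumes buffer entries are never evicted.
    """
    regs = {}
    buffer = {}   # addr -> value still held in a load/store buffer
    reads = 0
    for op, reg, addr in ops:
        if op == "store":
            memory[addr] = regs[reg]
            buffer[addr] = regs[reg]   # later loads can take it from here
        else:
            if addr not in buffer:
                buffer[addr] = memory[addr]
                reads += 1             # only a buffer miss touches memory
            regs[reg] = buffer[addr]
    return regs, reads


# Clearing two registers by loading the same zero word twice.
regs, reads = run([("load", "r1", 100), ("load", "r2", 100)], {100: 0})
```

In this model the second load is satisfied from the buffer, so only
one memory read takes place even though two registers are loaded.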

-- glen