From: amker
Newsgroups: comp.compilers
Subject: Re: How to eliminate redundant constant move instructions
Date: Tue, 1 Nov 2011 20:58:48 -0700 (PDT)
Organization: Compilers Central
Message-ID: <11-11-010@comp.compilers>
References: <11-10-019@comp.compilers> <11-11-004@comp.compilers>
Keywords: optimize, GCC

On Nov 2, 2:32 am, George Neuner wrote:
> It's very hard to tell anything without more context - we need to know
> what CPU, what compiler, and we need to see the surrounding code.

Sorry for the misleading message. The test case comes from a reported
GCC bug:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44025

> -fcprop-register, which is a peephole pass that eliminates redundant
> register moves (introduced by other optimizations), but that is

Yes, I found that pass, and it seems it could solve the problem if I:
1. extend the pass with value numbering, at least for constant values;
2. extend the pass with global data-flow analysis.

> performed after register allocation.

Running after register allocation also brings advantages, such as not
having to worry about register pressure.

> You have to remember that many CPUs can execute multiple
> instructions in parallel, and those parallel instruction streams may
> be executed out of order with respect to a program listing.

This is what I had missed. But in that case I can never know which
code is better, since the performance depends on scheduling and
out-of-order execution... right?

Thanks for your explanation.
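
To make point 1 above concrete, here is a minimal sketch of tracking constant values per register inside a single basic block, so that a reload of a constant the register already holds can be dropped. This is not GCC's cprop-registers pass or its RTL representation: the tuple-based instruction format, the opcode names `movi`/`mov`, and the function name are all hypothetical, chosen only for illustration. It also assumes every other instruction writes at most one register and ignores calls and other clobbers.

```python
def eliminate_redundant_const_moves(block):
    """Drop constant loads and copies that do not change a register's value.

    `block` is one basic block in a toy, hypothetical IR:
      ("movi", dst, imm)   load an immediate constant into dst
      ("mov",  dst, src)   copy register src into dst
      (op, dst, *srcs)     any other instruction; writes dst (dst may be None)
    """
    known = {}  # register name -> constant it is known to hold right now
    out = []
    for insn in block:
        op, dst = insn[0], insn[1]
        if op == "movi":
            imm = insn[2]
            if dst in known and known[dst] == imm:
                continue  # dst already holds this constant: redundant
            known[dst] = imm
        elif op == "mov":
            src = insn[2]
            if src in known:
                if dst in known and known[dst] == known[src]:
                    continue  # both registers hold the same constant
                known[dst] = known[src]
            else:
                known.pop(dst, None)  # dst now holds an unknown value
        else:
            known.pop(dst, None)  # any other write invalidates dst
        out.append(insn)
    return out


block = [
    ("movi", "r0", 0),
    ("add", "r1", "r1", "r0"),
    ("movi", "r0", 0),          # redundant: r0 still holds 0
    ("add", "r2", "r2", "r0"),
    ("movi", "r0", 1),
    ("add", "r0", "r0", "r2"),  # overwrites r0
    ("movi", "r0", 1),          # not redundant: r0 was overwritten
]
optimized = eliminate_redundant_const_moves(block)
```

The `known` table is reset for each block, so this only catches redundancy local to one basic block. Point 2 (the global version) would need the table to be merged at control-flow joins, which this sketch does not attempt.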