Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Philipp Klaus Krause Newsgroups: comp.compilers Subject: Re: Modern compilers for ye olde architectures Date: Mon, 18 Oct 2021 08:35:34 +0200 Organization: Compilers Central Lines: 45 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <21-10-032@comp.compilers> References: <21-10-007@comp.compilers> <21-10-012@comp.compilers> <21-10-015@comp.compilers> <21-10-024@comp.compilers> Mime-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8bit Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="9843"; mail-complaints-to="abuse@iecc.com" Keywords: code, history Posted-Date: 21 Oct 2021 20:06:54 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com In-Reply-To: <21-10-024@comp.compilers> Content-Language: en-US Xref: csiph.com comp.compilers:2740 Am 15.10.21 um 09:37 schrieb Anton Ertl: > Philipp Klaus Krause writes: > > I am wondering about one thing in the empirical results in your paper: > Why is the code size not monotonously falling with increased numbers > of assignments? Are these independent runs with different > (pseudo-random) assignments? The basic idea is that for some subsets of nodes in the control-flow graph, we try all possible assignments to registers. The tree-decomposition tells us which subsets that are, and how we can assemble the locally optimal solutions into globally optimal ones. Since the cost function gives us the real costs from code generation, we are optimal wrt. to the optimization goal, e.g. code size. However, thre ar three aspects, why code size might not fall monotonously with an increased number of tried assignments: 1) If the peephole optimizer is active: The register allocation approach can only consider code size after code generation, it doesn't know what the peephole optimizer might do with the result. 2) The register allocator considers variables / parts of variables as assigned to certain registers, or spilt. It doesn't know where spilt variables will end up on the stack. It can thus consider the benefits of coalescing for variables assigned to registers, but not for spilt variables. 3) We know that the number of assignments we need to consider to get a provably optimal result is polynomial if the input is a structured program. But the polynomial bound might be too big for practical compilation, so we limit the number of assignments considered (via the --max-allocs-per-node parameter to SDCC). In that case the allocator can not give a provably optimal result. In some cases, it needs to make a decision, on which assignments to condier further (and does so based on incomplete information using a heuristic). Here, considering more assignments can lead to a certain assignment considered previously no lonjger being considered. E.g. when going from --max-allocs-per-node 10 to --max-allocs-per-node 100, we will now consider 100 assignments instead of 10. But not all of the previously considered 10 might make it into the 100 considered now. And in the end it might turn out that one of the 10 might have given us a better result globally. The decision which assignments to consider further us not random, but based on a heuristic that mostly takes local information into account. Philipp