Path: csiph.com!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: Philipp Klaus Krause <pkk@spth.de>
Newsgroups: comp.compilers
Subject: Re: Modern compilers for ye olde architectures
Date: Mon, 18 Oct 2021 08:35:34 +0200
Organization: Compilers Central
Lines: 45
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-10-032@comp.compilers>
References: <21-10-007@comp.compilers> <21-10-012@comp.compilers> <21-10-015@comp.compilers> <21-10-024@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="9843"; mail-complaints-to="abuse@iecc.com"
Keywords: code, history
Posted-Date: 21 Oct 2021 20:06:54 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-10-024@comp.compilers>
Content-Language: en-US
Xref: csiph.com comp.compilers:2740

Am 15.10.21 um 09:37 schrieb Anton Ertl:
> Philipp Klaus Krause <pkk@spth.de> writes:
>
> I am wondering about one thing in the empirical results in your paper:
> Why is the code size not monotonously falling with increased numbers
> of assignments?  Are these independent runs with different
> (pseudo-random) assignments?

The basic idea is that for some subsets of nodes in the control-flow
graph, we try all possible assignments to registers. The
tree-decomposition tells us which subsets that are, and how we can
assemble the locally optimal solutions into globally optimal ones.

Since the cost function gives us the real costs from code generation, we
are optimal wrt. to the optimization goal, e.g. code size. However, thre
ar three aspects, why code size might not fall monotonously with an
increased number of tried assignments:

1) If the peephole optimizer is active: The register allocation approach
can only consider code size after code generation, it doesn't know what
the peephole optimizer might do with the result.

2) The register allocator considers variables / parts of variables as
assigned to certain registers, or spilt. It doesn't know where spilt
variables will end up on the stack. It can thus consider the benefits of
coalescing for variables assigned to registers, but not for spilt variables.

3) We know that the number of assignments we need to consider to get a
provably optimal result is polynomial if the input is a structured
program. But the polynomial bound might be too big for practical
compilation, so we limit the number of assignments considered (via the
--max-allocs-per-node parameter to SDCC). In that case the allocator can
not give a provably optimal result. In some cases, it needs to make a
decision, on which assignments to condier further (and does so based on
incomplete information using a heuristic). Here, considering more
assignments can lead to a certain assignment considered previously no
lonjger being considered. E.g. when going from --max-allocs-per-node 10
to --max-allocs-per-node 100, we will now consider 100 assignments
instead of 10. But not all of the previously considered 10 might make it
into the 100 considered now. And in the end it might turn out that one
of the 10 might have given us a better result globally. The decision
which assignments to consider further us not random, but based on a
heuristic that mostly takes local information into account.

Philipp