Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #2657 > unrolled thread

8086 register allocation

Started by"Alexei A. Frounze" <alexfrunews@gmail.com>
First post2021-05-09 14:28 -0700
Last post2021-05-11 06:44 +0000
Articles 7 — 5 participants

Back to article view | Back to comp.compilers


Contents

  8086 register allocation "Alexei A. Frounze" <alexfrunews@gmail.com> - 2021-05-09 14:28 -0700
    Re: 8086 register allocation Fernando <pronesto@gmail.com> - 2021-05-10 04:46 -0700
    Re: 8086 register allocation gah4 <gah4@u.washington.edu> - 2021-05-10 14:49 -0700
      Re: 8086 register allocation Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2021-05-11 03:35 +0200
        Re: 8086 register allocation gah4 <gah4@u.washington.edu> - 2021-05-10 21:19 -0700
        Re: 8086 register allocation Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2021-05-11 09:35 +0200
      Re: 8086 register allocation Thomas Koenig <tkoenig@netcologne.de> - 2021-05-11 06:44 +0000

#2657 — 8086 register allocation

From"Alexei A. Frounze" <alexfrunews@gmail.com>
Date2021-05-09 14:28 -0700
Subject8086 register allocation
Message-ID<21-05-005@comp.compilers>
For the fun of it I've implemented a greedy bottom-up local register
allocator for the intel 8086 CPU, which is known for its non-uniform
use of registers in ALU instructions and memory operands
(https://github.com/alexfru/regal86).

The motivation for it is that I haven't seen any such register
allocator in e.g. the "Dragon" book and may other texts on compilers
I've read or skimmed. And even today I can't quite come up with a
proper web search request that would return info on such allocators.
I've found one paper that very briefly talks about an allocator like
this for the IBM 370. This gives a sense that this stuff is very
yesterday in terms of both CPU architectures and modern compiler
algorithms since many seem to delegate this particular aspect of
register allocation to graph (pre)coloring and modern CPUs are more
uniform in terms of register use (or there are simply more registers
to use).

Are there any other texts that introduce and explain such register
allocators? Compiler Construction by William M. Waite and Gerhard Goos
(section 10.2.2 "Targeting") is a bit too short on the matter. But
people somehow have done things like this in the past.

Thanks,
Alex
[I don't recall anything in a textbook about it.  x86 register
allocators were pretty ad-hoc due to all of the special cases
where an operand had to be in a specific register. -John]

[toc] | [next] | [standalone]


#2658

FromFernando <pronesto@gmail.com>
Date2021-05-10 04:46 -0700
Message-ID<21-05-006@comp.compilers>
In reply to#2657
Hi Alex,

I read part of your code. Very nice! The core algorithm (disregarding the x86 specific part) seems like 'local register allocation' using Belady's heuristic for spilling. I have a class about it here:

https://youtu.be/XmNXeNCGSIA

There has been some academic work about models for register allocation for x86. Two papers follow below:

* Register Allocation by Puzzle Solving, https://llvm.org/pubs/2008-06-PLDI-PuzzleSolving.pdf

* A generalized algorithm for graph-coloring register allocation, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.10.3076&rep=rep1&type=pdf

Regards,

Fernando

> For the fun of it I've implemented a greedy bottom-up local register
> allocator for the intel 8086 CPU, which is known for its non-uniform
> use of registers in ALU instructions and memory operands
> (https://github.com/alexfru/regal86).

[toc] | [prev] | [next] | [standalone]


#2659

Fromgah4 <gah4@u.washington.edu>
Date2021-05-10 14:49 -0700
Message-ID<21-05-007@comp.compilers>
In reply to#2657
On Sunday, May 9, 2021 at 5:04:18 PM UTC-7, alexf...@gmail.com wrote:
> For the fun of it I've implemented a greedy bottom-up local register
> allocator for the intel 8086 CPU, which is known for its non-uniform
> use of registers in ALU instructions and memory operands
> (https://github.com/alexfru/regal86).

Reminds me of what, as far as I know, is also an unsolved problem:
register allocation for the x87.

First the 8087 was designed not knowing if it should be a stack or
register machine, so stack registers are addressable.  (Even though
the addresses keep changing.)

It was also designed to have a virtual stack, which would spill to memory
on overflow, and back on underflow.  That sounds nice, but it seems that
no-one tried to write the interrupt routine before the hardware was built,
and that it actually isn't possible.  It seems that it isn't possible to get some
of the state bits set, such that it all works like a seamless virtual stack.

The 80287 is the same processing logic with a different bus interface
(and separate clock).  As well as I know, things were redesigned later,
but it seems that there still is no virtual stack.

There are description of some compilers that give up on compiling any
code that needs more than eight registers.

[toc] | [prev] | [next] | [standalone]


#2660

FromHans-Peter Diettrich <DrDiettrich1@netscape.net>
Date2021-05-11 03:35 +0200
Message-ID<21-05-008@comp.compilers>
In reply to#2659
On 5/10/21 11:49 PM, gah4 wrote:

> It was also designed to have a virtual stack, which would spill to memory
> on overflow, and back on underflow.  That sounds nice, but it seems that
> no-one tried to write the interrupt routine before the hardware was built,
> and that it actually isn't possible.  It seems that it isn't possible to get some
> of the state bits set, such that it all works like a seamless virtual stack.

How efficient is a virtual stack? Did anybody try to use ordinary
(integral...) registers with interrupt driven spilling?

A stack machine is convenient for calculations. Before the stack
overflows the compiler can save intermediate results, as with any other
architecture of limited register count.

DoDi
[Normal stack machines have the top few entries in registers and do the
spilling to memory in hardware.  The x87 stack has 8 registers, which is
a lot for a stack machine, but the spilling was broken.   You can address
into the stack but you can't really use it as a register machine. -John]

[toc] | [prev] | [next] | [standalone]


#2661

Fromgah4 <gah4@u.washington.edu>
Date2021-05-10 21:19 -0700
Message-ID<21-05-009@comp.compilers>
In reply to#2660
(Snip on x87 register allocation)

> [Normal stack machines have the top few entries in registers and do the
> spilling to memory in hardware. The x87 stack has 8 registers, which is
> a lot for a stack machine, but the spilling was broken. You can address
> into the stack but you can't really use it as a register machine. -John]

It was some years ago, but I read the story about the gcc code generator,
which is designed for register machines.  So, they don't quite treat it
as a register machine, but not a stack machine, either.

At any point in the code, there should be a known (constant) number
of items on the stack, so you can address the registers using that number.
I think that is what Intel expected, when they wrote that you can use
it as a register machine.  It would be less fun in assembly, though.

[toc] | [prev] | [next] | [standalone]


#2663

FromHans-Peter Diettrich <DrDiettrich1@netscape.net>
Date2021-05-11 09:35 +0200
Message-ID<21-05-011@comp.compilers>
In reply to#2660
On 5/11/21 3:35 AM, Hans-Peter Diettrich wrote:

> A stack machine is convenient for calculations. Before the stack
> overflows the compiler can save intermediate results, as with any other
> architecture of limited register count.
>
> DoDi
> [Normal stack machines have the top few entries in registers and do the
> spilling to memory in hardware.  The x87 stack has 8 registers, which is
> a lot for a stack machine, but the spilling was broken.   You can address
> into the stack but you can't really use it as a register machine. -John]

FORTH coders know how inconvenient for humans is accessing "local
variables" in such a stack. But a compiler can track the content of the
stack.

I had some problems in understanding "spilling". After re-reading the
80287 instruction set I found that only ST(0) is usable for memory
load/store operations, making it hard to code load/store the other end
of the register stack. Thanks, John, for the kick... ;-)

DoDi

[toc] | [prev] | [next] | [standalone]


#2662

FromThomas Koenig <tkoenig@netcologne.de>
Date2021-05-11 06:44 +0000
Message-ID<21-05-010@comp.compilers>
In reply to#2659
gah4 <gah4@u.washington.edu> schrieb:

[80x87]

> There are description of some compilers that give up on compiling any
> code that needs more than eight registers.

Even worse.  If I remember correctly, Turbo Pascal 3 simply
generated wrong code if the number of stack slots was exceeded.
The user just had to make sure that the formulas were not too
complex.

[toc] | [prev] | [standalone]


Back to top | Article view | comp.compilers


csiph-web