Groups > comp.compilers > #2707 > unrolled thread

Memory organisation of the Z80 Turbo Pascal compiler

Started by	Lasse Hillerøe Petersen <lhp+news@toft-hp.dk>
First post	2021-09-29 11:40 +0000
Last post	2021-10-02 20:45 +0000
Articles	6 — 4 participants

Back to article view | Back to comp.compilers

  Memory organisation of the Z80 Turbo Pascal compiler Lasse Hillerøe Petersen <lhp+news@toft-hp.dk> - 2021-09-29 11:40 +0000
    Re: Memory organisation of the Z80 Turbo Pascal compiler Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2021-09-30 11:15 +0200
      Re: Memory organisation of the Z80 Turbo Pascal compiler Lasse Hillerøe Petersen <lhp+news@toft-hp.dk> - 2021-10-02 10:38 +0000
        Re: Memory organisation of the Z80 Turbo Pascal compiler gah4 <gah4@u.washington.edu> - 2021-10-02 13:08 -0700
        Re: Memory organisation of the Z80 Turbo Pascal compiler Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2021-10-03 02:06 +0200
    Re: Memory organisation of the Z80 Turbo Pascal compiler anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2021-10-02 20:45 +0000

#2707 — Memory organisation of the Z80 Turbo Pascal compiler

From	Lasse Hillerøe Petersen <lhp+news@toft-hp.dk>
Date	2021-09-29 11:40 +0000
Subject	Memory organisation of the Z80 Turbo Pascal compiler
Message-ID	<21-09-014@comp.compilers>

I have been thinking about the old TurboPascal compiler (actually about
its predecessor COMPAS Pascal, which I used long ago, but I believe the
original Z80 CP/M TurboPascal was exactly the same in this regard.)

In the TP 2.0 manual it says:
  "The recursion stack is used only by recursive procedures and
  functions, i.e. procedures and functions compiled with the
  A compiler directive passive (~A-}). On entry to a recursive
  subprogram it copies its workspace onto the recursion stack,
  and on exit the entire workspace is restored to its original
  state. The default initial value of RecurPtr at the beginning
  of a program, is 1 K ($400) bytes below the CPU stack pointer."

In other words, all procedures in a program have their own statically
located "workspace", and when recursive calls are made, the previously
activation's local variables are saved away. I suppose this was practical
on the Z80, because indexed operations were rather expensive and not
relative to SP, using the separate IX and IY registers, whereas an LDIR
instruction was reasonably efficient.

As far as I can tell, this works great even for a language with nested
lexical scope like Pascal. When an inner procedure refers to a containing
procedure's local variable, it will get the currently active instance.

However, I think there would be an issue when passing procedure local
variables as VAR parameters in combination with some forms of mutual and/
or nested recursion, as the address passed would refer to the right
workspace, but the workspace might contain the wrong activation instance.
I'm not completely sure about this, and I don't have a setup at the
moment where I could experiment and test it.

In any case, I have never found any description of this style of scope
management, since I actually used this compiler back in the 80es, and I
don't know if it even has a name?

Other than the copying (which may be considered inefficient, but only
applies to recursive procedures) and the possible issue with passing
references ending up pointing to the wrong variable, what would be the
pros and cons of this method?

/Lasse

[toc] | [next] | [standalone]

#2709

From	Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Date	2021-09-30 11:15 +0200
Message-ID	<21-10-001@comp.compilers>
In reply to	#2707

On 9/29/21 1:40 PM, Lasse Hillerøe Petersen wrote:
> I have been thinking about the old TurboPascal compiler (actually about
> its predecessor COMPAS Pascal, which I used long ago, but I believe the
> original Z80 CP/M TurboPascal was exactly the same in this regard.)
>
> In the TP 2.0 manual it says:
>    "The recursion stack is used only by recursive procedures and
>    functions, i.e. procedures and functions compiled with the
>    A compiler directive passive (~A-}). On entry to a recursive
>    subprogram it copies its workspace onto the recursion stack,
>    and on exit the entire workspace is restored to its original
>    state. The default initial value of RecurPtr at the beginning
>    of a program, is 1 K ($400) bytes below the CPU stack pointer."

The famous GfA-Basic for Atari ST and Amiga used a similar strange
handling of local variables as global variables. On entry of a
subroutine all "local" variables were saved on the stack and restored on
exit. In between all references within the entire code used the local
variables of that subroutine instance in the global workspace. This way
every variable of a unique name and type had a fixed address at runtime.

[...]
> Other than the copying (which may be considered inefficient, but only
> applies to recursive procedures) and the possible issue with passing
> references ending up pointing to the wrong variable, what would be the
> pros and cons of this method?

I don't remember whether GfA Basic included pointers or distinct ref/val
parameters. At least it was a damn fast framework on those 68k
processors, in detail compared to 8086 processors of that time.

A con of the GfA method was found in the later PC version that used the
traditional and compatible way of storing local variables in the
stackframe. This broke some legacy code, sooner or later, and I ended up
with a set of subroutines that could not be converted into any other
language. Unfortunately this was the tricky analysis of complex
conditional expressions in IF (WHILE...) statements of my C decompiler :-(

DoDi

[toc] | [prev] | [next] | [standalone]

#2710

From	Lasse Hillerøe Petersen <lhp+news@toft-hp.dk>
Date	2021-10-02 10:38 +0000
Message-ID	<21-10-002@comp.compilers>
In reply to	#2709

On Thu, 30 Sep 2021 11:15:32 +0200, Hans-Peter Diettrich wrote:

> I don't remember whether GfA Basic included pointers or distinct ref/val
> parameters. At least it was a damn fast framework on those 68k
> processors, in detail compared to 8086 processors of that time.

Interesting. I'd have thought that on the 68K architecture you would
always prefer using stack based frames, given its many addressing modes
and address registers.

> A con of the GfA method was found in the later PC version that used the
> traditional and compatible way of storing local variables in the
> stackframe. This broke some legacy code, sooner or later, and I ended up
> with a set of subroutines that could not be converted into any other
> language. Unfortunately this was the tricky analysis of complex
> conditional expressions in IF (WHILE...) statements of my C decompiler

So you depended on the "feature"? I don't quite understand how you
managed to do that.

I just noticed that in the TP manual, the second paragraph down on the
page I quoted from actually says:
 "Because of this technique, variables local to a subprogram must not
  be used as var parameters in recursive calls."

I quoted the TP manual, as at the time I couldn't find the original
Danish COMPAS manual, and it was the "official" English translation.
However, the corresponding page in the COMPAS 2.0 manual(COMPAS 3.0
probably being equivalent to TP 1.0) does not have this caveat, so I
guess Anders Hejlsberg had not noticed this problem with the save-restore
method until then. Ah, the follies of youth. :-)

Given these problems, I suppose the method was not deemed very "useful"
by compiler designers. I still wonder if there are more examples of its
use, and if there is a name for it, or maybe literature describing it.

An "exercise" set that I guess I will try to figure out now, and which
others might find entertaining:
1. Is there a simple way to solve the problem, like figuring out where on
the routine stack the variable passed by reference will be during the
recusive call, and just pass that address instead?
2. How does nested subroutines fit into the method? Should locals of a
nested subroutine be considered part of the locals of the containing
subroutine? What is the situation with the case of indirect recursion,
where the inner calls its containing subroutine?
3. Are there other fun ways this method can be used, exploited or abused?

(Now where did I put my Z80 CP/M emulator? :-) )
[This isn't the same thing, but in IBM 360 Fortran, the suroutine prolog copied scalar
parameters into locals and epilog copied them out.  It didn't have indirect
addressing so that let it use the same base register as for the other locals.  Fortran
didn't allow recursive calls but if you used aliased arguments you could get confusing
results. -John]

[toc] | [prev] | [next] | [standalone]

#2711

From	gah4 <gah4@u.washington.edu>
Date	2021-10-02 13:08 -0700
Message-ID	<21-10-003@comp.compilers>
In reply to	#2710

On Saturday, October 2, 2021 at 10:35:45 AM UTC-7, Lasse Hillerøe Petersen
wrote:

(snip)

> (Now where did I put my Z80 CP/M emulator? :-) )
> [This isn't the same thing, but in IBM 360 Fortran, the suroutine prolog copied scalar
> parameters into locals and epilog copied them out. It didn't have indirect
> addressing so that let it use the same base register as for the other locals. Fortran
> didn't allow recursive calls but if you used aliased arguments you could get
> confusing results. -John]

This comes up in comp.lang.fortran every time someone mentions that Fortran
uses pass-by-reference. Fortran still allows for this calling method.
(If you put slashes around the dummy argument, OS/360 uses actual pass by
reference.)

Also, in some cases Fortran requires a contiguous array, such that a copy
is made before a subroutine call, and copy back after return, in the case of
a (potentially) discontiguous array.

Since PL/I has internal procedures, a procedure pointer (ENTRY variable in
PL/I terms) includes a reference to the appropriate set of variables
available to callers of such.  (And it even allows nested internal procedures.)

Getting that right slowed down the addition of pointers to internal
subroutines to the Fortran standard by many years.

[toc] | [prev] | [next] | [standalone]

#2714

From	Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Date	2021-10-03 02:06 +0200
Message-ID	<21-10-006@comp.compilers>
In reply to	#2710

On 10/2/21 12:38 PM, Lasse Hillerøe Petersen wrote:
> On Thu, 30 Sep 2021 11:15:32 +0200, Hans-Peter Diettrich wrote:
>
>> I don't remember whether GfA Basic included pointers or distinct ref/val
>> parameters. At least it was a damn fast framework on those 68k
>> processors, in detail compared to 8086 processors of that time.
>
> Interesting. I'd have thought that on the 68K architecture you would
> always prefer using stack based frames, given its many addressing modes
> and address registers.

Much 68k software, particularly compilers, was poorly ported 8 bit (CP/M?)
code. E.g. one compiler only used the 68k address registers because "an
int is a pointer, a pointer is an int". For expression evaluation the
operands of each binary operation were copied from memory to the address
registers (A0, A1), from there to the data registers (D0, D1) and
afterwards back again, typically in subroutines at least for logical
operators. This way a simple single statement C function generated 3
pages of assembly listing, and that mess made me start the C decompiler
development. At that time C was very new to me, and none of the
compilers had ever heard about ANSI C. So I did all work in GfA Basic
and was happy with it. After a few years I had a decompiler for
executables and libraries of half a dozen C compilers, including Amiga
and HP-UX.

>> A con of the GfA method was found in the later PC version that used the
>> traditional and compatible way of storing local variables in the
>> stackframe. This broke some legacy code, sooner or later, and I ended up
>> with a set of subroutines that could not be converted into any other
>> language. Unfortunately this was the tricky analysis of complex
>> conditional expressions in IF (WHILE...) statements of my C decompiler
>
> So you depended on the "feature"? I don't quite understand how you
> managed to do that.

It was by accident, because at that time I couldn't know how it could be
done otherwise. All my practical experience with homecomputers was Basic
and machine language, having seen compilers (Algol, Cobol...) only as a
user on the TR-440. At least the Gfa method was such a clean and
efficient approach that Occams Razor...

DoDi

[toc] | [prev] | [next] | [standalone]

#2712

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2021-10-02 20:45 +0000
Message-ID	<21-10-004@comp.compilers>
In reply to	#2707

Lasse =?iso-8859-1?q?Hiller=F8e?= Petersen <lhp+news@toft-hp.dk> writes:
>I have been thinking about the old TurboPascal compiler (actually about
>its predecessor COMPAS Pascal, which I used long ago, but I believe the
>original Z80 CP/M TurboPascal was exactly the same in this regard.)
>
>In the TP 2.0 manual it says:
>  "The recursion stack is used only by recursive procedures and
>  functions, i.e. procedures and functions compiled with the
>  A compiler directive passive (~A-}). On entry to a recursive
>  subprogram it copies its workspace onto the recursion stack,
>  and on exit the entire workspace is restored to its original
>  state. The default initial value of RecurPtr at the beginning
>  of a program, is 1 K ($400) bytes below the CPU stack pointer."
>
>In other words, all procedures in a program have their own statically
>located "workspace", and when recursive calls are made, the previously
>activation's local variables are saved away. I suppose this was practical
>on the Z80, because indexed operations were rather expensive and not
>relative to SP, using the separate IX and IY registers, whereas an LDIR
>instruction was reasonably efficient.
>
>As far as I can tell, this works great even for a language with nested
>lexical scope like Pascal.

No, it does not give you lexical scoping.  Try the man-or-boy test
<http://rosettacode.org/wiki/Man_or_boy_test#Pascal>, or the following
Lisp example:

testr[x,p,f,u] <- if p[x] then f[x]
                  else if atom[x] then u[]
                  else testr[cdr[x],p,f,lambda:testr[car[x],p,f,u]].

This is M-expression syntax for Lisp 2.0, which did not catch on;
McCarthy gave this example (written by James R. Slagle) as the case
that showed that Lisp's implementation of the time did not implement
lexical scoping, which McCarthy had intended, and which this program
relies on.

>When an inner procedure refers to a containing
>procedure's local variable, it will get the currently active instance.

Which can be the wrong one.

>However, I think there would be an issue when passing procedure local
>variables as VAR parameters in combination with some forms of mutual and/
>or nested recursion, as the address passed would refer to the right
>workspace, but the workspace might contain the wrong activation instance.

That is another problem, but note that the Lisp routine above uses
call-by-value.

>In any case, I have never found any description of this style of scope
>management, since I actually used this compiler back in the 80es, and I
>don't know if it even has a name?

My guess is that in the early days many compilers used similar
techniques, because many compilers failed the man-or-boy test.  But
given that it is incorrect as implementation technique for lexical
scoping, I doubt that these techniques have been given names.

>Other than the copying (which may be considered inefficient, but only
>applies to recursive procedures) and the possible issue with passing
>references ending up pointing to the wrong variable, what would be the
>pros and cons of this method?

Apart from the lexical scoping problem, it is also not particularly
efficient in data memory: It consumes space for all local variables of
all functions at all times (plus the recursion stack), whereas
implementations using activation frames only keep as many as are alive
at the same time.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/

[toc] | [prev] | [standalone]

csiph-web

Memory organisation of the Z80 Turbo Pascal compiler

Contents

#2707 — Memory organisation of the Z80 Turbo Pascal compiler

#2709

#2710

#2711

#2714

#2712