Path: csiph.com!weretis.net!feeder6.news.weretis.net!feeder.usenetexpress.com!feeder-in1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: Kaz Kylheku <217-679-0842@kylheku.com>
Newsgroups: comp.compilers
Subject: Re: Add nested-function support in a language the based on a stack-machine
Date: Thu, 1 Mar 2018 19:26:06 +0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 122
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <18-03-006@comp.compilers>
References: <18-02-009@comp.compilers> <18-02-012@comp.compilers> <18-02-016@comp.compilers> <18-02-018@comp.compilers> <18-02-023@comp.compilers> <18-02-029@comp.compilers> <18-02-032@comp.compilers> <18-02-034@comp.compilers> <18-03-002@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="69259"; mail-complaints-to="abuse@iecc.com"
Keywords: code, design, comment
Posted-Date: 01 Mar 2018 18:47:14 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Xref: csiph.com comp.compilers:1968

On 2018-03-01, George Neuner <gneuner2@comcast.net> wrote:
> On Sat, 17 Feb 2018 16:39:04 GMT, anton@mips.complang.tuwien.ac.at
> (Anton Ertl) wrote:
>
>>Kaz Kylheku <217-679-0842@kylheku.com> writes:
>>>On 2018-02-15, George Neuner <gneuner2@comcast.net> wrote:
>>>> No worries.  IME, displays don't get much respect from modern
>>>> textbooks - they are mentioned mostly in passing.
>>>
>>>This is probably because of
>>
>>This is because displays were found to be more costly for Algol-like
>>languages
>
> More costly than what?
>
>>(there was an influential paper that convinced everyone of
>>that).  IIRC the additional cost is in updating the display on calls
>>and returns.
>
> Which is no different from the overhead to maintain static links, and
> faster to use when you need non-locals more than one lexical level
> distant.
>
>>Unfortunately, I don't have a reference to the paper above.  You can
>>probably find a reference in old compiler books.
>
> I'd really like to see that paper to find out what they are comparing
> to display [and using what language].
>
> It certainly is true that closeure conversion is a better solution
> overall because it flattens all non-local accesses to use a single
> indirection ... but it also significantly complicates the compiler,
> and creates a need for handling wide pointers (or equivalent).  Many
> useful languages simply don't need that extra complexity.

How do flat representations handle the fact that different levels
of the lexical environment have different lifetimes?

E.g.

   (let ((a 1) (b 2))
     (loop ...
       (let ((c 3) (d 4))
         (lambda ()))))

The (c d) environment is freshly instantiated each time the loop
re-enters the inner let, but the (a b) environment is the same.

Each (lambda ()) invocation must capture a different environment,
with a different (c d) but shared (a b).

If there is just a flat vector, it must still contain an extra
level of indirection, no?

If the language is purely functional, then the bindings are immutable,
and so the (lambda ()) can copy the whole damn thing.

If bindings are mutable, then when one lambda modifies a or b,
the other lambdas (with their different environment) must see the
modification. Yet they are shielded from each other's modification
of c or d.

So the upshot is that there has to be an extra level of indirection.

It cannot be the case that "all non-local accesses use a single
indirection".

The flat environment has to contain entries which are not the storage
bindings themselves, but pointers to bindings.

For instance (let's keep it simple and concrete) it could be a vector of
conses:

  #((a . 1) (b . 2) (c . 3) (d . 4))

This is not direct access: we have to dereference a pointer to index
into the vector.  Then we retrieve a cons cell which is a pointer,
we have to dereference this to get to the cdr field to get to the value.

So then whenever the loop enters the (c d) environment, the the (c ...)
and (e ...) conses are destructively replaced with fresh conses.

What the lambda operator then has to do to fetch the environment
for the closure is to take a shallow-copy snapshot of the vector:
allocate a new vector of the same length which takes the same conses.

So if instead of the above, we have a display, we can do this:

   #(#(1 2) #(3 4)) ;; env vec: effectively, a two-level display

Instead of dereferencing a vector and then a cons cell, we
dereference a vector and then an inner vector.

In this display case (pun intended) when the inner let is entered,
the #(3 4) is replaced with #(nil nil), destructively,
analogous to the conses being replaced in the "flat" case.
Note that this is likely more efficient than allocating two conses.

The lambda operator shallow-copies the whole vector, just like
in the flat environment case.

> I would never use static chains ... if display wasn't viable I would
> jump straight to closure conversion.

What is the meaning of "display", anyway?

It seems like it just means "array" instead of "linked list" (just
specifically in the context of activation frame environments).

The C main function's argv[] is a "display" of strings, whereas
the Lisp list '("bash" "-c" "ls") is a "chain".

Except that the context isn't environment access and so we don't use
the word "display".
[A display is an array of pointers to the frames of the enclosing
scopes.  You can access any variable in an enclosing frame by
indirecting through the frame's entry in the current display and then
indexing.  This is for languges like Algol and Pascal, not Lisp.  I
don't ever recall anyone talking about displays and threads, although
I think they should work OK so long as you keep your displays on the
local stack. -John]