Groups > comp.lang.forth > #7657 > unrolled thread

Help with RAFTS-like optimiser

Started by	Alex McDonald <blog@rivadpm.com>
First post	2011-11-30 14:52 -0800
Last post	2011-12-19 09:29 -0800
Articles	20 on this page of 22 — 4 participants

Back to article view | Back to comp.lang.forth

  Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-11-30 14:52 -0800
    Re: Help with RAFTS-like optimiser anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-02 16:36 +0000
      Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-02 12:43 -0800
        Re: Help with RAFTS-like optimiser Pablo Hugo Reda <pabloreda@gmail.com> - 2011-12-02 14:42 -0800
          Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-02 14:47 -0800
            Re: Help with RAFTS-like optimiser Pablo Hugo Reda <pabloreda@gmail.com> - 2011-12-02 15:02 -0800
        Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-04 18:00 -0800
        Re: Help with RAFTS-like optimiser anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-06 15:44 +0000
          Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-06 12:27 -0800
            Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-06 14:27 -0800
            Re: Help with RAFTS-like optimiser anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-07 13:15 +0000
        Re: Help with RAFTS-like optimiser (long-ish) Alex McDonald <blog@rivadpm.com> - 2012-01-09 05:50 -0800
          Re: Help with RAFTS-like optimiser (long-ish) Alex McDonald <blog@rivadpm.com> - 2012-01-09 06:09 -0800
          Re: Help with RAFTS-like optimiser (long-ish) anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-01-09 15:07 +0000
            Re: Help with RAFTS-like optimiser (long-ish) Alex McDonald <blog@rivadpm.com> - 2012-01-09 13:17 -0800
              Re: Help with RAFTS-like optimiser (long-ish) Alex McDonald <blog@rivadpm.com> - 2012-01-10 08:53 -0800
            Re: Help with RAFTS-like optimiser (long-ish) Alex McDonald <blog@rivadpm.com> - 2012-01-16 07:14 -0800
      Re: Help with RAFTS-like optimiser George Hubert <georgeahubert@yahoo.co.uk> - 2011-12-04 02:37 -0800
        Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-04 11:13 -0800
        Re: Help with RAFTS-like optimiser anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-12-06 17:43 +0000
    Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-19 07:28 -0800
      Re: Help with RAFTS-like optimiser Alex McDonald <blog@rivadpm.com> - 2011-12-19 09:29 -0800

Page 1 of 2 [1] 2 Next page →

#7657 — Help with RAFTS-like optimiser

From	Alex McDonald <blog@rivadpm.com>
Date	2011-11-30 14:52 -0800
Subject	Help with RAFTS-like optimiser
Message-ID	<73f8f07d-0a1e-4e6a-8456-fb93b2f093fe@p9g2000vbb.googlegroups.com>

I've resurrected my RAFTS-like optimiser, and need a little advice.
http://www.complang.tuwien.ac.at/projects/rafts.html

Basics;

The stack grows to the left. Each cell is initialised to its own
address, and overwritten as items are manipulated on the stack.

     stack-left        stack-ptr                  stack-right
        v                 v                          v
+--------+--------+--------+--------+--------+--------+--------+
|        |        |        |        |        |        |        |
+--------+--------+--------+--------+--------+--------+--------+
                 ^
              stack-0

. stack-0 is the stack at entry
. stack-left and right are used to reset the stack; it avoids having
to clear all the cells.
. stack-left represents the maximum depth of the stack. It is always
<= to stack-ptr.
. stack-right is the righmost referenced stack entry, used to
determine how far into the input stack there are modifications or
references. It is always >= to stack-ptr.

The example shown above would have the stack diagram ( 3 -- 2 ).

Here's some sample debug output walking the tree in preorder, with an
example from one of the papers;  (l= and r= are left and right
pointers for each entry, op= is the primitive operator type and
addresses starting $144xxx are tree entries and those starting $806xxx
are stack addresses)

dup 10 + @ 20 + @ over 20 + @ < not

diagram ( 1 -- 2 )
stack entry $806450 -> $144C08
  $144C08 : l=$144BE8 r=$0 op=not
    $144BE8 : l=$144BC8 r=$144B68 op=<
      $144BC8 : l=$144BA8 r=$0 op=@
        $144BA8 : l=$144B88 r=$806454 op=+
          $144B88 : l=$0 r=$0 op=literal v=20
      $144B68 : l=$144B48 r=$0 op=@
        $144B48 : l=$144B28 r=$144B08 op=+
          $144B28 : l=$0 r=$0 op=literal v=20
          $144B08 : l=$144AE8 r=$0 op=@
            $144AE8 : l=$144AC8 r=$806454 op=+
              $144AC8 : l=$0 r=$0 op=literal v=10
stack entry $806454 -> $806454

The last stack entry indicates that there is a read reference to this
stack entry; it's caused by the first DUP and can be seen referenced
in the + operators.

Now, assuming that the code is generated from right to left -- for
example, if the stack diagram is ( A B -- C D E )  generate code for E
then D then C -- are there any dependencies that the stack entries may
have between each other? In other words, if E is generated first, is
there any way of D or C containing dependencies on the original value
of the stack cell that E will overwrite? I'm ignoring ! (store)
dependencies for the moment.

[toc] | [next] | [standalone]

#7690

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-02 16:36 +0000
Message-ID	<2011Dec2.173602@mips.complang.tuwien.ac.at>
In reply to	#7657

Alex McDonald <blog@rivadpm.com> writes:
>The example shown above would have the stack diagram ( 3 -- 2 ).

Do you mean something like ( x1 x2 x3 -- y1 y2 )?  If so, the stack
diagram is possible given the other stuff you wrote, but does not
follow from it (other stack diagrams are possible, too).

>Now, assuming that the code is generated from right to left -- for
>example, if the stack diagram is ( A B -- C D E )  generate code for E
>then D then C -- are there any dependencies that the stack entries may
>have between each other? In other words, if E is generated first, is
>there any way of D or C containing dependencies on the original value
>of the stack cell that E will overwrite? I'm ignoring ! (store)
>dependencies for the moment.

In ( A B -- C D E ), the easiest allocation is to allocate C at the
same place as A and D at the same place as B, and E at some new place.
Thener there obviously is no such write-after-read dependence; you
might introduce one, though, depending on how you handle intermediate
values (but you don't mean these by "original value", do you?).

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2011: http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#7698

From	Alex McDonald <blog@rivadpm.com>
Date	2011-12-02 12:43 -0800
Message-ID	<e355ab99-8b9c-42c5-a3cd-0c7337fe7d76@i6g2000vbe.googlegroups.com>
In reply to	#7690

On Dec 2, 4:36 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >The example shown above would have the stack diagram ( 3 -- 2 ).
>
> Do you mean something like ( x1 x2 x3 -- y1 y2 )?  If so, the stack
> diagram is possible given the other stuff you wrote, but does not
> follow from it (other stack diagrams are possible, too).

I've been using this shorthand for so long, I forgot to explain it.
Yes, 3 in and 2 out as you elaborate.

>
> >Now, assuming that the code is generated from right to left -- for
> >example, if the stack diagram is ( A B -- C D E )  generate code for E
> >then D then C -- are there any dependencies that the stack entries may
> >have between each other? In other words, if E is generated first, is
> >there any way of D or C containing dependencies on the original value
> >of the stack cell that E will overwrite? I'm ignoring ! (store)
> >dependencies for the moment.
>
> In ( A B -- C D E ), the easiest allocation is to allocate C at the
> same place as A and D at the same place as B, and E at some new place.
> Thener there obviously is no such write-after-read dependence; you
> might introduce one, though, depending on how you handle intermediate
> values (but you don't mean these by "original value", do you?).

Intermediate trees are the issue I was trying to describe. This;

X Y TUCK 1 +

results in the forest Y X Y+1, and the result of the tree at leftmost
Y needs to be available as a subtree of rightmost Y+1.

stack-reset
: x y1 @ y2 @ tuck 1 + ;
cr  stack-walk

diagram ( 0 -- 3 )

stack entry -3 $806448 -> $144B68
| $144B48 : op=[ literal ] v=1
| | $144B08 : op=[ literal ] v=[ y2 ]
| $144B28 : l=$144B08 op=[ @ ]
$144B68 : l=$144B48 r=$144B28 op=[ + ]

stack entry -2 $80644C -> $144AE8
| $144AC8 : op=[ literal ] v=[ y1 ]
$144AE8 : l=$144AC8 op=[ @ ]

stack entry -1 $806450 -> $144B28
| $144B08 : op=[ literal ] v=[ y2 ]
$144B28 : l=$144B08 op=[ @ ]

The entry at (-1 $806450) is a subtree of (-3 $806448) for the IL at
$144B28. In fact, anything that moves a stack entry "leftwards" --
swap, tuck, rot and so on -- can generate these dependencies. What's
the best way of managing these subtrees; marking with a reference
count?

>
> - anton
> --
> M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
>      New standard:http://www.forth200x.org/forth200x.html
>    EuroForth 2011:http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#7700

From	Pablo Hugo Reda <pabloreda@gmail.com>
Date	2011-12-02 14:42 -0800
Message-ID	<388b65f7-8224-4792-80e4-6a0b9a2d66a3@h42g2000yqd.googlegroups.com>
In reply to	#7698

Alex

I'don know if I can help but I current working in static stack
analisis,
for you first code with modification (I not have <) I generate this

:test  |  a -- bc   nivel:0 uso:-1 dR:0 dD:1
DUP           | AB ;  2010020
10              | ABC ;  30000FF
+               | AB ;  3020021 T=10 N=cpy(1)
@              | AB ;  2000010
20              | ABD ;  40000FF
+               | AB ;  4020021 T=20 N=cpy(1)
@              | AB ;  2000010
OVER        | ABE ;  5000140
20              | ABEF ;  60000FF
+               | ABE ;  6050021 T=20 N=cpy(1)
@              | ABE ;  5000010
-                | AB ;  5020020
NOT           | AB ;  2000010
;                | AB ;  D0

|-- 6 Registros --
 -2 | A =stack S -1-0
 0 | B =cpy MS<6> 0-0
 1 | C =cte ** 1-2
 1 | D =cte ** 4-5
 0 | E =cpy M<2> 7-11
 1 | F =cte ** 8-9
|-----------------
:  |   -- a   nivel:1 uso:0 dR:0 dD:1
test            | A ;  1E1     ** T=0
;               | A ;  DF      ** T=0

|-- 1 Registros --
 -2 | A =any+ S 0-0
|-----------------

My plan now is assign register to "Registros" (place in stack)

[toc] | [prev] | [next] | [standalone]

#7701

From	Alex McDonald <blog@rivadpm.com>
Date	2011-12-02 14:47 -0800
Message-ID	<4295bcfb-d936-4660-972e-ddac1d1a253d@n10g2000vbg.googlegroups.com>
In reply to	#7700

On Dec 2, 10:42 pm, Pablo Hugo Reda <pablor...@gmail.com> wrote:
> Alex
>
> I'don know if I can help but I current working in static stack
> analisis,
> for you first code with modification (I not have <) I generate this
>
> :test  |  a -- bc   nivel:0 uso:-1 dR:0 dD:1
> DUP           | AB ;  2010020
> 10              | ABC ;  30000FF
> +               | AB ;  3020021 T=10 N=cpy(1)
> @              | AB ;  2000010
> 20              | ABD ;  40000FF
> +               | AB ;  4020021 T=20 N=cpy(1)
> @              | AB ;  2000010
> OVER        | ABE ;  5000140
> 20              | ABEF ;  60000FF
> +               | ABE ;  6050021 T=20 N=cpy(1)
> @              | ABE ;  5000010
> -                | AB ;  5020020
> NOT           | AB ;  2000010
> ;                | AB ;  D0
>
> |-- 6 Registros --
>  -2 | A =stack S -1-0
>  0 | B =cpy MS<6> 0-0
>  1 | C =cte ** 1-2
>  1 | D =cte ** 4-5
>  0 | E =cpy M<2> 7-11
>  1 | F =cte ** 8-9
> |-----------------
> :  |   -- a   nivel:1 uso:0 dR:0 dD:1
> test            | A ;  1E1     ** T=0
> ;               | A ;  DF      ** T=0
>
> |-- 1 Registros --
>  -2 | A =any+ S 0-0
> |-----------------
>
> My plan now is assign register to "Registros" (place in stack)

What do you get for

variable y1
variable y2
: y1 y2 tuck 1 + ;

?

[toc] | [prev] | [next] | [standalone]

#7704

From	Pablo Hugo Reda <pabloreda@gmail.com>
Date	2011-12-02 15:02 -0800
Message-ID	<e7bb7ca6-867e-45c4-a0f1-db606d4006a3@h3g2000yqa.googlegroups.com>
In reply to	#7701

> What do you get for
>
> variable y1
> variable y2
> : y1 y2 tuck 1 + ;
>
> ?

in my case y1 get the value, for adress I use 'y1...

#y1 #y2
:test2
y1 y2 swap over 1 + ;

:test2  |   -- abc   nivel:0 uso:0 dR:0 dD:3
y1              | A ;  10000F0
y2              | AB ;  20000F0
SWAP            | BA ;  2010020
OVER            | BAC ;  3000240
1               | BACD ;  40000FF
+               | BAC ;  4030021 T=1 N=cpy(2)
;               | BAC ;  D0

|-- 4 Registros --
 -2 | A =var S 0-0
 -2 | B =var S 1-0
 0 | C =cpy S<1> 3-0
 1 | D =cte ** 4-5
|-----------------

[toc] | [prev] | [next] | [standalone]

#7730

From	Alex McDonald <blog@rivadpm.com>
Date	2011-12-04 18:00 -0800
Message-ID	<b7a562e4-1c74-4133-9ad0-b93ff9c20d7b@y18g2000yqy.googlegroups.com>
In reply to	#7698

On Dec 2, 3:43 pm, Alex McDonald <b...@rivadpm.com> wrote:
> On Dec 2, 4:36 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> wrote:
>
> > Alex McDonald <b...@rivadpm.com> writes:
> > >The example shown above would have the stack diagram ( 3 -- 2 ).
>
> > Do you mean something like ( x1 x2 x3 -- y1 y2 )?  If so, the stack
> > diagram is possible given the other stuff you wrote, but does not
> > follow from it (other stack diagrams are possible, too).
>
> I've been using this shorthand for so long, I forgot to explain it.
> Yes, 3 in and 2 out as you elaborate.
>
>
>
> > >Now, assuming that the code is generated from right to left -- for
> > >example, if the stack diagram is ( A B -- C D E )  generate code for E
> > >then D then C -- are there any dependencies that the stack entries may
> > >have between each other? In other words, if E is generated first, is
> > >there any way of D or C containing dependencies on the original value
> > >of the stack cell that E will overwrite? I'm ignoring ! (store)
> > >dependencies for the moment.
>
> > In ( A B -- C D E ), the easiest allocation is to allocate C at the
> > same place as A and D at the same place as B, and E at some new place.
> > Thener there obviously is no such write-after-read dependence; you
> > might introduce one, though, depending on how you handle intermediate
> > values (but you don't mean these by "original value", do you?).
>
> Intermediate trees are the issue I was trying to describe. This;
>
> X Y TUCK 1 +
>
> results in the forest Y X Y+1, and the result of the tree at leftmost
> Y needs to be available as a subtree of rightmost Y+1.
>
> stack-reset
> : x y1 @ y2 @ tuck 1 + ;
> cr  stack-walk
>
> diagram ( 0 -- 3 )
>
> stack entry -3 $806448 -> $144B68
> | $144B48 : op=[ literal ] v=1
> | | $144B08 : op=[ literal ] v=[ y2 ]
> | $144B28 : l=$144B08 op=[ @ ]
> $144B68 : l=$144B48 r=$144B28 op=[ + ]
>
> stack entry -2 $80644C -> $144AE8
> | $144AC8 : op=[ literal ] v=[ y1 ]
> $144AE8 : l=$144AC8 op=[ @ ]
>
> stack entry -1 $806450 -> $144B28
> | $144B08 : op=[ literal ] v=[ y2 ]
> $144B28 : l=$144B08 op=[ @ ]
>
> The entry at (-1 $806450) is a subtree of (-3 $806448) for the IL at
> $144B28. In fact, anything that moves a stack entry "leftwards" --
> swap, tuck, rot and so on -- can generate these dependencies. What's
> the best way of managing these subtrees; marking with a reference
> count?
>
>
>
>
>
>
>
>
>
> > - anton
> > --
> > M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> > comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
> >      New standard:http://www.forth200x.org/forth200x.html
> >    EuroForth 2011:http://www.euroforth.org/ef11/

Ignore this. I've just read and re-read the RAFTS papers, and I'm
ahead of myself here.

[toc] | [prev] | [next] | [standalone]

#7763

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-06 15:44 +0000
Message-ID	<2011Dec6.164419@mips.complang.tuwien.ac.at>
In reply to	#7698

Alex McDonald <blog@rivadpm.com> writes:
>X Y TUCK 1 +
>
>results in the forest Y X Y+1, and the result of the tree at leftmost
>Y needs to be available as a subtree of rightmost Y+1.
...
>The entry at (-1 $806450) is a subtree of (-3 $806448) for the IL at
>$144B28. In fact, anything that moves a stack entry "leftwards" --
>swap, tuck, rot and so on -- can generate these dependencies. What's
>the best way of managing these subtrees; marking with a reference
>count?

I am not sure what you mean with "managing" here.

Anyway, I guess one thing you are wondering about is how to generate
code if you allocate the stack items in a way that leads to cycles in
the dependence graph coming from write-after-read dependences.  That's
what Stephen Pelc calls the generic stack shuffle or somesuch.

I asked the question many years ago in comp.compilers, but cannot find
the thread now.  Anyway, the answer I got pointed to [burger+95] IIRC.
"Greedy shuffling" is the interesting part here: Build a dependence
graph; if there are no cycles, do a topological sort and process the
nodes in that order.  If there is a cycle, you have to break it by
introducing a move of one of the things in the cycle to a temporary
location; repeat until you have no more cycles, then do the
topological sort etc.

An alternative, simpler approach that might avoid the need for
complicated graph algorithms, but typically introduce more moves is to
introduce moves from all the input items to temporary locations (or
from temporary locations to the output items) right at the start, then
generate code, and try to put the temporary location in the
corresponding input (or output) location during register allocation
(this is called coalescing in register allocation literature) if
possible (it's not possible for all temporary locations if there are
cycles).

@InProceedings{burger+95,
  author = 	 {Robert G. Burger and Oscar Waddell and R. Kent Dybvig},
  title = 	 {Register Allocation Using Lazy Saves, Eager
                  Restores, and Greedy Shuffling},
  crossref =	 "sigplan95",
  pages =	 {130--138}
}

@Proceedings{sigplan95,
  booktitle = 	 "SIGPLAN '95 Conference on Programming Language
		  Design and Implementation",
  title = 	 "SIGPLAN '95 Conference on Programming Language
		  Design and Implementation",
  year = 	 "1995",
  key = 	 "SIGPLAN '95"
}

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2011: http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#7770

From	Alex McDonald <blog@rivadpm.com>
Date	2011-12-06 12:27 -0800
Message-ID	<b4228b60-68c3-46dc-a841-3feb52d33033@z12g2000yqm.googlegroups.com>
In reply to	#7763

On Dec 6, 10:44 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >X Y TUCK 1 +
>
> >results in the forest Y X Y+1, and the result of the tree at leftmost
> >Y needs to be available as a subtree of rightmost Y+1.
> ...
> >The entry at (-1 $806450) is a subtree of (-3 $806448) for the IL at
> >$144B28. In fact, anything that moves a stack entry "leftwards" --
> >swap, tuck, rot and so on -- can generate these dependencies. What's
> >the best way of managing these subtrees; marking with a reference
> >count?
>
> I am not sure what you mean with "managing" here.

A figure of speech; how should they be processed...

>
> Anyway, I guess one thing you are wondering about is how to generate
> code if you allocate the stack items in a way that leads to cycles in
> the dependence graph coming from write-after-read dependences.  That's
> what Stephen Pelc calls the generic stack shuffle or somesuch.

Yes, the dreaded : foo rot ; I discovered cycles when I walked that
and waited forever, and I hadn't spotted a reference to them in the
RAFTS papers.

There are also read-after-write dependencies; for example, the
(artificial) : foo var 10 over ! @ ;

For dependency cases I maintain a left-to-right ordered list of
interleaved stores and fetches, which I will add when walking the
forest on the pseudo stacks. Deciding what comes before and after
appears trivial, since the entries generated for both IL entries and
the dependency list are allocated at ascending addresses as we move
from left to right through the parse; the dependencies and the trees
can be subsequently processed in that order.

>
> I asked the question many years ago in comp.compilers, but cannot find
> the thread now.  Anyway, the answer I got pointed to [burger+95] IIRC.
> "Greedy shuffling" is the interesting part here: Build a dependence
> graph; if there are no cycles, do a topological sort and process the
> nodes in that order.  If there is a cycle, you have to break it by
> introducing a move of one of the things in the cycle to a temporary
> location; repeat until you have no more cycles, then do the
> topological sort etc.

I found some interesting literature on cycle detection and removing
one of the edges. It's relatively straightforward to detect and
remove; what's less clear is which edge to remove, and I'm not yet
sure if it matters or not. I'll read the links you provide. There's
also this paper; http://www.me.utexas.edu/~bard/IP/Handouts/cycles.pdf

>
> An alternative, simpler approach that might avoid the need for
> complicated graph algorithms, but typically introduce more moves is to
> introduce moves from all the input items to temporary locations (or
> from temporary locations to the output items) right at the start, then
> generate code, and try to put the temporary location in the
> corresponding input (or output) location during register allocation
> (this is called coalescing in register allocation literature) if
> possible (it's not possible for all temporary locations if there are
> cycles).
>
> @InProceedings{burger+95,
>   author =       {Robert G. Burger and Oscar Waddell and R. Kent Dybvig},
>   title =        {Register Allocation Using Lazy Saves, Eager
>                   Restores, and Greedy Shuffling},
>   crossref =     "sigplan95",
>   pages =        {130--138}
>
> }
>
> @Proceedings{sigplan95,
>   booktitle =    "SIGPLAN '95 Conference on Programming Language
>                   Design and Implementation",
>   title =        "SIGPLAN '95 Conference on Programming Language
>                   Design and Implementation",
>   year =         "1995",
>   key =          "SIGPLAN '95"
>
> }
>
> - anton
> --
> M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
>      New standard:http://www.forth200x.org/forth200x.html
>    EuroForth 2011:http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#7771

From	Alex McDonald <blog@rivadpm.com>
Date	2011-12-06 14:27 -0800
Message-ID	<52caa9b2-7666-441e-83bf-dbbb332263e8@x7g2000yqb.googlegroups.com>
In reply to	#7770

On Dec 6, 3:27 pm, Alex McDonald <b...@rivadpm.com> wrote:
> On Dec 6, 10:44 am, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> wrote:
>
> > Alex McDonald <b...@rivadpm.com> writes:
> > >X Y TUCK 1 +
>
> > >results in the forest Y X Y+1, and the result of the tree at leftmost
> > >Y needs to be available as a subtree of rightmost Y+1.
> > ...
> > >The entry at (-1 $806450) is a subtree of (-3 $806448) for the IL at
> > >$144B28. In fact, anything that moves a stack entry "leftwards" --
> > >swap, tuck, rot and so on -- can generate these dependencies. What's
> > >the best way of managing these subtrees; marking with a reference
> > >count?
>
> > I am not sure what you mean with "managing" here.
>
> A figure of speech; how should they be processed...
>
>
>
> > Anyway, I guess one thing you are wondering about is how to generate
> > code if you allocate the stack items in a way that leads to cycles in
> > the dependence graph coming from write-after-read dependences.  That's
> > what Stephen Pelc calls the generic stack shuffle or somesuch.
>
> Yes, the dreaded : foo rot ; I discovered cycles when I walked that
> and waited forever, and I hadn't spotted a reference to them in the
> RAFTS papers.
>
> There are also read-after-write dependencies; for example, the
> (artificial) : foo var 10 over ! @ ;
>
> For dependency cases I maintain a left-to-right ordered list of
> interleaved stores and fetches, which I will add when walking the
> forest on the pseudo stacks. Deciding what comes before and after
> appears trivial, since the entries generated for both IL entries and
> the dependency list are allocated at ascending addresses as we move
> from left to right through the parse; the dependencies and the trees
> can be subsequently processed in that order.
>
>
>
> > I asked the question many years ago in comp.compilers, but cannot find
> > the thread now.  Anyway, the answer I got pointed to [burger+95] IIRC.
> > "Greedy shuffling" is the interesting part here: Build a dependence
> > graph; if there are no cycles, do a topological sort and process the
> > nodes in that order.  If there is a cycle, you have to break it by
> > introducing a move of one of the things in the cycle to a temporary
> > location; repeat until you have no more cycles, then do the
> > topological sort etc.
>
> I found some interesting literature on cycle detection and removing
> one of the edges. It's relatively straightforward to detect and
> remove; what's less clear is which edge to remove, and I'm not yet
> sure if it matters or not. I'll read the links you provide. There's
> also this paper;http://www.me.utexas.edu/~bard/IP/Handouts/cycles.pdf
>
>

From the paper you referenced;

If we detect a cycle, we find the argument causing the largest number
of dependencies and remove it, placing it into a temporary location in
hopes of breaking the cycle.

Which would tend to indicate that a reference counter may be helpful
in deciding on the edge to remove.

>
>
>
>
>
>
>
> > An alternative, simpler approach that might avoid the need for
> > complicated graph algorithms, but typically introduce more moves is to
> > introduce moves from all the input items to temporary locations (or
> > from temporary locations to the output items) right at the start, then
> > generate code, and try to put the temporary location in the
> > corresponding input (or output) location during register allocation
> > (this is called coalescing in register allocation literature) if
> > possible (it's not possible for all temporary locations if there are
> > cycles).
>
> > @InProceedings{burger+95,
> >   author =       {Robert G. Burger and Oscar Waddell and R. Kent Dybvig},
> >   title =        {Register Allocation Using Lazy Saves, Eager
> >                   Restores, and Greedy Shuffling},
> >   crossref =     "sigplan95",
> >   pages =        {130--138}
>
> > }
>
> > @Proceedings{sigplan95,
> >   booktitle =    "SIGPLAN '95 Conference on Programming Language
> >                   Design and Implementation",
> >   title =        "SIGPLAN '95 Conference on Programming Language
> >                   Design and Implementation",
> >   year =         "1995",
> >   key =          "SIGPLAN '95"
>
> > }
>
> > - anton
> > --
> > M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> > comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
> >      New standard:http://www.forth200x.org/forth200x.html
> >    EuroForth 2011:http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#7781

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-07 13:15 +0000
Message-ID	<2011Dec7.141527@mips.complang.tuwien.ac.at>
In reply to	#7770

Alex McDonald <blog@rivadpm.com> writes:
>Yes, the dreaded : foo rot ; I discovered cycles when I walked that
>and waited forever, and I hadn't spotted a reference to them in the
>RAFTS papers.

Actually, there's a pretty detailed description in Section 4.4.1 of
<http://www.complang.tuwien.ac.at/papers/ertl&pirker97.ps.gz>.  If you
have any questions about that, just ask.

>I found some interesting literature on cycle detection and removing
>one of the edges. It's relatively straightforward to detect and
>remove; what's less clear is which edge to remove, and I'm not yet
>sure if it matters or not.

It does not matter wrt correctness, but it does matter wrt
performance.  Burger et al. (see my last posting) give a heuristic for
selecting a low-cost edge to break.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2011: http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#8755 — Re: Help with RAFTS-like optimiser (long-ish)

From	Alex McDonald <blog@rivadpm.com>
Date	2012-01-09 05:50 -0800
Subject	Re: Help with RAFTS-like optimiser (long-ish)
Message-ID	<b59320c4-81cb-482f-baec-53b1268d6a13@k5g2000vba.googlegroups.com>
In reply to	#7698

On Dec 2 2011, 8:43 pm, Alex McDonald <b...@rivadpm.com> wrote:

[snip]

Progress report & a question.

Sample output below. The DAG building phase for basic blocks is nearly
complete (although I'm still playing with varieties of IL opcodes).
The code has a certain amount of refinement from the previous
incarnation that's done during this first pass;

. Duplicate literal loading elimination; all references to a literal
are encoded as a single virtual register load
. Constant folding;
    1 2 + is reduced to 3
. Dead store elimination (for literals and variables directly or
through the stack);
    10 Y1 ! 20 Y1 !  is reduced to 20 Y1 !
. Redundant fetch elimination and correct handling of store/fetch/
store ordering on the same variable (again, for literals and variables
directly or through the stack);
    Y1 ! Y1 @ 20 Y1 ! is reduced to 20 Y1 !
. Reference counting for IL nodes and define/use/release information
for virtual regs are implemented.

The core minus debugging stuff is about 250 LOC, and it's surprisingly
clean (wow! a first for me) although not ANS compliant as it currently
uses smart compilation features of my Forth to cause compilation to
trigger the optimiser.

The question (which is I suppose for Anton; please advise me if you'd
prefer a direct email?).

As the IL structures are assigned from ascending addresses as the
parse proceeds, by sorting the pointers from the stack, the rstack and
the store list into ascending order I generate a valid DAG. So all
stores/fetches etc appear to be executed in the correct order using a
preorder descent (with visit marking to prevent duplicate generation)
through each tree in the DAG.

The IL field "il-depends (dependencies for scheduling)" in the paper
below I have dispensed with because of this technique. Am I
misinterpreting what this field is for?

[EP97] M. Anton Ertl and Christian Pirker. The structure of a Forth
native code compiler. In EuroForth '97 Conference Proceedings, pages
107-116, Oxford, 1997. http://www.complang.tuwien.ac.at/papers/ertl%26pirker97.ps.gz


Examples ($80xxxx are variable addresses. $14Fxxxx are IL nodes for
debugging; ignore them);

: x3 5 y1 ! y1 @ 4 + y1 ! y1 @ ;

diagram ( S: 0 -- 1 R: 0 -- 0 )

---- generation
  1: [$14F00D0] r1 <-- $9 literal
  2: [$14F004C] r2 <-- $8068F4 literal
  3: [$14F00FC]  --> r1 ! r2
  4: S[-1] <-- r1 i-sp!
  5: rsp <-- i-sp+ -1
  6: rrp <-- i-rp+ 0
---- end generation
regs required: 2


: x1 y1 @ y2 @ tuck 1 + ;

diagram ( S: 0 -- 3 R: 0 -- 0 )

---- generation
  1: [$14F0020] r1 <-- $8068F4 literal
  2: [$14F004C] r2 <-- @ r1
  3: [$14F0078] r3 <-- $8068F8 literal
  4: [$14F00A4] r4 <-- @ r3
  5: [$14F00D0] r5 <-- $1 literal
  6: [$14F00FC] r6 <-- r4 + r5
  7: S[-3] <-- r6 i-sp!
  8: S[-1] <-- r4 i-sp!
  9: S[-2] <-- r2 i-sp!
  10: rsp <-- i-sp+ -3
  11: rrp <-- i-rp+ 0
---- end generation
regs required: 6


: x2 y3 over dup  y1 @ + y2 @ + < 0= ! ;

diagram ( S: 1 -- 1 R: 0 -- 0 )

---- generation
  1: [$14F018C] r1 <-- $0 literal
  2: [$14F00EC] r2 <-- $8068F8 literal
  3: [$14F0118] r3 <-- @ r2
  4: [$14F0070] r4 <-- $8068F4 literal
  5: [$14F009C] r5 <-- @ r4
  6: [$14F004C] r6 <-- S[0] i-sp@
  7: [$14F00C8] r7 <-- r6 + r5
  8: [$14F0144] r8 <-- r7 + r3
  9: [$14F0168] r9 <-- r6 < r8
  10: [$14F01B8] r10 <-- r9 = r1
  11: [$14F0020] r11 <-- $8068FC literal
  12: [$14F01DC]  --> r11 ! r10
  13: rsp <-- i-sp+ 0
  14: rrp <-- i-rp+ 0
---- end generation
regs required: 11


: x4 1 2 + y1 ! 3 tuck ; .s

diagram ( S: 1 -- 3 R: 0 -- 0 )

---- generation
  1: [$14F0078] r1 <-- $3 literal
  2: [$14F00A4] r2 <-- $8068F4 literal
  3: [$14F00D0]  --> r1 ! r2
  4: [$14F00FC] r3 <-- S[0] i-sp@
  5: S[-1] <-- r3 i-sp!
  6: S[0] <-- r1 i-sp!
  7: S[-2] <-- r1 i-sp!
  8: rsp <-- i-sp+ -2
  9: rrp <-- i-rp+ 0
---- end generation
regs required: 3


: x6 10 2>r r@ 2r> r> ;

diagram ( S: 1 -- 4 R: 1 -- 0 )

---- generation
  1: [$14F0020] r1 <-- $10 literal
  2: [$14F004C] r2 <-- S[0] i-sp@
  3: [$14F0070] r3 <-- R[0] i-rp@
  4: S[-3] <-- r3 i-sp!
  5: S[-1] <-- r2 i-sp!
  6: S[0] <-- r1 i-sp!
  7: S[-2] <-- r1 i-sp!
  8: rsp <-- i-sp+ -3
  9: rrp <-- i-rp+ 1
---- end generation
regs required: 3


: x9 dup 1+! @ ;

diagram ( S: 1 -- 1 R: 0 -- 0 )

---- generation
  1: [$14F0044] r1 <-- $1 literal
  2: [$14F0020] r2 <-- S[0] i-sp@
  3: [$14F0070] r3 <-- @ r2
  4: [$14F009C] r4 <-- r3 + r1
  5: [$14F00C0]  --> r4 ! r2
  6: S[0] <-- r4 i-sp!
  7: rsp <-- i-sp+ 0
  8: rrp <-- i-rp+ 0
---- end generation
regs required: 4

[toc] | [prev] | [next] | [standalone]

#8757 — Re: Help with RAFTS-like optimiser (long-ish)

From	Alex McDonald <blog@rivadpm.com>
Date	2012-01-09 06:09 -0800
Subject	Re: Help with RAFTS-like optimiser (long-ish)
Message-ID	<29c9327d-f774-4967-bfad-16c9af06ef58@z19g2000vbe.googlegroups.com>
In reply to	#8755

On Jan 9, 1:50 pm, Alex McDonald <b...@rivadpm.com> wrote:
> On Dec 2 2011, 8:43 pm, Alex McDonald <b...@rivadpm.com> wrote:
>
> [snip]
[snip]
>
> As the IL structures are assigned from ascending addresses as the
> parse proceeds, by sorting the pointers from the stack, the rstack and
> the store list into ascending order I generate a valid DAG. So all
> stores/fetches etc appear to be executed in the correct order using a
> preorder descent (with visit marking to prevent duplicate generation)

.^^^^^^^^^

Post order.

[snip]

> through each tree in the DAG.
>

[toc] | [prev] | [next] | [standalone]

#8758 — Re: Help with RAFTS-like optimiser (long-ish)

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2012-01-09 15:07 +0000
Subject	Re: Help with RAFTS-like optimiser (long-ish)
Message-ID	<2012Jan9.160727@mips.complang.tuwien.ac.at>
In reply to	#8755

Alex McDonald <blog@rivadpm.com> writes:
>The question (which is I suppose for Anton; please advise me if you'd
>prefer a direct email?).
>
>As the IL structures are assigned from ascending addresses as the
>parse proceeds, by sorting the pointers from the stack, the rstack and
>the store list into ascending order I generate a valid DAG. So all
>stores/fetches etc appear to be executed in the correct order using a
>preorder descent (with visit marking to prevent duplicate generation)
>through each tree in the DAG.
>
>The IL field "il-depends (dependencies for scheduling)" in the paper
>below I have dispensed with because of this technique. Am I
>misinterpreting what this field is for?

I think you have it right.

The advantage of your method is that, if you somehow keep the order
across code selection, you don't need to worry about translating the
dependence edges from the IL graph to the ML graph.  Another advantage
is that the result is a valid schedule already, so you could skip
instruction scheduling (especially on OoO machines where the usual
scheduling does not buy much).  And some of the algorithms that walk
the DAG could just visit the nodes sequentially.

The disadvantage is that you now have to work quite a bit more to
distinguish dependences from incidential ordering.  This might be
particularly troublesome in cases where the instruction selector may
need to forego combining several IL nodes because of dependences.
Here's an untested example:

variable a
variable b

: foo ( addr1 addr2 -- )
  over @ 0 swap ! 1+ swap ! ;
\ this ^         and this ^ must not be combined

T{ 5 a ! a b foo a @ b @ -> 6 0 }T
T{ 5 a ! a a foo a @ -> 6 }T

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2011: http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

#8761 — Re: Help with RAFTS-like optimiser (long-ish)

From	Alex McDonald <blog@rivadpm.com>
Date	2012-01-09 13:17 -0800
Subject	Re: Help with RAFTS-like optimiser (long-ish)
Message-ID	<48a98d10-8225-4a89-af8c-dcf319ff7262@o14g2000vbo.googlegroups.com>
In reply to	#8758

On Jan 9, 3:07 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >The question (which is I suppose for Anton; please advise me if you'd
> >prefer a direct email?).
>
> >As the IL structures are assigned from ascending addresses as the
> >parse proceeds, by sorting the pointers from the stack, the rstack and
> >the store list into ascending order I generate a valid DAG. So all
> >stores/fetches etc appear to be executed in the correct order using a
> >preorder descent (with visit marking to prevent duplicate generation)
> >through each tree in the DAG.
>
> >The IL field "il-depends (dependencies for scheduling)" in the paper
> >below I have dispensed with because of this technique. Am I
> >misinterpreting what this field is for?
>
> I think you have it right.
>
> The advantage of your method is that, if you somehow keep the order
> across code selection, you don't need to worry about translating the
> dependence edges from the IL graph to the ML graph.  Another advantage
> is that the result is a valid schedule already, so you could skip
> instruction scheduling (especially on OoO machines where the usual
> scheduling does not buy much).  And some of the algorithms that walk
> the DAG could just visit the nodes sequentially.
>
> The disadvantage is that you now have to work quite a bit more to
> distinguish dependences from incidential ordering.  This might be
> particularly troublesome in cases where the instruction selector may
> need to forego combining several IL nodes because of dependences.
> Here's an untested example:
>
> variable a
> variable b
>
> : foo ( addr1 addr2 -- )
>   over @ 0 swap ! 1+ swap ! ;
> \ this ^         and this ^ must not be combined
>
> T{ 5 a ! a b foo a @ b @ -> 6 0 }T
> T{ 5 a ! a a foo a @ -> 6 }T
>
> - anton
> --
> M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
>      New standard:http://www.forth200x.org/forth200x.html
>    EuroForth 2011:http://www.euroforth.org/ef11/

Thanks very much for the help. Yes, there's a problem with the
dependencies; this

 : foo ( addr1 addr2 -- )
  over @ 0 rot ! 1+ swap ! ;

returns 1 for the   5 a ! a a foo a @ .   case. I think it's fixable
without too much effort. (I did post a much longer response
demonstrating why the code was flawed, but I know you know why so I
shan't repeat it. It's disappeared; perhaps google groups will post it
shortly.)

[toc] | [prev] | [next] | [standalone]

#8770 — Re: Help with RAFTS-like optimiser (long-ish)

From	Alex McDonald <blog@rivadpm.com>
Date	2012-01-10 08:53 -0800
Subject	Re: Help with RAFTS-like optimiser (long-ish)
Message-ID	<0e5fbff9-2609-4521-9fc8-1b38f902b575@d10g2000vbh.googlegroups.com>
In reply to	#8761

On Jan 9, 9:17 pm, Alex McDonald <b...@rivadpm.com> wrote:
> On Jan 9, 3:07 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> wrote:
>
>
>
>
>
>
>
>
>
> > Alex McDonald <b...@rivadpm.com> writes:
> > >The question (which is I suppose for Anton; please advise me if you'd
> > >prefer a direct email?).
>
> > >As the IL structures are assigned from ascending addresses as the
> > >parse proceeds, by sorting the pointers from the stack, the rstack and
> > >the store list into ascending order I generate a valid DAG. So all
> > >stores/fetches etc appear to be executed in the correct order using a
> > >preorder descent (with visit marking to prevent duplicate generation)
> > >through each tree in the DAG.
>
> > >The IL field "il-depends (dependencies for scheduling)" in the paper
> > >below I have dispensed with because of this technique. Am I
> > >misinterpreting what this field is for?
>
> > I think you have it right.
>
> > The advantage of your method is that, if you somehow keep the order
> > across code selection, you don't need to worry about translating the
> > dependence edges from the IL graph to the ML graph.  Another advantage
> > is that the result is a valid schedule already, so you could skip
> > instruction scheduling (especially on OoO machines where the usual
> > scheduling does not buy much).  And some of the algorithms that walk
> > the DAG could just visit the nodes sequentially.
>
> > The disadvantage is that you now have to work quite a bit more to
> > distinguish dependences from incidential ordering.  This might be
> > particularly troublesome in cases where the instruction selector may
> > need to forego combining several IL nodes because of dependences.
> > Here's an untested example:
>
> > variable a
> > variable b
>
> > : foo ( addr1 addr2 -- )
> >   over @ 0 swap ! 1+ swap ! ;
> > \ this ^         and this ^ must not be combined
>
> > T{ 5 a ! a b foo a @ b @ -> 6 0 }T
> > T{ 5 a ! a a foo a @ -> 6 }T
>
> > - anton
> > --
> > M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
> > comp.lang.forth FAQs:http://www.complang.tuwien.ac.at/forth/faq/toc.html
> >      New standard:http://www.forth200x.org/forth200x.html
> >    EuroForth 2011:http://www.euroforth.org/ef11/
>
> Thanks very much for the help. Yes, there's a problem with the
> dependencies; this
>
>  : foo ( addr1 addr2 -- )
>   over @ 0 rot ! 1+ swap ! ;
>
> returns 1 for the   5 a ! a a foo a @ .   case. I think it's fixable
> without too much effort. (I did post a much longer response
> demonstrating why the code was flawed, but I know you know why so I
> shan't repeat it. It's disappeared; perhaps google groups will post it
> shortly.)

Fixed.

 : foo1 ( addr1 addr2 -- )
  over @ 0 rot ! 1+ swap ! ;

diagram ( S: 2 -- 0 R: 0 -- 0 )

---- generation
  1: [$14F0020] r5 <-- i-sp@ S[0]
  2: [$14F0044] r1 <-- i-sp@ S[1]
  3: [$14F0068] r3 <-- r1 @
  4: [$14F0094] r6 <-- literal $0
  5: [$14F00C0]  <-- r5 ! r6
  6: [$14F00EC] r2 <-- literal $1
  7: [$14F0118] r4 <-- r2 + r3
  8: [$14F013C]  <-- r1 ! r4
  9: [$14F0168]  -->  i-sp+ 2
---- end generation

5 a ! a a foo a @

1: r5 <-- &a    2: r1 <-- &a    3: r3 <-- 5    4: r6 <-- 0
5: 0 a !        6: r2 <-- 1     7: r4 <-- 6    8: 6 a !

The ! acts to generate through-stack fetch dependencies; in this case
line 5: for 3:.

[toc] | [prev] | [next] | [standalone]

#8908 — Re: Help with RAFTS-like optimiser (long-ish)

From	Alex McDonald <blog@rivadpm.com>
Date	2012-01-16 07:14 -0800
Subject	Re: Help with RAFTS-like optimiser (long-ish)
Message-ID	<bd30bd4f-5dd1-4295-bf5b-bdba0a42bb77@h3g2000yqe.googlegroups.com>
In reply to	#8758

On Jan 9, 3:07 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >The question (which is I suppose for Anton; please advise me if you'd
> >prefer a direct email?).
>
> >As the IL structures are assigned from ascending addresses as the
> >parse proceeds, by sorting the pointers from the stack, the rstack and
> >the store list into ascending order I generate a valid DAG. So all
> >stores/fetches etc appear to be executed in the correct order using a
> >preorder descent (with visit marking to prevent duplicate generation)
> >through each tree in the DAG.
>
> >The IL field "il-depends (dependencies for scheduling)" in the paper
> >below I have dispensed with because of this technique. Am I
> >misinterpreting what this field is for?
>
> I think you have it right.
>
> The advantage of your method is that, if you somehow keep the order
> across code selection, you don't need to worry about translating the
> dependence edges from the IL graph to the ML graph.  Another advantage
> is that the result is a valid schedule already, so you could skip
> instruction scheduling (especially on OoO machines where the usual
> scheduling does not buy much).  And some of the algorithms that walk
> the DAG could just visit the nodes sequentially.
>
> The disadvantage is that you now have to work quite a bit more to
> distinguish dependences from incidential ordering.  This might be
> particularly troublesome in cases where the instruction selector may
> need to forego combining several IL nodes because of dependences.

And I'm re-writing, since although I fixed the problem, the resulting
structures left little flexibility and generated spurious dependencies
over an explicit dependency chain, and this approach has made my job
harder.

I'm also discovering that a good IL is important, and that a
manageable BURG demands that it has some features I had ignored; for
example, flags and scheduling dependencies on them.

[toc] | [prev] | [next] | [standalone]

#7714

From	George Hubert <georgeahubert@yahoo.co.uk>
Date	2011-12-04 02:37 -0800
Message-ID	<cb89550a-baba-4eec-94d3-07ab540b62ef@o13g2000vbo.googlegroups.com>
In reply to	#7690

On Dec 2, 4:36 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
wrote:
> Alex McDonald <b...@rivadpm.com> writes:
> >The example shown above would have the stack diagram ( 3 -- 2 ).
>
> Do you mean something like ( x1 x2 x3 -- y1 y2 )?  If so, the stack
> diagram is possible given the other stuff you wrote, but does not
> follow from it (other stack diagrams are possible, too).
>
> >Now, assuming that the code is generated from right to left -- for
> >example, if the stack diagram is ( A B -- C D E )  generate code for E
> >then D then C -- are there any dependencies that the stack entries may
> >have between each other? In other words, if E is generated first, is
> >there any way of D or C containing dependencies on the original value
> >of the stack cell that E will overwrite? I'm ignoring ! (store)
> >dependencies for the moment.
>
> In ( A B -- C D E ), the easiest allocation is to allocate C at the
> same place as A and D at the same place as B, and E at some new place.
> Thener there obviously is no such write-after-read dependence; you
> might introduce one, though, depending on how you handle intermediate
> values (but you don't mean these by "original value", do you?).
>
> - anton

That assumes that the TOS isn't cached in a register. If TOS is cached
then E would be in the same place as B (the register) and D would be
the
'new' location (probably under C since stacks tend to grow downwards).

George Hubert

[toc] | [prev] | [next] | [standalone]

#7724

From	Alex McDonald <blog@rivadpm.com>
Date	2011-12-04 11:13 -0800
Message-ID	<cb8424fb-5b19-4e20-b9be-126c797dd252@h5g2000yqk.googlegroups.com>
In reply to	#7714

On Dec 4, 10:37 am, George Hubert <georgeahub...@yahoo.co.uk> wrote:
> On Dec 2, 4:36 pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
> wrote:
>
>
>
>
>
>
>
>
>
> > Alex McDonald <b...@rivadpm.com> writes:
> > >The example shown above would have the stack diagram ( 3 -- 2 ).
>
> > Do you mean something like ( x1 x2 x3 -- y1 y2 )?  If so, the stack
> > diagram is possible given the other stuff you wrote, but does not
> > follow from it (other stack diagrams are possible, too).
>
> > >Now, assuming that the code is generated from right to left -- for
> > >example, if the stack diagram is ( A B -- C D E )  generate code for E
> > >then D then C -- are there any dependencies that the stack entries may
> > >have between each other? In other words, if E is generated first, is
> > >there any way of D or C containing dependencies on the original value
> > >of the stack cell that E will overwrite? I'm ignoring ! (store)
> > >dependencies for the moment.
>
> > In ( A B -- C D E ), the easiest allocation is to allocate C at the
> > same place as A and D at the same place as B, and E at some new place.
> > Thener there obviously is no such write-after-read dependence; you
> > might introduce one, though, depending on how you handle intermediate
> > values (but you don't mean these by "original value", do you?).
>
> > - anton
>
> That assumes that the TOS isn't cached in a register. If TOS is cached
> then E would be in the same place as B (the register) and D would be
> the
> 'new' location (probably under C since stacks tend to grow downwards).
>
> George Hubert

That's easy to address, since the simulation of the stack generates
"slot numbers" on the stack. How that is mapped to the physical stack
is left up to the implementer.

[toc] | [prev] | [next] | [standalone]

#7766

From	anton@mips.complang.tuwien.ac.at (Anton Ertl)
Date	2011-12-06 17:43 +0000
Message-ID	<2011Dec6.184304@mips.complang.tuwien.ac.at>
In reply to	#7714

George Hubert <georgeahubert@yahoo.co.uk> writes:
>On Dec 2, 4:36=A0pm, an...@mips.complang.tuwien.ac.at (Anton Ertl)
>wrote:
>> In ( A B -- C D E ), the easiest allocation is to allocate C at the
>> same place as A and D at the same place as B, and E at some new place.
>> Thener there obviously is no such write-after-read dependence; you
>> might introduce one, though, depending on how you handle intermediate
>> values (but you don't mean these by "original value", do you?).
>>
>> - anton
>
>That assumes that the TOS isn't cached in a register.

It assumes that E is in a different place than A B C D; that place may
be a register, however.  But yes, if you have a simple convention at
the register allocation boundaries of having the top n items in registers
R1...Rn and the rest in memory, then E will be in the same place as B.

- anton
-- 
M. Anton Ertl  http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
     New standard: http://www.forth200x.org/forth200x.html
   EuroForth 2011: http://www.euroforth.org/ef11/

[toc] | [prev] | [next] | [standalone]

Page 1 of 2 [1] 2 Next page →

csiph-web

Help with RAFTS-like optimiser

Contents

#7657 — Help with RAFTS-like optimiser

#7690

#7698

#7700

#7701

#7704

#7730

#7763

#7770

#7771

#7781

#8755 — Re: Help with RAFTS-like optimiser (long-ish)

#8757 — Re: Help with RAFTS-like optimiser (long-ish)

#8758 — Re: Help with RAFTS-like optimiser (long-ish)

#8761 — Re: Help with RAFTS-like optimiser (long-ish)

#8770 — Re: Help with RAFTS-like optimiser (long-ish)

#8908 — Re: Help with RAFTS-like optimiser (long-ish)

#7714

#7724

#7766