Path: csiph.com!weretis.net!feeder6.news.weretis.net!feeder.usenetexpress.com!feeder-in1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end From: Kaz Kylheku <157-073-9834@kylheku.com> Newsgroups: comp.compilers Subject: Re: Add nested-function support in a language the based on a stack-machine Date: Sun, 11 Mar 2018 03:19:57 +0000 (UTC) Organization: Aioe.org NNTP Server Lines: 120 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <18-03-037@comp.compilers> References: <18-02-009@comp.compilers> <18-03-002@comp.compilers> <18-03-012@comp.compilers> <18-03-020@comp.compilers> <18-03-024@comp.compilers> <18-03-028@comp.compilers> Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="20540"; mail-complaints-to="abuse@iecc.com" Keywords: code Posted-Date: 11 Mar 2018 05:43:17 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:1987 On 2018-03-06, Kaz Kylheku <217-679-0842@kylheku.com> wrote: > On 2018-03-06, John Levine wrote: >> The other is the same array, but with a copy of the current display in >> each routine's stack frame. This is slower at call time but probably >> faster to use on machines like S/360 without direct addressing >> since the display on the stack is addressable from the frame pointer, >> but static data needs to load a pointer from a constant pool. A >> longjmp doesn't need anything special since stacked displays go away >> when the stack frames go away. > > This is precisely what I happen to be working on at the moment: a > virtual machine for a Lisp dialect in which I plan to have displays > copied into the local stack frame. I expect it to be a performance > advantage that there is nothing special to do when the stack frame goes > away. And ... bam, here is a working demo in less than a week of hacking! Funny we should have had this whole debate around this time. I've been massaging the display-based VM design in my head for some months. I write some design docs regarding the instruction set and fired up Dia to make some bubble and arrow diagrams. In the end, it is turning out to be far simpler than I made it out to be. I have a working implementation of a VM with 32 instructions so far, and an assembler/disassembler for all of them, with backpatch label support. There are some pseudo-ops: a single mov instruction for instance generates three different real instructions based on how the operands fit. Without any more lengthy introduction, here is vm-clos.tl. It implements the simple counting closure in the VM assembly language (there is no compiler yet): (load "asm") (in-package :sys) (let ((a (new assembler)) (d #(succ format t "inner closure, counter: ~s\n"))) a.(asm '((close t1 16 :l1 1 1 nil v0000) (close t1 0 :l1 0 0 nil) (call t1 d1 d2 d3 v0000) ;; (format t ...) (mov t1 v0000) ;; t1 <- v-0-0 (call v0000 d0 v0000) ;; v-0-0 <- (succ v-0-0) :l1 (end t1))) ;; (return t1) a.(dis-listing) (let* ((vd (vm-make-desc 4 32 a.buf d)) ;; a.buf is code (mk-counter (vm-interpret-toplevel vd)) (counter [mk-counter 10])) (prinl [counter]) (prinl [counter]) (prinl [counter]))) Run: $ ./txr vm-clos.tl 0: 7C00000D close t01 16 13 1 1 nil v0000 1: 00100001 2: 00010001 3: 00000200 4: 7C00000D close t01 0 13 0 0 nil 5: 00000001 6: 00000000 7: 14030001 call t01 d01 d02 d03 v0000 8: 01020101 9: 02000103 10: 20010200 movsr t01 v0000 11: 14010200 call v0000 d00 v0000 12: 02000100 13: 0C000001 end t01 inner closure, counter: 10 10 inner closure, counter: 11 11 inner closure, counter: 12 12 The high level overview is that we have an assembly job which produces some code in a.buf. We use this to make a VM description: we tell the VM description that it has 4 levels of display nesting (we only use 3), and that it should allocate 32 registers. We give it the code from a.buf and the data vector d. The vm-interpret-toplevel instantiates the VM state and runs this description. No arguments can be passed in the top-level entry. But it produces a return value. That return value is the outer closure: a one-arg function which instantiates a counter that counts from that argument. When we call the closure, it instantiates a vm state, but with an entry into the indicated closure code, and with argument passing. The close instruction walks the display (on the stack) and clones it into a heaped object. Any levels that are moved from stack to heap are also installed in the original display, since they are shared. There is a way for the machine code to create levels that are not subject to capture. Everything is in the display. Each level has up to 256 entries. Level 0 is the registers t0 through tFF. t0 is always NIL, inspired by MIPS. Level 1 is the data vector, d0 through dFF. The levels after that are vXXYY. For instance v0312 is the fifth level, register 18 (12 hex). I might re-adjust this: say, have only up to 64 levels that are up to 1024 cells wide. I not only have closures working, but unwind-protect and exception handling, and the handling of the dynamic environment for binding special variables. Basically, it appears to be reasonably complete; I can't think of anything missing from being able to support the semantics of the Lisp dialect. Maybe some unexpected strange interactions with the delimited continuation support will crop up. Ah, debugging support. Ha! :)