Path: csiph.com!newsfeed.hal-mli.net!feeder3.hal-mli.net!newsfeed.hal-mli.net!feeder1.hal-mli.net!news.misty.com!news.iecc.com!.POSTED!nerds-end From: BGB Newsgroups: comp.compilers Subject: Re: Compiling Lexical Closures to C Date: Thu, 27 Sep 2012 12:21:01 -0500 Organization: albasani.net Lines: 197 Sender: johnl@iecc.com Approved: comp.compilers@iecc.com Message-ID: <12-09-021@comp.compilers> References: <12-09-020@comp.compilers> NNTP-Posting-Host: news.iecc.com Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Trace: leila.iecc.com 1348977375 2369 64.57.183.58 (30 Sep 2012 03:56:15 GMT) X-Complaints-To: abuse@iecc.com NNTP-Posting-Date: Sun, 30 Sep 2012 03:56:15 +0000 (UTC) Keywords: code, design Posted-Date: 29 Sep 2012 23:56:15 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: csiph.com comp.compilers:757 On 9/25/2012 6:57 PM, grom358@gmail.com wrote: > Working on a basic compiler for a language that supports nested functions and > full closures. > > I have read a few articles like > http://matt.might.net/articles/compiling-scheme-to-c/ where they using a > structure for the closure, this strategy actually works fairly well... skim article: I don't like tagged unions. it doesn't take long to realize they are a rather poor way to deal with this. personally, I am left with more of a mash-up between doing some magic with the pointers in the MM/GC (the MM/GC keeps track of object types, and uses special address-ranges for some other types), and also using a "signature string" system vaguely more related to the JVM (partly combined with the IA-64 C++ ABI name-mangling as well, and some "custom"-ness. the basic notation and 'concept' resembles that of the JVM signature strings though). normally (but it does happen), the signatures are not directly used in the main execution path, but usually more for "static" types. so: var x; ( typically dynamically typed (when not type-inferred), uses a raw pointer internally, with some types encoded via address ranges ). var x:int; (may use a raw machine integer, may encode type via an "i" signature string). > var createLines = function() { > var m = 2; > var c = 3; > return { > 'line': function(x) { return m * x + c; }, > 'lineSquare': function(x) { return x * x + c; } > }; > }; I recognize this language... or at least the basic syntax. (my own language derives from a similar origin, so has a similar base-syntax as well...). > Now how to pass the free variables? mark which variables are captured, then there can also be case for a (potentially more efficient) closure which doesn't capture anything (basically, a raw code-block). this way, captured variables go in the structure, but non-captured variables can remain in locals. typically, all functions will see the same set of captured variables, but this isn't usually a big issue. > Additionally how to deal with deeper nested functions, for example: usual strategy is to daisy-chain. this tends to be fairly straightforward and can get reasonable performance. the additional indirections needed for deep nesting don't really add much to the cost, if anything, because deep nesting tends to be fairly uncommon anyways. > What other techniques/methods are there for compiling lexical closures > into C? don't have much significant. but, if you are clever, closures can be made to look like normal C function pointers. the downside is that there is no good way to handle this generally (and portably) in plain C, so this is one of those cases that requires assembler to work well. a simple concept for a plain C implementation though is that a person can have an array of "reserve" functions (to be pointers), which call the closure via loading it from a global array. say: MyVM_Closure *closure_array[1024]; int funptr0_i_i(int x) { return closure_array[0]->func(closure_array[0], x); } int funptr1_i_i(int x) { return closure_array[1]->func(closure_array[1], x); } ... int (*funptr_i_i_array[256])(int x)={ funptr0_i_i, funptr1_i_i, funptr2_i_i, ... }; this can work, but has the downside that such a plain C implementation can waste lots of memory. if a person has ASM/machine-code and write/execute memory, a better solution may be to make an executable structure, say: struct MyVM_FunPtrWrap_s { byte jump[64]; MyVM_Closure *close; }; with 'jump' potentially containing essentially a "trampoline" of sorts, say (32-bit x86 using cdecl): call L0 ;get structure address L0: ;label pop eax ;pop address sub eax, 5 ;adjust for call (size of 'call L0') ;non-GCC case: pop ecx ;return EIP push esp ;get pointer to arguments push eax ;structure address push ecx ;return EIP jmp MyVM_ClosureMagicT ;wrap over call ;GCC case (try to keep stack aligned): lea edx, [esp+4] ;try to preserve stack alignment push edx ;pad 1 push edx ;arguments list push eax ;structure address call MyVM_ClosureMagicT ;closure magic add esp, 16 ret the downside of such a strategy is portability: the logic may have to be specialized for every target CPU and ABI (the x86 case is simple, but for SysV/AMD64 may require a bit more nasty glob of code here, as it is essentially necessary to save off all of the registers which may be used to pass arguments). 'ClosureMagicT' above would also be a bit machine specific, and mostly just serve to work the arguments into the form the closure expects and pass control into the closure ('T' can be a stand-in for the return type, note: the logic will be different for some return types, ...). the advantage though is that there is no strict limit to the number of closures which can be created this way, and it doesn't waste a lot of memory with piles of "reserves". getting slightly more clever, the above step can be skipped, instead using logic to transfer directly to the wrapped closure body (this may involve general-case logic to spit out machine code specific to each argument list, making it potentially non-trivial as a general-purpose solution, as well as considerably more complicated). side note: personally, my current language, even though it usually runs in a threaded-code interpreter, tries to at least externally mimic using the C ABI, which makes cross-language interfacing a good deal easier (internally the ABI is a bit different, and technically internal to the interpreter). in my case, being able to easily share code and data between C-land and script code is a major use-case for my language. I try for the goal of making it more like the C <-> C++ interface, namely, it exists but for the most part it isn't too terrible (in contrast to the Java <-> C interface, which has the general awfulness that is JNI and JNA, where cross-language interfacing tends to require a small mountain of boilerplate...). currently, my interpreter is "mostly plain C", but still has a few optional ASM parts/augments (the mostly-plain-C route is mostly so that I am not entirely tied to x86...). I had considered potentially going and rewriting a big chunk of the threaded-code interpreter core into ASM to potentially allow it to be faster on certain targets (namely x86), but never got around to it (more important stuff exists at the moment, and it is harder to justify the effort and added maintenance hassle for this...). this leaves most of the ASM magic mostly devoted to handling calls into and out-of C land, and ironically, not so much about making the core interpreter fast... yeah, I have had JITs before, but most have not been maintained well thus-far.