From: BGB
Newsgroups: comp.compilers
Subject: Re: Thoughts on the JVM as a compilation Target?
Date: Fri, 25 May 2012 14:26:11 -0500
Organization: albasani.net
Message-ID: <12-05-024@comp.compilers>
References: <12-05-015@comp.compilers> <12-05-016@comp.compilers>
Keywords: Java

On 5/24/2012 3:30 AM, Jeremy Wright wrote:
> Micro Focus has a COBOL compiler that compiles direct to JVM byte code.
>
> Last year's The Server Side Java Symposium had a panel discussion "Who Invited
> All These Other Languages to My JVM?" but I can't find a write up anywhere.

yeah.

in general though, single-language VMs may be slowly coming to an end (along with hopefully the "one language to rule them all" developer mindset, but alas, for humans this one may be an intrinsic feature).

> To answer the specific question, the lack of "unsigned" can be inconvenient.

among other things.

(others may disagree on things here, I don't make any claim to be an expert on the JVM here).

maybe trying for a small list here:
  lack of unsigned operations;
  lack of pointers;
  lack of function or method references;
  lack of variable references (need not be pointers);
  lack of a lexical environment;
  lack of good dynamic types;
  lack of package-scoped declarations;
  lack of operators over object types;
  lack of pass-by-value / structure types;
  lack of RAII or similar;
  lack of a good C FFI.

lack of unsigned operations:
wouldn't need its own types, but a few additional integer operations would be sufficient ("idivu", "imodu", "ldivu", "lmodu", ...).

lack of pointers:
this is more complex, but pointers are not outside what can likely be handled by a verifier. if the pointer does something which can't be validated or is clearly wrong, then the code is rejected. potentially, some other cases could be handled using run-time bounds-checks (similar to arrays), say the pointer is not allowed to "leave" the object it points into, ...

the verifier could also disallow using pointers to spoof references (not allowing any writes of a non-reference value into a reference).

lack of function or method references:
this is sort of done by the newer "invokedynamic" mechanism (JDK 1.7), although IIRC these handles are kept internal, and are still not directly usable as first-class values.

lack of variable references:
obvious enough, there are cases where things like this could be useful.

lack of a lexical environment:
this one is a bit of an issue IMO, as there is no good way I know of to do lexical scoping in the JVM without it being rather costly.

it can be faked though, and one at least passable mechanism could be to use arrays (any shared/captured variables are held in these arrays). objects are also possible (and could be higher performance), but then the compiler is likely to spit out a large number of inner classes in this case.
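
as a rough illustration of the array trick described above, here is a minimal sketch written in Java source rather than bytecode. the names ("CapturedCell", "IntThunk", "makeCounter") are made up for the example, and a real compiler targeting the JVM would emit the equivalent classes directly; this only shows the shape of the idea, where the captured variable lives in a one-element array so that two closures can share and mutate it.

```java
public class CapturedCell {
    // hypothetical closure interface for the example
    public interface IntThunk { int call(); }

    // returns {increment, read}; both closures share one mutable binding
    public static IntThunk[] makeCounter() {
        // the "cell": a one-element array standing in for the captured variable.
        // the array reference is final, but its contents can still be changed.
        final int[] count = {0};

        IntThunk increment = new IntThunk() {
            public int call() { return ++count[0]; }
        };
        IntThunk read = new IntThunk() {
            public int call() { return count[0]; }
        };
        return new IntThunk[] { increment, read };
    }

    public static void main(String[] args) {
        IntThunk[] c = makeCounter();
        c[0].call();
        c[0].call();
        System.out.println(c[1].call()); // prints 2
    }
}
```

note that each closure here still becomes its own inner class, and every access to the shared variable goes through an array load or store, which is roughly the cost being complained about.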

note that although Java has recently added lambdas, they are implemented via a trick, namely classes with "SAM" ("Single Abstract Method") interfaces, and all captured bindings are "implicitly final" (the values are captured and stored into object fields).

another use case of lexical environments though is to implement a method which may have parts of its body folded off into other code-blocks (useful for implementing things like "ifdef" style mechanisms or similar, which may need to share bindings but don't actually need to "capture" them).

lack of good dynamic types:
although most dynamically-typed operations can be done via "Object", it is not particularly efficient. lack of "fixnum" and "flonum" types is a notable issue here, as allocating large numbers of "Integer" or "Double" objects on the heap can easily become horridly expensive.

apparently Oracle is working on this (experimental JVMs have added fixnum support).

lack of package-scoped declarations:
this one does have some reason, in that it would likely require a partial redesign of the class file-format and loader to make it work well (making the "class" files "contain" classes, rather than "be" classes).

one example here would be if most of the currently fixed structures were relocated into the constant pool (with multiple classes in a class file, and possibly with special "package" classes for holding package-scoped declarations).

a cheap trick here is, of course, to create a hidden "default" class or similar for the package, which basically contains any package-scoped declarations, but this is kind of ugly.

lack of operators over object types:
this one is a bit of a hassle IMO, although it can be argued that, say:
  aadd "Lfoo/Bar;Lfoo/Baz;"
offers little over something like, say:
  invokestatic "foo/Bar/operator_add(Lfoo/Bar;Lfoo/Baz;)Lfoo/Bar;"
but still...

lack of pass-by-value / structure types:
though to be fair, this one is apparently on Oracle's "planned features" list.
as-is, this is a bit of an annoyance when wanting to extend the basic numeric tower without taking a performance hit.

lack of RAII or similar:
enough said. besides just reclaiming the memory of an object, it may be useful to also be able to handle logic for when it goes out of scope.

this can be faked though, such as via generating implicit "finally" blocks, and generating explicit destructor calls for when the object leaves scope, but this is lame and brittle IMO.

not to mention that it would be nice if the JVM had some sort of explicit "delete" operation (basically so the compiler can give a hint to the VM that it no longer needs the object).

lack of a good C FFI:
apparently also on Oracle's to-do list. admittedly, JNI and JNA are so bad I was almost left to wonder if Sun originally did it intentionally.

I have little idea what exactly Oracle's plan is here.

...

also, although arguably the JVM is fairly high-performance by VM standards, it still tends to run a bit slower than what is possible via code written in C, which in turn is slower than what is possible using ASM.

usually, optimized Java code can compete with naive C code, and optimized C code can compete with naive ASM (and well optimized ASM beats pretty much anything).

but, it isn't really a major difference. the main reason is not so much elaborate optimizations, but more often what doesn't need to be optimized away in the first place.

note that an app written directly in ASM may actually easily be slower than one written in C or Java, partly due to the limitations of a human programmer. this is less true of C vs Java, as the actual difference in terms of "level of abstraction" is likely much smaller than it is often made out to be.

an advantage though that the JVM has is that, yes, it is more portable than targeting ASM.

also, it may make more sense if one is operating more within the "Java landscape", rather than starting from the "C and C++ landscape".
also, although C is not widely binary portable, code portability usually works "well enough".

I personally like .NET a little more than the JVM, but it introduced some of its own problems and limitations, and presently it is not as widely or effectively implemented as the JVM.

there is also the AVM2 (AKA: Adobe Flash), which despite being an interesting piece of technology, seems to have a less certain future.

as I see it, there is no ideal solution at present.

so, pick something and run with it I guess.

personally, I just stay within native-land, as this better matches what I am doing.