| Started by | "Aaron W. Hsu" <arcfide@sacrideo.us> |
|---|---|
| First post | 2012-05-22 13:33 -0400 |
| Last post | 2012-05-24 13:06 -0500 |
| Articles | 9 — 6 participants |
Thoughts on the JVM as a compilation Target? "Aaron W. Hsu" <arcfide@sacrideo.us> - 2012-05-22 13:33 -0400
Re: Thoughts on the JVM as a compilation Target? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-05-24 04:21 +0000
Re: Thoughts on the JVM as a compilation Target? Jeremy Wright <jeremy.wright@microfocus.com> - 2012-05-24 08:30 +0000
Re: Thoughts on the JVM as a compilation Target? BGB <cr88192@hotmail.com> - 2012-05-25 14:26 -0500
Re: Thoughts on the JVM as a compilation Target? torbenm@diku.dk (Torben Ægidius Mogensen) - 2012-05-29 17:40 +0200
Re: Thoughts on the JVM as a compilation Target? "Aaron W. Hsu" <arcfide@sacrideo.us> - 2012-05-24 10:05 -0400
Re: Thoughts on the JVM as a compilation Target? torbenm@diku.dk (Torben Ægidius Mogensen) - 2012-05-24 10:34 +0200
Re: Thoughts on the JVM as a compilation Target? "lpsantil@gmail.com" <lpsantil@gmail.com> - 2012-05-24 10:39 -0700
Re: Thoughts on the JVM as a compilation Target? BGB <cr88192@hotmail.com> - 2012-05-24 13:06 -0500
| From | "Aaron W. Hsu" <arcfide@sacrideo.us> |
|---|---|
| Date | 2012-05-22 13:33 -0400 |
| Subject | Thoughts on the JVM as a compilation Target? |
| Message-ID | <12-05-013@comp.compilers> |
Hey folks:

What are your thoughts on JVM as a compilation target, especially with new languages all targeting high performance or multi-core type features?

--
Aaron W. Hsu | arcfide@sacrideo.us | http://www.sacrideo.us
Programming is just another word for the lost art of thinking.
| From | glen herrmannsfeldt <gah@ugcs.caltech.edu> |
|---|---|
| Date | 2012-05-24 04:21 +0000 |
| Message-ID | <12-05-015@comp.compilers> |
| In reply to | #642 |
Aaron W. Hsu <arcfide@sacrideo.us> wrote:

> What are your thoughts on JVM as a compilation target,
> especially with new languages all targeting high
> performance or multi-core type features?

Some years ago I was wondering about JVM as a target for a C compiler. It is a little tricky, as there are some things that people expect even though the C standard doesn't require them.

I once knew about a COBOL compiler for JVM, though I don't know any more than that.

What language(s) were you thinking about?

-- glen
| From | Jeremy Wright <jeremy.wright@microfocus.com> |
|---|---|
| Date | 2012-05-24 08:30 +0000 |
| Message-ID | <12-05-016@comp.compilers> |
| In reply to | #644 |
Micro Focus has a COBOL compiler that compiles direct to JVM byte code.

Last year's The Server Side Java Symposium had a panel discussion "Who Invited All These Other Languages to My JVM?" but I can't find a write up anywhere.

To answer the specific question, the lack of "unsigned" can be inconvenient.

Jeremy
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2012-05-25 14:26 -0500 |
| Message-ID | <12-05-024@comp.compilers> |
| In reply to | #645 |
On 5/24/2012 3:30 AM, Jeremy Wright wrote:
> Micro Focus has a COBOL compiler that compiles direct to JVM byte code.
>
> Last year's The Server Side Java Symposium had a panel discussion "Who Invited
> All These Other Languages to My JVM?" but I can't find a write up anywhere.
yeah.
in general though, single-language VMs may be slowly coming to an end
(along with hopefully the "one language to rule them all" developer
mindset, but alas, for humans this one may be an intrinsic feature).
> To answer the specific question, the lack of "unsigned" can be inconvenient.
among other things.
(others may disagree on things here, I don't make any claim to be an
expert on the JVM here).
maybe trying for a small list here:
lack of unsigned operations;
lack of pointers;
lack of function or method references;
lack of variable references (need not be pointers);
lack of a lexical environment;
lack of good dynamic types;
lack of package-scoped declarations;
lack of operators over object types;
lack of pass-by-value / structure types;
lack of RAII or similar;
lack of a good C FFI.
lack of unsigned operations:
wouldn't need its own types, but a few additional integer operations
would be sufficient ("idivu", "imodu", "ldivu", "lmodu", ...).
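
A minimal sketch of how a compiler targeting the JVM might emulate the missing unsigned 32-bit divide and remainder using only signed operations. The class and method names (`UnsignedOps`, `divu`, `modu`) are hypothetical and only echo the instruction names suggested above; the approach (widen to `long`, mask off the sign extension) is one common workaround, not something the post prescribes.

```java
// Sketch: unsigned 32-bit division and remainder built from signed JVM
// arithmetic. All names here are hypothetical, chosen for illustration.
final class UnsignedOps {
    // Widen both operands to long, clearing the sign-extended upper bits,
    // so that signed 64-bit division gives the unsigned 32-bit quotient.
    static int divu(int a, int b) {
        return (int) ((a & 0xFFFFFFFFL) / (b & 0xFFFFFFFFL));
    }

    // Same widening trick for the unsigned remainder.
    static int modu(int a, int b) {
        return (int) ((a & 0xFFFFFFFFL) % (b & 0xFFFFFFFFL));
    }

    public static void main(String[] args) {
        int a = -2; // bit pattern 0xFFFFFFFE, i.e. 4294967294 when read as unsigned
        System.out.println(divu(a, 3)); // unsigned quotient: 1431655764
        System.out.println(a / 3);      // signed quotient: 0
    }
}
```

Later Java releases (8 and up) expose the same operations as library methods such as `Integer.divideUnsigned` and `Integer.remainderUnsigned`, though still without dedicated bytecodes.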
lack of pointers:
this is more complex, but pointers are not outside what can likely be
handled by a verifier. if the pointer does something which can't be
validated or is clearly wrong, then the code is rejected. potentially,
some other cases could be handled using run-time bounds-checks (similar
to arrays), say the pointer is not allowed to "leave" the object it
points into, ... the verifier could also disallow using pointers to
spoof references (not allowing any writes of a non-reference value into
a reference).
lack of function or method references:
this is sort of done by the newer "invokedynamic" mechanism (JDK 1.7),
although IIRC these handles are kept internal, and are still not
directly usable as first-class values.
lack of variable references:
obvious enough, there are cases where things like this could be useful.
lack of a lexical environment:
this one is a bit of an issue IMO, as there is no good way I know of to
do lexical scoping in the JVM without it being rather costly. it can be
faked though, and one at least passable mechanism could be to use arrays
(any shared/captured variables are held in these arrays). objects are
also possible (and could be higher performance), but then the compiler
is likely to spit out a large number of inner classes in this case.
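
The array trick described above can be sketched roughly as follows, assuming a source-level closure that mutates a captured local. The names `CapturedCounter` and `makeCounter` are hypothetical, and the sketch uses Java 8's `java.util.function.IntSupplier` for brevity; a compiler of that era would have emitted an inner class instead.

```java
import java.util.function.IntSupplier;

// Sketch: lowering a mutable captured variable into a one-element array
// that is shared by the enclosing method and the closure.
final class CapturedCounter {
    // Source-level idea (hypothetical syntax): var n = 0; return () -> ++n;
    static IntSupplier makeCounter() {
        final int[] n = {0};   // the "environment": one array slot for the captured variable
        return () -> ++n[0];   // the closure mutates the shared slot, not a copy
    }

    public static void main(String[] args) {
        IntSupplier counter = makeCounter();
        System.out.println(counter.getAsInt()); // 1
        System.out.println(counter.getAsInt()); // 2
    }
}
```

The array reference itself is final, which satisfies the capture rules, while the slot stays mutable; the cost is an extra heap allocation and an indirection on every access.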
note that although Java has recently added lambdas, they are implemented
via a trick, namely classes with "SAM" ("Single Abstract Method")
interfaces, and all captured bindings are "implicitly final" (the values
are captured and stored into object fields).
another use case of lexical environments though is to implement a method
which may have parts of its body folded off into other code-blocks
(useful for implementing things like "ifdef" style mechanisms or
similar, which may need to share bindings but don't actually need to
"capture" them).
lack of good dynamic types:
although most dynamically-typed operations can be done via "Object", it
is not particularly efficient.
lack of "fixnum" and "flonum" types is a notable issue here, as
allocating large numbers of "Integer" or "Double" objects on the heap
can easily become horridly expensive.
apparently Oracle is working on this (experimental JVMs have added
fixnum support).
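
To make the cost concrete, here is a small sketch of a dynamically typed "add" as a compiler might lower it onto `Object`. The names `DynAdd` and `add` are hypothetical, and a real implementation would cover more numeric types; the point is only that every operation involves type tests, unboxing, and re-boxing.

```java
// Sketch: a dynamically typed addition expressed over Object.
final class DynAdd {
    static Object add(Object a, Object b) {
        if (a instanceof Integer && b instanceof Integer) {
            // Unbox both operands, add as int, then box the result again.
            // Boxing may allocate a fresh Integer on the heap.
            return (Integer) a + (Integer) b;
        }
        if (a instanceof Number && b instanceof Number) {
            // Mixed numeric types: fall back to double arithmetic, boxed as Double.
            return ((Number) a).doubleValue() + ((Number) b).doubleValue();
        }
        throw new IllegalArgumentException("unsupported operand types");
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));   // 5
        System.out.println(add(1, 2.5)); // 3.5
    }
}
```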
lack of package-scoped declarations:
this one does have some reason, in that it would likely require a
partial redesign of the class file-format and loader to make it work
well (making the "class" files "contain" classes, rather than "be"
classes). one example here would be if most of the currently fixed
structures were relocated into the constant pool (with multiple classes
in a class file, and possibly with special "package" classes for holding
package-scoped declarations).
a cheap trick here is, of course, to create a hidden "default" class or
similar for the package, which basically contains any package-scoped
declarations, but this is kind of ugly.
lack of operators over object types:
this one is a bit of a hassle IMO, although it can be argued that, say:
aadd "Lfoo/Bar;Lfoo/Baz;"
offers little over something like, say:
invokestatic "foo/Bar/operator_add(Lfoo/Bar;Lfoo/Baz;)Lfoo/Bar;"
but still...
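
For comparison, the `invokestatic` form above corresponds at the Java source level to something like the following sketch. The class `Vec2` is hypothetical and stands in for the post's `foo/Bar`; only the `operator_add` naming convention is taken from the example descriptor.

```java
// Sketch: an overloaded "+" lowered to a static method call.
final class Vec2 {
    final double x, y;

    Vec2(double x, double y) {
        this.x = x;
        this.y = y;
    }

    // A compiler for a language with operator overloading could translate
    // the source expression `a + b` into a call to this method.
    static Vec2 operator_add(Vec2 a, Vec2 b) {
        return new Vec2(a.x + b.x, a.y + b.y);
    }

    public static void main(String[] args) {
        Vec2 sum = operator_add(new Vec2(1, 2), new Vec2(3, 4));
        System.out.println(sum.x + ", " + sum.y); // 4.0, 6.0
    }
}
```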
lack of pass-by-value / structure types:
though to be fair, this one is apparently on Oracle's "planned features"
list.
as-is, this is a bit of an annoyance when wanting to extend the basic
numeric tower without taking a performance hit.
lack of RAII or similar:
enough said, besides just reclaiming the memory of an object, it may be
useful to also be able to handle logic for when it goes out of scope.
this can be faked though, such as via generating implicit "finally"
blocks, and generating explicit destructor calls for when the object
leaves scope, but this is lame and brittle IMO.
never mind if the JVM had some sort of explicit "delete" operation
(basically so the compiler can give a hint to the VM that it no longer
needs the object).
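
The "implicit finally block" workaround mentioned above might look roughly like this once lowered to Java. The names `ScopeExit`, `Resource`, and `destroy` are hypothetical, and the `log` list exists only so the order of events can be observed.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: faking scope-exit destructors with a compiler-generated finally block.
final class ScopeExit {
    static final List<String> log = new ArrayList<>();

    static final class Resource {
        final String name;

        Resource(String name) {
            this.name = name;
            log.add("open " + name);
        }

        // Stands in for the source language's destructor.
        void destroy() {
            log.add("close " + name);
        }
    }

    // Source-level idea (hypothetical syntax): { Resource r("a"); use(r); }
    static void scoped(boolean fail) {
        Resource r = new Resource("a");
        try {
            log.add("use " + r.name);
            if (fail) throw new IllegalStateException("boom");
        } finally {
            r.destroy(); // inserted by the compiler; runs on normal and exceptional exit
        }
    }
}
```

The brittleness noted in the post shows up as soon as the object escapes the scope (stored in a field, returned, captured), since the generated call would then destroy something that is still in use.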
lack of a good C FFI:
apparently also on Oracle's to-do list.
admittedly, JNI and JNA are so bad I was almost left to wonder if Sun
originally did it intentionally. I have little idea what exactly
Oracle's plan here is.
...
also, although arguably the JVM is fairly high-performance by VM
standards, it still tends to run a bit slower than what is possible via
code written in C, which in turn is slower than what is possible using ASM.
usually, optimized Java code can compete with naive C code, and
optimized C code can compete with naive ASM (and well optimized ASM
beats pretty much anything). but, it isn't really a major difference.
the main reason is not so much due to elaborate optimizations, but more
often what doesn't need to be optimized away in the first place.
note that an app written directly in ASM may actually easily be slower
than one written in C or Java, partly due to the limitations of a human
programmer. this is less true of C vs Java, as the actual difference in
terms of "level of abstraction" is likely much smaller than it is often
made out to be.
an advantage though that the JVM has is that, yes, it is more portable
than targeting ASM. also, it may make more sense if one is operating
more within the "Java landscape", rather than starting from the "C and
C++ landscape".
also, although C is not widely binary portable, code portability usually
works "well enough".
I personally like .NET a little more than the JVM, but it introduced
some of its own problems and limitations, and presently it is not as
widely or effectively implemented as the JVM.
there is also the AVM2 (AKA: Adobe Flash), which despite being an
interesting piece of technology, seems to have a less certain future.
as I see it, there is no ideal solution at present.
so, pick something and run with it I guess.
personally, I just stay within native-land, as this better
matches what I am doing.
| From | torbenm@diku.dk (Torben Ægidius Mogensen) |
|---|---|
| Date | 2012-05-29 17:40 +0200 |
| Message-ID | <12-05-027@comp.compilers> |
| In reply to | #653 |
BGB <cr88192@hotmail.com> writes:

> maybe trying for a small list here:
> lack of unsigned operations;
> lack of pointers;
> lack of function or method references;
> lack of variable references (need not be pointers);
> lack of a lexical environment;
> lack of good dynamic types;
> lack of package-scoped declarations;
> lack of operators over object types;
> lack of pass-by-value / structure types;
> lack of RAII or similar;
> lack of a good C FFI.

A few more:

- Lack of proper tail calls. This can be implemented using trampolines, but this is inefficient and kludgy. Scala (AFAIR) implements only tail recursion and not proper tail calls because of this limitation.

- Lack of non-nullable reference types. Object/reference types are implicitly always nullable, though it is simple to statically verify non-nullable types. The consequence is that the VM always has to check for null pointers when following references, even when the reference can never be to null.

- Lack of structural type equivalence. This means that you have to resort to kludges when implementing pair types and similar structures, and it gives problems when implementing (type-safe) polymorphic pair types. Which leads to

- Lack of parametric polymorphism. Generics are currently implemented with type erasure to Object and (runtime-checked) downcasts, which is inefficient. Statically verified parametric polymorphism could avoid this.

- Lack of an unbounded integer type. Though this can be implemented using JVM primitives, it is slow and kludgy to do so. Many languages (Scheme, Haskell, ...) have unbounded integers.

- Inefficient exception handling. This was a real limitation for some students that tried to implement a subset of Prolog in JVM. Exceptions were the natural way to implement cut (!), but it was just way too slow.

This is what I could think of off the top of my head, but I'm sure more would come up if I tried to use JVM as target for a realistically-sized language (as opposed to toy languages).
Torben
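
A minimal sketch of the trampoline technique mentioned in the first bullet above, assuming two mutually tail-recursive functions. The names `Trampoline`, `Step`, `done`, `call`, and `run` are hypothetical; this is one common shape for the idea, not the scheme any particular compiler uses.

```java
import java.util.function.Supplier;

// Sketch: emulating proper tail calls with a trampoline.
final class Trampoline {
    // A step is either a finished value (next == null) or a thunk
    // that produces the next step when invoked.
    static final class Step<T> {
        final T value;
        final Supplier<Step<T>> next;

        private Step(T value, Supplier<Step<T>> next) {
            this.value = value;
            this.next = next;
        }

        static <T> Step<T> done(T value) {
            return new Step<>(value, null);
        }

        static <T> Step<T> call(Supplier<Step<T>> thunk) {
            return new Step<>(null, thunk);
        }
    }

    // The driver loop: each "tail call" returns to here instead of
    // growing the Java call stack.
    static <T> T run(Step<T> step) {
        while (step.next != null) {
            step = step.next.get();
        }
        return step.value;
    }

    // Mutually tail-recursive functions, written in trampolined style.
    static Step<Boolean> isEven(int n) {
        if (n == 0) return Step.done(true);
        return Step.call(() -> isOdd(n - 1));
    }

    static Step<Boolean> isOdd(int n) {
        if (n == 0) return Step.done(false);
        return Step.call(() -> isEven(n - 1));
    }
}
```

Calling `Trampoline.run(Trampoline.isEven(1000000))` completes in constant stack depth, but every call allocates a `Step` and a closure, which is the inefficiency the post refers to.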
| From | "Aaron W. Hsu" <arcfide@sacrideo.us> |
|---|---|
| Date | 2012-05-24 10:05 -0400 |
| Message-ID | <12-05-018@comp.compilers> |
| In reply to | #644 |
glen herrmannsfeldt wrote:

> What language(s) were you thinking about?

It's a new research language for studying optimizations in parallel programming.

--
Aaron W. Hsu | arcfide@sacrideo.us | http://www.sacrideo.us
Programming is just another word for the lost art of thinking.
| From | torbenm@diku.dk (Torben Ægidius Mogensen) |
|---|---|
| Date | 2012-05-24 10:34 +0200 |
| Message-ID | <12-05-017@comp.compilers> |
| In reply to | #642 |
"Aaron W. Hsu" <arcfide@sacrideo.us> writes:

> What are your thoughts on JVM as a compilation target, especially with new
> languages all targeting high performance or multi-core type features?

JVM is O.K. as a target if you want to compile an object-oriented statically-typed language, since this is what the VM assumes. For other languages, you will have to jump through hoops to express the semantics of your language in terms of the JVM primitives. Since JVM is Turing-complete, it is, of course, possible to do so. But it won't be pretty and likely not very efficient.

JVM, however, has the huge advantage of being implemented on a lot of different platforms, and when new platforms appear, it is likely that these will soon support JVM, so you don't have to worry about porting your language to new platforms. There is also a large standard library that you can use, but unless your language is an object-oriented statically-typed language, these libraries may be hard to use effectively from programs written in your language.

Torben
| From | "lpsantil@gmail.com" <lpsantil@gmail.com> |
|---|---|
| Date | 2012-05-24 10:39 -0700 |
| Message-ID | <12-05-019@comp.compilers> |
| In reply to | #642 |
> What are your thoughts on JVM as a compilation target, especially with new
> languages all targeting high performance or multi-core type features?

NestedVM (http://nestedvm.ibex.org/) has some tech to target GCC to JVM by compiling code to MIPS ISA, implementing a MIPS VM, and a MIPS libc/system call to JNI interface.

Some wild concepts that actually work well (http://www.zentus.com/sqlitejdbc/) and are starting to be used by others like emscripten (https://github.com/kripken/emscripten , https://github.com/kripken , https://github.com/kripken/emscripten/wiki), pdf.js (https://github.com/mozilla/pdf.js)

-L
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2012-05-24 13:06 -0500 |
| Message-ID | <12-05-020@comp.compilers> |
| In reply to | #642 |
On 5/22/2012 12:33 PM, Aaron W. Hsu wrote:
> Hey folks:
>
> What are your thoughts on JVM as a compilation target, especially with new
> languages all targeting high performance or multi-core type features?

it likely depends on the language.

Java-like languages can be targeted to the JVM fairly well (especially if they have fewer features than Java, or are basically just "Java with certain syntactic differences" or similar).

some high-level scripting languages can passably be compiled to the JVM, and some newer VM features (such as "invokedynamic") allow making this a little more efficient (more on-par performance-wise with natively compiled versions of the VMs).

for much else, the JVM's architecture may be seriously deficient in some areas, and trying to target code to it is likely to be a major pain (vs, say, targeting C, or targeting x86 or x86-64 ASM). (consider, for example, the pain of trying to compile something like C or C++ to the JVM).

this is partly due to the level of abstraction: in some ways, it is too high-level, offering only a narrowly defined set of abstractions, which have to be crufted around to build new features (features need to be built using classes, interfaces, and method calls). in other ways, it is too low-level, offering no way to work around arbitrary limitations in the type-system or scoping-model apart from falling back to high-level mechanisms (such as method calls or reflection, ...).

some other cases are due to arbitrary limitations, many related to design choices made in the Java language (such as the lack of package-scoped functions or variables, lack of either first-class methods or a way to refer to methods indirectly, ...).

yes, the JVM does have some fairly elaborate optimizations, but when they are mostly being used to try to optimize around cruft introduced in trying to fit the language onto the VM, this is not nearly so good.

a lot of these issues could be addressed (and some are apparently being addressed), but Oracle is very slow-moving about it.

I am feeling rather uncertain as to getting more specific about things I would change to the JVM architecture if given the choice, as some people are likely to get rather defensive (note: most were related to the class file-format and bytecode, note that likely even if a new class-format and bytecode were introduced, it need not break backwards compatibility).

most changes in general would be related to increasing generality and orthogonality, and likely introducing more of a "middle layer" regarding bytecode abstraction.

personally though (at least for now), I would just as soon target native code (or maybe target C, if writing a static compiler).

in my case, my front-ends tend to compile into bytecode-based formats, which are then either interpreted, or could be fed through a JIT or similar.

as is, my mainly-used bytecode VM currently uses an interpreter which converts the bytecode into threaded code and then runs the program as threaded code.

sadly, this bytecode is itself far from perfect: lots of stale instructions, non-optimal organization, operation types are typically encoded as prefixes, ...
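
As a rough illustration of the "convert bytecode to threaded code, then run that" approach described above, here is a toy sketch in Java. It is not the poster's VM: the three-opcode bytecode (`PUSH`, `ADD`, `MUL`), the `Op` interface, and the `ThreadedVm` class are all hypothetical, the program is straight-line with no branches, and the closure-per-instruction form is only one of several ways to build threaded code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: decode bytecode once into an array of pre-resolved operations,
// then execute that array without any per-instruction opcode dispatch.
final class ThreadedVm {
    // Hypothetical toy bytecode: PUSH is followed by one operand word.
    static final int PUSH = 0, ADD = 1, MUL = 2;

    interface Op {
        void exec(Deque<Integer> stack);
    }

    // Translation pass: the switch on opcodes happens here, once per instruction.
    static Op[] translate(int[] code) {
        List<Op> ops = new ArrayList<>();
        for (int pc = 0; pc < code.length; pc++) {
            switch (code[pc]) {
                case PUSH: {
                    final int k = code[++pc]; // operand is baked into the closure
                    ops.add(s -> s.push(k));
                    break;
                }
                case ADD:
                    ops.add(s -> s.push(s.pop() + s.pop()));
                    break;
                case MUL:
                    ops.add(s -> s.push(s.pop() * s.pop()));
                    break;
                default:
                    throw new IllegalArgumentException("bad opcode " + code[pc]);
            }
        }
        return ops.toArray(new Op[0]);
    }

    // Execution pass: just call each pre-decoded operation in order.
    static int run(Op[] ops) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (Op op : ops) {
            op.exec(stack);
        }
        return stack.pop();
    }
}
```

For example, `ThreadedVm.run(ThreadedVm.translate(new int[] {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL}))` evaluates (2 + 3) * 4. The design choice being illustrated is that decoding costs are paid once up front, so awkward encodings in the bytecode (such as type prefixes) need not slow down the execution loop.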