Path: csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder3.hal-mli.net!news.lightlink.com!news.iecc.com!nerds-end From: Christophe de Dinechin Newsgroups: comp.compilers Subject: Re: Need an interesting topic for an undergraduate project on Compilers Date: Sun, 4 Sep 2011 23:42:48 -0700 (PDT) Organization: Compilers Central Lines: 81 Sender: news@iecc.com Approved: comp.compilers@iecc.com Message-ID: <11-09-008@comp.compilers> References: <11-08-006@comp.compilers> NNTP-Posting-Host: news.iecc.com X-Trace: gal.iecc.com 1315361559 55164 64.57.183.58 (7 Sep 2011 02:12:39 GMT) X-Complaints-To: abuse@iecc.com NNTP-Posting-Date: Wed, 7 Sep 2011 02:12:39 +0000 (UTC) Keywords: courses Posted-Date: 06 Sep 2011 22:12:39 EDT X-submission-address: compilers@iecc.com X-moderator-address: compilers-request@iecc.com X-FAQ-and-archives: http://compilers.iecc.com Xref: x330-a1.tempe.blueboxinc.net comp.compilers:258 On Aug 6, 7:28 pm, amit karmakar wrote: > I would like to have some suggestions as to what *new* and > *innovative* project i can do which are based on compiler design. Innovation in compilers can happen at a number of levels : 1. Parsing techniques, grammars, etc. Very active research a while back, considered (erroneously methinks) as dead by most today, who happily use flex/bison and don't think twice about it. 2. Language design. One of the active areas these days is "domain specific languages" or DSLs, i.e. languages designed for one specific need. Often using "meta-programming" techniques (programs that generate programs) 3. Type systems, proofs, program validation. Languages like Haskell use type inference, so that you don't have to specify types yourself most of the time. C++ recently gained the "auto" keyword for types. DSLs pose a new class of interesting problems in that space. 4. Intermediate representations, code generation and optimization frameworks. The king of this hill these days IMO is LLVM. But there are a number of contenders. If you are interested in optimizations, that's the right place to look at. 5. Runtime support : garbage collectors, just-in-time code generation, parallel execution, use of new hardware such as GPUs,  6. Support for innovative hardware, hardware generation, hardware/ software co-design, etc. If you are more into silicon, this is a very interesting are to learn about. My own pet project, XLR (http://xlr.sf.net) offers a number of innovations in the first three of these areas. It is a language designed to grow with the user, i.e. the objective is to make it as easy to add language constructs as it is to add, say, functions or classes in other languages. Regarding parsing, it generates a parse tree made of exactly 8 nodes : integer, real, text and name/symbol represent leaves of the tree, infix, prefix, postfix and block represent inner nodes. This makes it possible to write programs in a very natural-looking way, yet with an internal program representation that is easy to manipulate. This is the foundation of XL meta-programming / DSL capabilities. To validate that, XLR has practically no built-in constructs. It has constructs to connect to LLVM primitives, constructs to connect to C code, and a pair of "rewrite" constructs, notably ->, to transform one tree shape into another. For example : extern bool puts(text); (x:integer - y:integer):integer -> opcode Sub repeat 0, B -> true repeat N, B -> B; repeat N-1, B repeat 25, puts "Hello" You can check the code generated for the above with xlr -tcode -O3 tests/09.Compiler/optimized-repeat-loop.xl. LLVM actually turns it into a sequence of 25 calls to puts, you can hardly do better. The most active area of research for XLR these days is its type system. In order to generate efficient code, an Haskell-like type inference mechanism is in place. But the standard type inference algorithms must be extended, because there are a few additional transformations compared to lambda calculus (not just "alpha" and "beta"), and the closest there is to a type is the shape of a tree (e.g. "if X then Y else Z"). Since it uses LLVM, it is also an interesting way to learn a little about LLVM, but it's not intended as an LLVM tutorial. So if you are interested in experimenting with "growing a language" in a text-based framework, XLR is the right way to go. There are other projects that are more advanced e.g. if you want to build the IDE at the same time, see for example JetBrain's Meta Programming System. But they are not as strong in language development per-se, I believe.