| Started by | amit karmakar <amit.codename13@gmail.com> |
|---|---|
| First post | 2011-08-06 10:28 -0700 |
| Last post | 2011-09-04 23:42 -0700 |
| Articles | 17 — 13 participants |
Need an interesting topic for an undergraduate project on Compilers amit karmakar <amit.codename13@gmail.com> - 2011-08-06 10:28 -0700
Re: Need an interesting topic for an undergraduate project on Compilers Volker Birk <bumens@dingens.org> - 2011-08-06 19:08 +0000
Re: Need an interesting topic for an undergraduate project on Compilers jgk@panix.com (Joe keane) - 2011-08-27 15:30 +0000
Re: Need an interesting topic for an undergraduate project on Compilers BGB <cr88192@hotmail.com> - 2011-08-31 10:41 -0700
Re: Need an interesting topic for an undergraduate project on Compilers BGB <cr88192@hotmail.com> - 2011-09-01 03:37 -0700
Re: Need an interesting topic for an undergraduate project on Compilers George Neuner <gneuner2@comcast.net> - 2011-08-31 21:01 -0400
Re: Need an interesting topic for an undergraduate project on Compilers Philip Herron <redbrain@gcc.gnu.org> - 2011-09-03 06:43 +0100
Re: Need an interesting topic for an undergraduate project on Compilers "C. Bergström" <cbergstrom@pathscale.com> - 2011-09-03 15:38 +0700
Re: Need an interesting topic for an undergraduate project on Compilers George Neuner <gneuner2@comcast.net> - 2011-09-03 16:00 -0400
Re: Need an interesting topic for an undergraduate project on Compilers torbenm@diku.dk (Torben Ægidius Mogensen) - 2011-08-31 11:15 +0200
Re: Need an interesting topic for an undergraduate project on Compilers Volker Birk <bumens@dingens.org> - 2011-08-31 10:02 +0000
Re: Need an interesting topic for an undergraduate project on Compilers BGB <cr88192@hotmail.com> - 2011-08-06 14:10 -0700
Re: Need an interesting topic for an undergraduate project on Compilers "BartC" <bc@freeuk.com> - 2011-08-09 12:06 +0100
Re: Need an interesting topic for an undergraduate project on Compilers Gene <gene.ressler@gmail.com> - 2011-08-10 00:53 -0700
Re: Need an interesting topic for an undergraduate project on Compilers Hans Aberg <haberg-news@telia.com> - 2011-08-10 23:56 +0200
Re: Need an interesting topic for an undergraduate project on Compilers tm <thomas.mertes@gmx.at> - 2011-08-30 23:17 -0700
Re: Need an interesting topic for an undergraduate project on Compilers Christophe de Dinechin <christophe.de.dinechin@gmail.com> - 2011-09-04 23:42 -0700
| From | amit karmakar <amit.codename13@gmail.com> |
|---|---|
| Date | 2011-08-06 10:28 -0700 |
| Subject | Need an interesting topic for an undergraduate project on Compilers |
| Message-ID | <11-08-006@comp.compilers> |
Hi, I am an undergraduate in computer science. I have been reading about compilers recently. I wish to do a project as a part of my study course in a time span of 6 months. I am pretty much aware of the fact that projects on compilers require much time. I would like to have some suggestions as to what *new* and *innovative* project i can do which are based on compiler design. Also, considering the time i have to implement the compiler, i can think of cutting down work, like working on subset of a language. I would preferably not tend to work on only a specific part(phase) of compiler. It will be better if I implement a complete compiler for some architecture and see the executable running. Thanks in advance, Please bear with my bad english.
| From | Volker Birk <bumens@dingens.org> |
|---|---|
| Date | 2011-08-06 19:08 +0000 |
| Message-ID | <11-08-007@comp.compilers> |
| In reply to | #223 |
amit karmakar <amit.codename13@gmail.com> wrote:
> I am an undergraduate in computer science. I have been reading about
> compilers recently. I wish to do a project as a part of my study
> course in a time span of 6 months. I am pretty much aware of the fact
> that projects on compilers require much time.
That's no longer true for every project. As an example, I implemented the first version of this simple compiler in just three working days by copying the grammar out of the PDF of the standard:
http://www.x-pie.de/iec2xml/
After those three days it was in productive use in a project. And you can learn how to do so, too.
> I would like to have some suggestions as to what *new* and
> *innovative* project i can do which are based on compiler design.
> Also, considering the time i have to implement the compiler, i can
> think of cutting down work, like working on subset of a language. I
> would preferably not tend to work on only a specific part(phase) of
> compiler. It will be better if I implement a complete compiler for
> some architecture and see the executable running.
If you want to compile a programming language to assembler code, then I recommend using a PEG-based parsing toolchain and some functional language or dynamic scripting language to play around with. This makes things much easier to learn. And it is good enough for learning how to do it in more static languages too, because the principles are the same.
Yours,
VB.
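To make the PEG suggestion above a little more concrete, here is a minimal hand-written sketch in Python of what a PEG-style parser does: ordered choice (try the first alternative, fall back to the next) with backtracking. The toy arithmetic grammar, the `Parser` class and the tuple-shaped AST are all assumptions made for illustration; they are not taken from the post or from any particular PEG toolchain.

```python
import re

# Hypothetical toy grammar in PEG notation (not taken from the post):
#   expr   <- term (('+' / '-') term)*
#   term   <- factor (('*' / '/') factor)*
#   factor <- number / '(' expr ')'
#   number <- [0-9]+
_NUMBER = re.compile(r"[0-9]+")


class Parser:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def _skip_ws(self):
        while self.pos < len(self.text) and self.text[self.pos].isspace():
            self.pos += 1

    def _literal(self, s):
        """Match a literal string; advance and return True on success."""
        self._skip_ws()
        if self.text.startswith(s, self.pos):
            self.pos += len(s)
            return True
        return False

    def _binary(self, operand, ops):
        """operand (op operand)*, building a left-associative tree."""
        left = operand()
        if left is None:
            return None
        while True:
            save = self.pos
            op = next((o for o in ops if self._literal(o)), None)
            if op is None:
                return left
            right = operand()
            if right is None:
                self.pos = save  # backtrack over the dangling operator
                return left
            left = (op, left, right)

    def expr(self):
        return self._binary(self.term, ("+", "-"))

    def term(self):
        return self._binary(self.factor, ("*", "/"))

    def factor(self):
        start = self.pos
        self._skip_ws()
        m = _NUMBER.match(self.text, self.pos)
        if m:  # first alternative: number
            self.pos = m.end()
            return ("num", int(m.group()))
        if self._literal("("):  # second alternative: '(' expr ')'
            node = self.expr()
            if node is not None and self._literal(")"):
                return node
        self.pos = start  # both alternatives failed: backtrack
        return None


def parse(text):
    """Parse a whole string or raise SyntaxError."""
    p = Parser(text)
    node = p.expr()
    p._skip_ws()
    if node is None or p.pos != len(text):
        raise SyntaxError(f"parse error at offset {p.pos}")
    return node
```

Calling `parse("1 + 2 * 3")` gives a nested tuple with the multiplication below the addition. A real PEG toolchain would generate code of roughly this shape from the grammar text, usually with memoisation added.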
| From | jgk@panix.com (Joe keane) |
|---|---|
| Date | 2011-08-27 15:30 +0000 |
| Message-ID | <11-08-029@comp.compilers> |
| In reply to | #224 |
Volker Birk <bumens@dingens.org> writes:
>If you want to compile a programming language to assembler code
I'm not sure why anyone would want to do this; if you have another language you can convert it to C code, and concentrate on what you are doing. If you find that the C->assembly step can be improved, that is also useful.
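As a rough illustration of the "convert it to C code" route described above, here is a small Python sketch of a back end that prints C source for an expression tree and leaves everything else to an existing C compiler. The tuple AST shape and the function names `emit_c_expr` and `emit_c_function` are assumptions invented for this example, and every value is treated as a C `long` to keep it short.

```python
def emit_c_expr(node):
    """Turn a tuple AST into a fully parenthesised C expression."""
    kind = node[0]
    if kind == "num":            # ("num", 42)
        return str(node[1])
    if kind == "var":            # ("var", "x")
        return node[1]
    op, left, right = node       # ("+", left, right) and similar
    return f"({emit_c_expr(left)} {op} {emit_c_expr(right)})"


def emit_c_function(name, params, body):
    """Wrap one expression in a C function that returns its value."""
    args = ", ".join(f"long {p}" for p in params) or "void"
    return f"long {name}({args})\n{{\n    return {emit_c_expr(body)};\n}}\n"


if __name__ == "__main__":
    ast = ("+", ("var", "x"), ("*", ("num", 2), ("num", 3)))
    print(emit_c_function("f", ["x"], ast))
```

The generator never has to think about registers or calling conventions, which is the attraction of this approach; the replies in this thread discuss where it falls short.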
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2011-08-31 10:41 -0700 |
| Message-ID | <11-08-030@comp.compilers> |
| In reply to | #246 |
On 8/27/2011 8:30 AM, Joe keane wrote:
> Volker Birk<bumens@dingens.org> writes:
>> If you want to compile a programming language to assembler code
>
> I'm not sure why anyone would want to do this; if you have another
> language you can convert it to C code, and concentrate on what you are
> doing. If you find that the C->assembly step can be improved that is
> also useful.
there are a few drawbacks to compiling to C though:
one can only compile to C in contexts where they have a C compiler, which largely rules out things like JIT compilers;
the time-to-compile may be somewhat worse (in cases where this is important);
some language features are difficult to implement effectively in C (and standard C is annoyingly lacking in reflection features);
...
so, it is not so clear-cut that one wouldn't want to compile to ASM (or to some bytecode, which is run through a JIT).
[Do you mean compiling to assembler, or compiling to machine code? They're different. -John]
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2011-09-01 03:37 -0700 |
| Message-ID | <11-09-002@comp.compilers> |
| In reply to | #247 |
On 8/31/2011 10:41 AM, BGB wrote:
> On 8/27/2011 8:30 AM, Joe keane wrote:
>> Volker Birk<bumens@dingens.org> writes:
>>> If you want to compile a programming language to assembler code
>>
>> I'm not sure why anyone would want to do this; if you have another
>> language you can convert it to C code, and concentrate on what you are
>> doing. If you find that the C->assembly step can be improved that is
>> also useful.
>
> there are a few drawbacks to compiling to C though:
> one can only compile to C in contexts where they have a C compiler,
> which largely rules out things like JIT compilers;
> the time-to-compile may be somewhat worse (in cases where this is important);
> some language features are difficult to implement effectively in C (and
> standard C is annoyingly lacking in reflection features);
> ...
>
> so, it is not so clear-cut that one wouldn't want to compile to ASM (or
> to some bytecode, which is run through a JIT).
> [Do you mean compiling to assembler, or compiling to machine code? They're
> different. -John]
ok, fair enough... pardon my very ambiguous/imprecise wording...
in my project, I first compile to ASM, then feed this through an in-program assembler, and feed the output from this into an in-program linker (which links the code against/into the running program image).
in this case, one produces assembly code, and gets machine code. in my brevity I had sort of glossed over this detail.
notably though, the size, complexity, and performance overhead of an assembler is generally small enough that one can ignore that this stage exists. mine can (depending on settings and usage pattern) pull off assembling up to around 10-20MB/s of ASM code on an AthlonII at 2.8GHz. in practice, this is likely more than enough for most cases.
OTOH, a C compiler can be a good deal more expensive than an assembler, both in terms of size and implementation complexity, and in terms of performance (what may take, say, 10ms to compile/assemble directly, could take several seconds or more to compile in a C compiler).
also, the assembler may need very little memory for internal state, whereas a C compiler may end up including several MB worth of header contents, and use up many 10s of MB worth of memory (parsed header contents and internal structures, ...).
these (among many other reasons) were why I eventually came to the opinion that while C is fairly good as a statically compiled language, it is not nearly so good a choice as a scripting language, and isn't really suitable as-is for interactive entry (by the time one has interactive entry, the language is no longer strictly C either). using it as the intermediate form for another scripting language would likely be a good deal worse.
yes, I am ignoring here the possibility of invoking GCC or MSVC via "system();" or similar, and fetching/loading the results, which is a fairly simple option, but it is problematic due to target portability issues and to requiring the compiler to be available and in the path (a much bigger issue on Windows), and it still does not really address the "time to compile" issue.
but, as noted, if one just wants to statically compile something, and has no good reason to worry about compiler performance, well then, using C as an IL makes a fair amount of sense.
there is also LLVM/Clang, but I haven't really messed with it enough to provide an accurate opinion on it (I don't personally use it, as there isn't really a whole lot in common between my projects and LLVM, technically or goal-wise).
| From | George Neuner <gneuner2@comcast.net> |
|---|---|
| Date | 2011-08-31 21:01 -0400 |
| Message-ID | <11-09-003@comp.compilers> |
| In reply to | #247 |
On Wed, 31 Aug 2011 10:41:43 -0700, BGB <cr88192@hotmail.com> wrote:
>one can only compile to C in contexts where they have a C compiler,
>which largely rules out things like JIT compilers;
There are a few embeddable runtime C compilers and/or interpreters. For example:
http://bellard.org/tcc/
http://www.softintegration.com/
And there are many assembler generation libraries.
George
| From | Philip Herron <redbrain@gcc.gnu.org> |
|---|---|
| Date | 2011-09-03 06:43 +0100 |
| Message-ID | <11-09-004@comp.compilers> |
| In reply to | #253 |
I would recommend the experience of implementing a front-end in GCC. It's a great area to start implementing a language, and you can concentrate more specifically on what you want to do. We are really starting to get a lot of cool documentation, but we have yet to try to get it merged into mainline; you can contact us through the mailing list for the links to that. Gccgo and gfortran are really good reference front-ends; my front-end, gccpy, is quite complicated.
But if you've never written a compiler from scratch, I would recommend spending about a week just implementing some really basic language and outputting some target code, to get a feel for the nitty-gritty of working with compilers. Otherwise I don't think you would really understand why certain aspects work the way they do.
When it comes to "new and innovative", I think you need to work on compilers for a bit to understand what that is on your own, since here be dragons... lol. As for an interesting undergrad project, you could check out my project gccpy, or gccgo, or PHC (the PHP compiler; Paul Biggar's PhD thesis is a good read for this, and I think he floats around this list), or Parrot. I think these are all interesting projects to possibly work on, and they have a lot of scope. The best part is that you could stay at a higher level rather than delving into the depths of code generation in LLVM or GCC; trying to find some new innovative algorithm there is not a task for the faint-hearted.
As for the thread here about a back end outputting C: I don't recommend this, as I personally believe it will push you to produce bad IL and middle-end code because you will rely too much on C. And in the end, if you want to do more interesting things, you may be a little too reliant on C, and thus limited by C. Outputting C does have the advantage that you can easily produce target code, and it makes your compiler quite simple, but I don't think it gives the full picture.
--Phil
| From | "C. Bergström" <cbergstrom@pathscale.com> |
|---|---|
| Date | 2011-09-03 15:38 +0700 |
| Message-ID | <11-09-005@comp.compilers> |
| In reply to | #254 |
On 09/ 3/11 12:43 PM, Philip Herron wrote:
> I would recommend the experience of implementing a front-end in GCC
> its a great area to start implementing a language and you can
> concentrate on more specificity what you want to do.
(I personally think the comment above is absurd.)
For C or C-like languages, clang is a great place to start.
http://clang.llvm.org/get_started.html
http://clang.llvm.org/comparison.html#gcc
LLVM isn't my preferred choice of compiler backend, but it does have a fairly low barrier to entry for new engineers.
I apologize if this seems like a typical "my editor is better than yours" post.
Best,
./C
@CTOPathScale
| From | George Neuner <gneuner2@comcast.net> |
|---|---|
| Date | 2011-09-03 16:00 -0400 |
| Message-ID | <11-09-006@comp.compilers> |
| In reply to | #254 |
On Sat, 3 Sep 2011 06:43:30 +0100, Philip Herron <redbrain@gcc.gnu.org> wrote:
>As for the backend outputting to C thread here, i dont recommend this
>as i personally believe it will push you to produce bad IL and
>middle-end code because you will rely too much on C.
As a back end for a compiler I would agree with this.
However, by coincidence, in another forum I am involved in a discussion regarding embedding a parser generator into a device so that its comm interface can be reconfigured or extended in the field by the end user. (There actually is a good reason for wanting to provide this 8-) Because most parser tools generate C source, the issue of embedding a C compiler also came up.
>--Phil
George
| From | torbenm@diku.dk (Torben Ægidius Mogensen) |
|---|---|
| Date | 2011-08-31 11:15 +0200 |
| Message-ID | <11-08-032@comp.compilers> |
| In reply to | #246 |
jgk@panix.com (Joe keane) writes:
> Volker Birk <bumens@dingens.org> writes:
>>If you want to compile a programming language to assembler code
>
> I'm not sure why anyone would want to do this; if you have another
> language you can convert it to C code, and concentrate on what you are
> doing. If you find that the C->assembly step can be improved that is
> also useful.
C is not always well suited as a target language for a compiler (even though it is often used for this purpose). For example:
- Standard C does not support indirect jumps.
- Few C compilers support tail-call optimisation.
- C does not support multiple return values from a function call.
- Exceptions are not supported and are difficult to implement efficiently and portably.
- Finding the root set for tracing garbage collectors is not easy.
- Multi-word integer arithmetic is not supported, even though hardware often supports it.
- A lot of C behaviour is implementation-defined, so you cannot be sure your code works the same on all machines/compilers.
So there can be plenty of reasons to compile all the way to assembly code. Or use something like LLVM as target.
Torben
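On the tail-call point in the list above: one common workaround when the target language does not guarantee tail-call optimisation is a trampoline, where each function returns a description of the next call instead of making it, and a small driver loop performs the calls. The sketch below shows the idea in Python, which also lacks tail-call optimisation; the names `trampoline`, `is_even` and `is_odd` are made up for this example, and the post itself does not mention this technique.

```python
def trampoline(step):
    """Keep calling thunks until something that is not callable comes back.

    Limitation of this sketch: a final result that is itself callable
    would be mistaken for another thunk.
    """
    while callable(step):
        step = step()
    return step


# Mutually recursive functions written in "return a thunk" style.
# Every tail call becomes a zero-argument lambda handed back to the
# driver, so the Python call stack never grows with n.
def is_even(n):
    return True if n == 0 else (lambda: is_odd(n - 1))


def is_odd(n):
    return False if n == 0 else (lambda: is_even(n - 1))


if __name__ == "__main__":
    # Direct mutual recursion would exceed Python's default recursion
    # limit at this depth; the trampolined version runs in a flat loop.
    print(trampoline(lambda: is_even(100000)))
```

The cost is an extra indirection and an allocation per call, which is part of why the missing tail calls count as a drawback of C as a target.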
| From | Volker Birk <bumens@dingens.org> |
|---|---|
| Date | 2011-08-31 10:02 +0000 |
| Message-ID | <11-08-033@comp.compilers> |
| In reply to | #246 |
Joe keane <jgk@panix.com> wrote:
> Volker Birk <bumens@dingens.org> writes:
>>If you want to compile a programming language to assembler code
> I'm not sure why anyone would want to do this; if you have another
> language you can convert it to C code, and concentrate on what you are
> doing. If you find that the C->assembly step can be improved that is
> also useful.
Well, for example in a C compiler ;-)
If we're talking about compiling to machine code today, we usually use a framework like GCC or LLVM anyway.
Yours,
VB.
--
If you pay nothing for a service, you are not the customer but the product.
| From | BGB <cr88192@hotmail.com> |
|---|---|
| Date | 2011-08-06 14:10 -0700 |
| Message-ID | <11-08-008@comp.compilers> |
| In reply to | #223 |
On 8/6/2011 10:28 AM, amit karmakar wrote:
> I would like to have some suggestions as to what *new* and
> *innovative* project i can do which are based on compiler design.
> Also, considering the time i have to implement the compiler, i can
> think of cutting down work, like working on subset of a language. I
> would preferably not tend to work on only a specific part(phase) of
> compiler. It will be better if I implement a complete compiler for
> some architecture and see the executable running.
new+innovative and compilers don't often go together, and another
problem is that terms like new/innovative/interesting/... depend
highly on who one is dealing with and their personal biases and
preferences (a cool idea for one person, may be considered stale,
boring, unworkable, ... by another).
a few thoughts:
most traditional research into compilers has been in how to squeeze as
much performance as possible out of them. maybe one can look into trying
for new and interesting features instead.
rather than work on subset languages, it may make sense to work
with a simpler language design.
for example, a fairly simple language is Scheme (except for a few edge
cases) where often a person can throw together a working implementation
fairly quickly (or, at least IME with R5RS and earlier, dunno about R6RS
as I was mostly no longer dealing much with Scheme by this point, and
R6RS at the time looked a bit strange vs what came before).
a slightly less simplistic, but still relatively simple language, is
ECMAScript (basic core language for JavaScript, ActionScript, ...).
probably not worth trying to implement up-front are languages like:
C or C++ (fairly complex languages to implement);
Java (a lot more hairy than it looks, syntax can be deceiving);
...
note that dynamic typing generally makes things much easier to implement
(static typing makes things faster, and is "closer to the metal", ...
but it doesn't make things easier).
a more recent language of mine is using a "soft typing" model, which
basically combines elements of static typing on top of an otherwise
dynamically-typed VM (potentially using types as optimization hints in
the codegen, but treating type-checking, behavioral semantics, and
optimization, as separate issues).
personally, I like RPN / Stack-Machine style ILs (recently got into a
big argument over this though, a person who for whatever reason really
dislikes stack-machine ILs despite them being well proven in the JVM,
.NET, AVM2, ...).
examples of stack-machine languages would include Forth, PostScript,
Factor, ... (PostScript has had a notable influence on the design of my
ILs).
the upside of stack machines is that they are fairly easy to produce
code for (it is often very straightforward to unwind an AST into a stack
machine format), are themselves relatively simple, and are very capable
despite their relative simplicity.
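A small Python sketch of the point above that an AST unwinds very directly into stack-machine code: a post-order walk emits the operands first and the operator last, and a few lines of interpreter can run the result. The tuple AST, the instruction names (`push`, `add`, `sub`, `mul`) and the functions `compile_expr` and `run` are assumptions for illustration only; they are not the poster's IL.

```python
import operator

# Hypothetical instruction set: ("push", n) plus one opcode per operator.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def compile_expr(node, code=None):
    """Post-order walk: emit code for both operands, then the operator."""
    if code is None:
        code = []
    if node[0] == "num":              # leaf: ("num", 42)
        code.append(("push", node[1]))
    else:                             # inner node: ("add", left, right)
        op, left, right = node
        compile_expr(left, code)
        compile_expr(right, code)
        code.append((op,))
    return code


def run(code):
    """Tiny stack-machine interpreter for the instructions above."""
    stack = []
    for instr in code:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right = stack.pop()       # operands come off in reverse order
            left = stack.pop()
            stack.append(OPS[instr[0]](left, right))
    return stack.pop()


if __name__ == "__main__":
    ast = ("sub", ("num", 10), ("mul", ("num", 2), ("num", 3)))
    code = compile_expr(ast)
    print(code)
    print(run(code))
```

The order in which `right` and `left` are popped is the kind of ordering detail the next paragraph calls fussy: swap them and subtraction silently gives the wrong answer.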
a downside though is that they are relatively fussy about ordering
issues, and a general-purpose native codegen can get a bit hairy (mostly
due to ABI interfacing, for example, the SysV/AMD64 ABI is itself a
complex beast, and one has to effectively "pull a rabbit out of a hat"
to mesh it up directly with a stack machine IL). they are also far less
"du jour" with many people than are other options, such as TAC-SSA
(Three Address Code - Static Single Assignment).
granted, things should be much simpler if one doesn't want to go about
trying to directly call into native (statically-compiled) code, but
instead uses special functions to marshal the calls (I have later found
that this strategy can be fairly transparent as well).
also possibly useful is allowing for eval/... as well...
also, in my case, working to try to make the C interface fairly
transparent (marshaling calls and data-types and similar in both
directions, ideally eliminating nearly all cases of manually-written
boilerplate code).
ideally, the time of isolated languages and frameworks, and of languages
which don't have features like eval, will soon be nearing an end (this
doesn't mean I want many of the existing languages to go away, but
ideally most should have eval as a relatively common library feature, ...).
for example, my language has:
"native import C.foo;"
which allows implementing libraries from C land (the foo is a library
name, and where a tool is used to mine information from C headers/...).
"native package C.foo { ...body... }"
allows exporting the code ("...body...") to C land (in this case, the
boilerplate is written automatically by a tool).
granted, yes, none of this is really terribly new or original, as most
of this has been around for decades.
as for languages containing some interesting ideas:
Scheme (nice core language design);
Self (nice object system, partly carried over in a limited form into
JavaScript);
PostScript (relatively clean stack-machine model);
ECMAScript / JavaScript (simplistic yet conventional syntax);
ActionScript (like JavaScript but more "grown up");
Erlang (concurrent programming features);
...
granted, to be original, one needs to be, errm, original.
like maybe try to come up with some new/interesting language feature or
idea to try exploring, or something interesting to do at the
compiler/codegen level, ...
| From | "BartC" <bc@freeuk.com> |
|---|---|
| Date | 2011-08-09 12:06 +0100 |
| Message-ID | <11-08-012@comp.compilers> |
| In reply to | #223 |
"amit karmakar" <amit.codename13@gmail.com> wrote in message
> I am an undergraduate in computer science. I have been reading about
> compilers recently. I wish to do a project as a part of my study
> course in a time span of 6 months. I am pretty much aware of the fact
> that projects on compilers require much time.
>
> I would like to have some suggestions as to what *new* and
> *innovative* project i can do which are based on compiler design.
> Also, considering the time i have to implement the compiler, i can
> think of cutting down work, like working on subset of a language. I
> would preferably not tend to work on only a specific part(phase) of
> compiler. It will be better if I implement a complete compiler for
> some architecture and see the executable running.
Don't have any specific suggestions except to say I did a similar sort of project for my undergraduate course.
I was handed a 3-inch thick assembler listing, of a half-finished port of an existing compiler, and told to get it working!
In fact I just started again, and for good measure implemented the language (a very simple one that could only just about be called high-level) in itself (which I thought was neat, although my supervisor wasn't that impressed..).
And yes it is more satisfying to have something that is all your own work and that also works exactly as expected!
--
Bartc
| From | Gene <gene.ressler@gmail.com> |
|---|---|
| Date | 2011-08-10 00:53 -0700 |
| Message-ID | <11-08-013@comp.compilers> |
| In reply to | #223 |
On Aug 6, 1:28 pm, amit karmakar <amit.codenam...@gmail.com> wrote:
> I am an undergraduate in computer science. I have been reading about
> compilers recently. I wish to do a project as a part of my study
> course in a time span of 6 months. I am pretty much aware of the fact
> that projects on compilers require much time.
>
> I would like to have some suggestions as to what *new* and
> *innovative* project i can do which are based on compiler design.
> Also, considering the time i have to implement the compiler, i can
> think of cutting down work, like working on subset of a language. I
> would preferably not tend to work on only a specific part(phase) of
> compiler. It will be better if I implement a complete compiler for
> some architecture and see the executable running.
>
> Thanks in advance,
> Please bear with my bad english.
Your English is excellent. Thanks for taking the time to be clear.
Consider checking your own school and perhaps others near you to see if there is a computer architecture course that requires students to implement a processor in programmable logic. Many do these days. By definition such processors are "new" because they're invented for teaching purposes. Either have your advisor or you yourself convince the professor in charge of the architecture course or another student to take on the project of running binaries produced by your compiler. Agree on a way to load binaries in advance.
Then build a C subset compiler for that processor. Start with integers, characters and associated pointers and arrays. You can add records, etc. if there's time. (There probably won't be.) I recommend using an existing complete C grammar and merely raising errors for the parts of C you don't implement. But you could also look at textbooks that include C subset grammars. Louden is one I recall.
My students have done independent study projects along these lines. They always learn a lot. There is tremendous pride when code your compiler generated runs on a processor built by a colleague. Perfection would be getting your compiler to compile itself and run on the target, but that's probably more work than an undergrad should put into one project.
If you can't find a suitable target, you might look at http://www.eecs.usma.edu/download/Marasm13.zip and convince whoever runs the architecture course to use this one. It already has an assembler and emulator, which allow you to focus on compilation. If you decide to do this, write me for the newest version.
| From | Hans Aberg <haberg-news@telia.com> |
|---|---|
| Date | 2011-08-10 23:56 +0200 |
| Message-ID | <11-08-014@comp.compilers> |
| In reply to | #223 |
On 2011/08/06 19:28, amit karmakar wrote:
> I am an undergraduate in computer science. I have been reading about
> compilers recently. I wish to do a project as a part of my study
> course in a time span of 6 months. I am pretty much aware of the fact
> that projects on compilers require much time.
The LLVM tutorial contains an example compiler; its parser might be replaced by a Flex/Bison combination.
http://llvm.org/docs/tutorial/
Hans
| From | tm <thomas.mertes@gmx.at> |
|---|---|
| Date | 2011-08-30 23:17 -0700 |
| Message-ID | <11-08-031@comp.compilers> |
| In reply to | #223 |
On 6 Aug., 19:28, amit karmakar <amit.codenam...@gmail.com> wrote:
> I am an undergraduate in computer science. I have been reading about
> compilers recently. I wish to do a project as a part of my study
> course in a time span of 6 months. I am pretty much aware of the fact
> that projects on compilers require much time.
Maybe you could base your work on an existing project like GCC or LLVM. Both projects probably have tasks of different sizes. You could also write a new frontend or backend for them.
I have a compiler project myself, so people would probably be disappointed if I did not suggest something related to it. :-) So here it comes:
A Seed7 frontend for GCC or LLVM would certainly be an interesting challenge. It would involve a lot of middle-end work, like the conversion from one internal program representation to another. The existing Seed7 to C compiler could be used as a base. Send me mail if you are interested...
Greetings
Thomas Mertes
--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements and operators, abstract data types, templates without special syntax, OO with interfaces and multiple dispatch, statically typed, interpreted or compiled, portable, runs under linux/unix/windows.
| From | Christophe de Dinechin <christophe.de.dinechin@gmail.com> |
|---|---|
| Date | 2011-09-04 23:42 -0700 |
| Message-ID | <11-09-008@comp.compilers> |
| In reply to | #223 |
On Aug 6, 7:28 pm, amit karmakar <amit.codenam...@gmail.com> wrote:
> I would like to have some suggestions as to what *new* and
> *innovative* project i can do which are based on compiler design.
Innovation in compilers can happen at a number of levels :
1. Parsing techniques, grammars, etc. Very active research a while
back, considered (erroneously methinks) as dead by most today, who
happily use flex/bison and don't think twice about it.
2. Language design. One of the active areas these days is "domain
specific languages" or DSLs, i.e. languages designed for one specific
need. Often using "meta-programming" techniques (programs that
generate programs)
3. Type systems, proofs, program validation. Languages like Haskell
use type inference, so that you don't have to specify types yourself
most of the time. C++ recently gained the "auto" keyword for types.
DSLs pose a new class of interesting problems in that space.
4. Intermediate representations, code generation and optimization
frameworks. The king of this hill these days IMO is LLVM. But there
are a number of contenders. If you are interested in optimizations,
that's the right place to look at.
5. Runtime support : garbage collectors, just-in-time code generation,
parallel execution, use of new hardware such as GPUs, etc.
6. Support for innovative hardware, hardware generation, hardware/
software co-design, etc. If you are more into silicon, this is a very
interesting area to learn about.
My own pet project, XLR (http://xlr.sf.net) offers a number of
innovations in the first three of these areas. It is a language
designed to grow with the user, i.e. the objective is to make it as
easy to add language constructs as it is to add, say, functions or
classes in other languages.
Regarding parsing, it generates a parse tree made of exactly 8 nodes :
integer, real, text and name/symbol represent leaves of the tree,
infix, prefix, postfix and block represent inner nodes. This makes it
possible to write programs in a very natural-looking way, yet with an
internal program representation that is easy to manipulate. This is
the foundation of XL meta-programming / DSL capabilities.
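To picture the eight node kinds described above, here is a hedged Python sketch using dataclasses. Only the eight kind names come from the post; the field names, the `Tree` alias and the `show` helper are guesses made for illustration and are not XLR's actual data structures.

```python
from dataclasses import dataclass
from typing import Union


# Four leaf kinds.
@dataclass(frozen=True)
class Integer:
    value: int


@dataclass(frozen=True)
class Real:
    value: float


@dataclass(frozen=True)
class Text:
    value: str


@dataclass(frozen=True)
class Name:
    value: str


# Four inner kinds (field layout is an assumption).
@dataclass(frozen=True)
class Infix:
    name: str        # the operator between the two children, e.g. "-"
    left: "Tree"
    right: "Tree"


@dataclass(frozen=True)
class Prefix:
    left: "Tree"     # what is applied, e.g. Name("puts")
    right: "Tree"    # what it is applied to


@dataclass(frozen=True)
class Postfix:
    left: "Tree"     # operand
    right: "Tree"    # trailing operator or unit


@dataclass(frozen=True)
class Block:
    child: "Tree"
    opening: str
    closing: str


Tree = Union[Integer, Real, Text, Name, Infix, Prefix, Postfix, Block]


def show(tree):
    """Render a tree back to text; one case per node kind."""
    if isinstance(tree, (Integer, Real)):
        return str(tree.value)
    if isinstance(tree, Text):
        return '"' + tree.value + '"'
    if isinstance(tree, Name):
        return tree.value
    if isinstance(tree, Infix):
        return f"{show(tree.left)} {tree.name} {show(tree.right)}"
    if isinstance(tree, (Prefix, Postfix)):
        return f"{show(tree.left)} {show(tree.right)}"
    if isinstance(tree, Block):
        return f"{tree.opening}{show(tree.child)}{tree.closing}"
    raise TypeError(f"not a tree node: {tree!r}")
```

With so few node kinds, any tool that walks the tree needs only a handful of cases, which is presumably what makes the representation easy to manipulate.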
To validate that, XLR has practically no built-in constructs. It has
constructs to connect to LLVM primitives, constructs to connect to C
code, and a pair of "rewrite" constructs, notably ->, to transform one
tree shape into another. For example :
    extern bool puts(text);
    (x:integer - y:integer):integer -> opcode Sub
    repeat 0, B -> true
    repeat N, B -> B; repeat N-1, B
    repeat 25,
        puts "Hello"
You can check the code generated for the above with xlr -tcode -O3
tests/09.Compiler/optimized-repeat-loop.xl. LLVM actually turns it
into a sequence of 25 calls to puts, you can hardly do better.
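To see why the two `repeat` rules above end up as 25 calls, it can help to apply them by hand. The Python sketch below models only the effect of the rules on a counter and a body. It is not how XLR or LLVM work internally; the function name `expand_repeat` and the use of plain strings for the body are assumptions for illustration.

```python
def expand_repeat(count, body):
    """Apply the two rewrite rules until only the base case is left.

    Rule 1:  repeat 0, B -> true
    Rule 2:  repeat N, B -> B; repeat N-1, B
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    sequence = []
    while count != 0:
        sequence.append(body)   # rule 2: emit B, continue with N-1
        count -= 1
    sequence.append("true")     # rule 1: repeat 0, B rewrites to true
    return sequence


if __name__ == "__main__":
    steps = expand_repeat(25, 'puts "Hello"')
    print(len(steps) - 1, "copies of the body, then", steps[-1])
```

Once the count is a compile-time constant, the recursion can be unrolled completely, which matches the straight sequence of calls the post reports.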
The most active area of research for XLR these days is its type
system. In order to generate efficient code, a Haskell-like type
inference mechanism is in place. But the standard type inference
algorithms must be extended, because there are a few additional
transformations compared to lambda calculus (not just "alpha" and
"beta"), and the closest there is to a type is the shape of a tree
(e.g. "if X then Y else Z").
Since it uses LLVM, it is also an interesting way to learn a little
about LLVM, but it's not intended as an LLVM tutorial.
So if you are interested in experimenting with "growing a language" in
a text-based framework, XLR is the right way to go. There are other
projects that are more advanced e.g. if you want to build the IDE at
the same time, see for example JetBrains' Meta Programming System. But
they are not as strong in language development per se, I believe.