From: "Pascal J. Bourguignon"
Newsgroups: comp.programming,comp.compilers,comp.editors
Subject: Re: New editor/Integrated Development Environment/compiler
Date: Tue, 12 Apr 2011 06:00:09 +0200
Organization: Informatimago
Message-ID: <11-04-028@comp.compilers>
References: <11-04-009@comp.compilers> <11-04-011@comp.compilers> <11-04-024@comp.compilers>
Keywords: editor, code

HiramEgl writes:

> I'm specially concerned about how the source-structure is stored. I
> don't want it to be stored in a file because is not flexible. I would
> like to have the source-structure stored in some binary format
> independent of user-languages or programming-languages, ideally.

It might be liberating to think only about objects (as in OO).
We may still have to use files to implement the persistence of these
objects (as mentioned in another answer, file systems are a kind of
database that has its advantages too), but you can ignore that for now,
since it should be considered an implementation detail.

(Objects could just as well be stored in an OO database, or mapped to a
relational database. Another quite interesting option, but
unfortunately purely theoretical for now, would be a persistent system
such as EROS http://www.eros-os.org/ ; to a first approximation, this is
what you already have with an image-based environment such as Lisp or
Smalltalk. The only difference from EROS is that in Lisp or Smalltalk
you have to remember to save the image yourself, while in EROS it's
done automatically.)

> With the help of some translation tables it would be possible to
> regenerate source code in a specific user-language or programming-language.
>
> For example, if the user types:
>
>   int Result;
>
> then the editor would have the capability of understanding that the
> first word is a variable type and the second word is a variable
> identifier. It would create a binary representation:
>
>   50 9899
>
> and update a translation table:
>
>   Id    English
>   -------------
>   50    int
>   9899  Result
>
> Afterwards, another user might update the translation table for other
> language:
>
>   Id    English  Español
>   -------------------------
>   50    int      entero
>   9899  Result   Resultado
>
> And regenerate the source code in another language:
>
>   entero Resultado;

There already exists a binary format to serialize and deserialize
syntactic trees. Each node of the syntactic tree is serialized as
follows (all bytes given in decimal):

    node       ::= 40 node-label { 32 child } 41 .
    node-label ::= identifier .

identifier is a sequence of non-space characters encoded in ASCII (most
special characters are also allowed, apart from a few exceptions).

    child ::= node | identifier | integer | floating-point | string .

integer is a sequence of digits encoded in ASCII, possibly preceded by
a sign. floating-point is likewise.
    string ::= 34 { character } 34 .

So for example:

    40 73 70 32 40 61 32 65 32 50 41 32 40 80 82 73 78 84 32 34 101 113
    117 97 108 34 41 32 40 68 69 67 70 32 65 41 41

represents the syntactic tree:

              if
            /  |  \
           /   |   \
          =  print  decf
         / \   |     |
        a   2 "equal" a

Of course, when you translate this tree to text, you can generate:

    SI a=2 ALORS AFFICHE "equal" SINON DECREMENTE a

or:

    if a=2 then begin write('equal'); end else dec(a);

or:

    if(a==2){printf("equal");}else{a--;}

depending on the preferences of the programmer. (An unsophisticated
programmer (what is usually, and disparagingly, called a "real
programmer") would edit the binary directly, which would give, with an
ASCII editor:

    (if (= a 2) (print "equal") (decf a))

[perhaps with the help of paredit though ;-)])

> All this comes from the frustration of having the structure of
> algorithms or designs trapped in a specific user-language or
> programming-language. Because, I think that a lot of knowledge is
> trapped in source code written in english. I would like to have a tool
> that would help me to reuse very easily the structure of algorithms
> and designs written in other applications.

But apart from some popular languages that aren't really different at
the semantic level (e.g. Pascal and C share mostly the same semantics),
in most cases the semantic differences are the real problem.

Let's take a simple example:

    (defun fact (x) (if (= 1 x) 1 (* x (fact (- x 1)))))

    int fact(int x){ return (1==x) ? 1 : x*fact(x-1);}

fact(17) in C will return -288522240 (on a typical platform where int
is 32 bits and the product wraps around), while (fact 17) in Lisp will
return 355687428096000.

And we're not even speaking of object systems, type systems, exception
handling, etc.

If the purpose is to reuse algorithms written in specific languages,
that means, 90% of the time, C or C++, and unfortunately, since those
are very low-level programming languages, the implementation of the
algorithms in these languages is mired in implementation details
entirely irrelevant to the algorithms. Filtering out those details is
not easy.
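
The two fact(17) values quoted above can be reproduced in C alone. What follows is only a minimal sketch, assuming a two's-complement machine; the names fact_mod32, fact32 and fact64 are made up for this illustration and do not come from any library. It uses unsigned arithmetic for the 32-bit case, because overflowing a signed int is undefined behaviour in C, whereas unsigned arithmetic is defined to wrap modulo 2^32. It also tests x <= 1 rather than 1 == x, so that 0 is handled.

```c
#include <assert.h>
#include <stdint.h>

/* Same recursion as the C fact above, but with well-defined wraparound:
   uint32_t arithmetic is reduced modulo 2^32. */
static uint32_t fact_mod32(uint32_t x)
{
    return (x <= 1) ? 1u : x * fact_mod32(x - 1);
}

/* Reinterpret the wrapped value as a signed 32-bit integer. The result of
   this conversion is implementation-defined before C23; on two's-complement
   machines it gives the value a 32-bit int version typically shows. */
static int32_t fact32(uint32_t x)
{
    return (int32_t)fact_mod32(x);
}

/* 64-bit version: exact up to 20!, since 21! no longer fits in 64 bits.
   The assert is a guard added for this sketch. */
static uint64_t fact64(unsigned x)
{
    assert(x <= 20);
    return (x <= 1) ? 1u : (uint64_t)x * fact64(x - 1);
}
```

Here fact32(17) gives -288522240 and fact64(17) gives 355687428096000, the two values quoted above. Note that the wider type only moves the limit (to 20!); it does not give the arbitrary-precision integers that Lisp provides.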

In practice, you can reuse an algorithm written in language X from
language Y by using what is called an FFI, a Foreign Function
Interface. When the languages X and Y share some "common language",
such as the way they manage memory, the representation of values of
various types, etc., it's easy enough. We can call C functions from a
Pascal program, and vice versa. We may also call Fortran functions
from C programs, but even this is already more delicate.

We can also do something even more complex, crossing the borders
between quite different environments, such as controlled environments
(Smalltalk, Python, Lisp, etc.) and uncontrolled environments such as
C, but the cost becomes higher, since each FFI function call involves
some value conversion.

And to return to the example above: using a library such as libecl
(http://ecls.sourceforge.net), you could call from C the function fact
written and compiled in Lisp, and get the correct result
355687428096000 in your C program. But the result comes back as a Lisp
integer, which in general does not fit in any native C type (this one
already needs more than 32 bits), so you cannot use it easily: you
cannot pass it to printf; basically, you cannot do anything with it in
C, and you have to call library functions in libecl to deal with it.

So even for a simple arithmetic algorithm, the languages are so
different that you cannot really cross the border; you may just play
puppets across the FFI. And if you want more fun, try to write a
subclass of a C++ class in Smalltalk, or vice versa.

> For example, I would like to drag-and-drop a quicksort algorithm into
> an application that later I could regenerate into "c" source-code in
> Spanish or ruby source-code in Swedish.
>
> I'm interested in transporting algorithms, designs, architectures across
> user-languages, programming-languages, platforms, etc.

IMO, the only way to do it is to hire programmers, or to implement
Strong AI, and once you have Strong AI, you don't need to translate
programs anymore (the AI does it itself).

--
__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.