Groups > comp.compilers > #2978 > unrolled thread
| Started by | Derek Jones <derek@NOSPAM-knosof.co.uk> |
|---|---|
| First post | 2022-04-25 00:00 +0100 |
| Last post | 2022-04-25 12:06 -0400 |
| Articles | 9 — 5 participants |
Programming language similarity Derek Jones <derek@NOSPAM-knosof.co.uk> - 2022-04-25 00:00 +0100
Re: Programming language similarity Derek Jones <derek@NOSPAM-knosof.co.uk> - 2022-04-25 08:59 +0100
Re: Programming language similarity Fernando <pronesto@gmail.com> - 2022-04-25 04:24 -0700
Re: Programming language similarity Derek Jones <derek@NOSPAM-knosof.co.uk> - 2022-04-25 19:35 +0100
Re: Programming language similarity Jan Ziak <0xe2.0x9a.0x9b@gmail.com> - 2022-04-25 06:00 -0700
Re: Programming language similarity Derek Jones <derek@NOSPAM-knosof.co.uk> - 2022-04-25 20:51 +0100
Re: Programming language similarity gah4 <gah4@u.washington.edu> - 2022-04-25 14:58 -0700
Re: Programming language similarity Derek Jones <derek@NOSPAM-knosof.co.uk> - 2022-04-26 00:50 +0100
Re: Programming language similarity Meshach Mitchell <meshach.mitchell@gmail.com> - 2022-04-25 12:06 -0400
| From | Derek Jones <derek@NOSPAM-knosof.co.uk> |
|---|---|
| Date | 2022-04-25 00:00 +0100 |
| Subject | Programming language similarity |
| Message-ID | <22-04-012@comp.compilers> |
All,

There has been remarkably little work that tries to measure programming language similarity.

Yes, there are many multi-language runtime benchmark comparisons, and people extract data from Wikipedia to make dubious claims.

Does anybody know of other kinds of attempts at measuring language similarity?

Here is one approach:
https://shape-of-code.com/2022/04/24/programming-language-similarity-based-on-their-traits/

[That seems awfully simplistic. Fortran and PL/I both have FORMAT statements that look superficially similar but the semantics are very different. -John]
| From | Derek Jones <derek@NOSPAM-knosof.co.uk> |
|---|---|
| Date | 2022-04-25 08:59 +0100 |
| Message-ID | <22-04-013@comp.compilers> |
| In reply to | #2978 |
John,

> https://shape-of-code.com/2022/04/24/programming-language-similarity-based-on-their-traits/
> [That seems awfully simplistic. Fortran and PL/I both have FORMAT statements that look
> superficially similar but the semantics are very different. -John]

Many keywords have different meanings, e.g., the do keyword in Fortran/C. Even binary operators differ, e.g., binary plus used for string concatenation.

The blog post uses a token-based approach, which does not require lots of time to gather the data. A semantics-based approach requires lots of head scratching.

I made a start by collecting information on function definitions (mostly forms of argument passing). The semantic traits I looked at tended to have a small number of characteristics, so some form of aggregation is needed to create significant differences.
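As a rough illustration of what a token-based comparison can look like, here is a minimal Python sketch that scores languages by the overlap of their keyword sets. The Jaccard index is an assumption chosen for illustration; the blog post's actual measure may differ. The keyword sets are small, deliberately incomplete samples, and the names `KEYWORDS` and `jaccard` are hypothetical.

```python
from itertools import combinations

# Small, deliberately incomplete keyword samples (illustration only).
KEYWORDS = {
    "C": {"if", "else", "while", "for", "do", "return", "struct"},
    "Java": {"if", "else", "while", "for", "do", "return", "class"},
    "Fortran": {"if", "else", "do", "return", "subroutine", "function"},
}


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    # Print the similarity of every pair of languages.
    for x, y in combinations(KEYWORDS, 2):
        print(f"{x:8} {y:8} {jaccard(KEYWORDS[x], KEYWORDS[y]):.2f}")
```

Note that the score only sees spelling: `do` counts as shared between C and Fortran even though the two constructs behave differently, which is the weakness raised above.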
| From | Fernando <pronesto@gmail.com> |
|---|---|
| Date | 2022-04-25 04:24 -0700 |
| Message-ID | <22-04-014@comp.compilers> |
| In reply to | #2978 |
Hi Derek,

Your repository is very nice! Can I use the "language info" part in the class on programming language paradigms? It will be nice to give students some idea about the number of keywords in different programming languages, for instance.

By the way, perhaps you should also consider comparing the languages with regard to the static and dynamic aspects of their type systems, e.g.: typing discipline (static, dynamic, gradual?), type verification (inference, annotations, mixed?), type enforcement (weak, strong), static type equivalence (nominal, structural, mixed?), etc. That might lead to very different trees. For instance, in your keyword tree, Java and JavaScript are close, but they are very different semantically.

> Does anybody know of other kinds of attempts at measuring language similarity?

About that: I don't know of other studies. There is the article on Wikipedia (Programming Languages Comparison), but it does not cite a paper with a comparative study.

Regards,

Fernando
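One way the type-system traits listed above could be turned into a distance is sketched below in Python: each language is a record of categorical traits, and the distance is the fraction of traits on which two languages differ (a normalised Hamming distance). This is only an assumption about how such a comparison might be set up. The languages `LangA`, `LangB` and `LangC` are placeholders with made-up trait values, not claims about real languages.

```python
TRAITS = ("discipline", "verification", "enforcement", "equivalence")

# Placeholder languages with made-up trait values (hypothetical data).
LANGS = {
    "LangA": {"discipline": "static", "verification": "annotations",
              "enforcement": "strong", "equivalence": "nominal"},
    "LangB": {"discipline": "dynamic", "verification": "none",
              "enforcement": "weak", "equivalence": "none"},
    "LangC": {"discipline": "static", "verification": "inference",
              "enforcement": "strong", "equivalence": "structural"},
}


def trait_distance(a, b):
    """Fraction of traits on which two languages differ (0.0 = identical)."""
    return sum(a[t] != b[t] for t in TRAITS) / len(TRAITS)


if __name__ == "__main__":
    names = sorted(LANGS)
    # Print a symmetric distance matrix, one row per language.
    for x in names:
        row = " ".join(f"{trait_distance(LANGS[x], LANGS[y]):.2f}" for y in names)
        print(f"{x}: {row}")
```

A matrix like this could then be fed to any hierarchical clustering routine to draw a tree. Weighting every trait equally is itself a design choice that would need justifying.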
| From | Derek Jones <derek@NOSPAM-knosof.co.uk> |
|---|---|
| Date | 2022-04-25 19:35 +0100 |
| Message-ID | <22-04-018@comp.compilers> |
| In reply to | #2980 |
Fernando,

> Your repository is very nice! Can I use the "language info" part in the class
> on programming language paradigms? It will be nice to give students some idea

Please do. The code is under a GPL license.

> about the number of keywords in different programming languages, for
> instance.

I was surprised by the diversity of words used.

> By the way, perhaps you should consider also comparing the languages with
> regards to the static and the dynamic aspects of their type systems, e.g.:
> typing discipline (static, dynamic, gradual?), type verification (inference,
> annotations, mixed?), type enforcement (weak, strong), static type equivalence
> (nominal, structural, mixed?), etc. That might lead to very different trees.

I looked into building a tree based on allowed implicit type conversions, with the hope of coming up with a measure of strong/weak typing.

A list of implicit conversions performed by a language seems like a good start. But this approach makes Fortran 77 look like it's strongly typed; there are fewer implicit conversions than in other languages because it supports fewer types, e.g., no enums or pointers.

C's relatively large number of integer types, and the corresponding implicit conversions, make it look weakly typed compared to languages with fewer integer types (and hence fewer implicit conversions).

The characteristics you list might be combined in some meaningful way, such that a type 'distance' tree could be constructed. Lots of careful reading of language specifications would be needed to figure out the details.

> About that: I don't know of other studies. There is the article on Wikipedia
> (Programming Languages Comparison), but it does not cite a paper with a
> comparative study.

Some of the Yes/No classifications on this page are somewhat surprising (at least to me):
https://en.wikipedia.org/wiki/Comparison_of_programming_languages
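The problem with counting implicit conversions can be seen in a toy Python sketch. The two "languages" below are hypothetical, and dividing by the number of possible ordered type pairs is an assumption added here for contrast, not something proposed in the thread.

```python
def conversion_stats(types, implicit):
    """Return (raw count, fraction of ordered type pairs converted implicitly)."""
    possible = len(types) * (len(types) - 1)  # ordered pairs of distinct types
    count = len(implicit)
    return count, (count / possible if possible else 0.0)


# Hypothetical language with two types that convert freely in both directions.
few_types = {"int", "real"}
few_conv = {("int", "real"), ("real", "int")}

# Hypothetical language with five types and only a few widening conversions.
many_types = {"char", "short", "int", "long", "float"}
many_conv = {("char", "short"), ("short", "int"), ("int", "long"), ("int", "float")}

if __name__ == "__main__":
    print(conversion_stats(few_types, few_conv))
    print(conversion_stats(many_types, many_conv))
```

By raw count the two-type language looks "stronger" (2 conversions against 4), yet it converts every pair it has, while the five-type language converts only a fifth of its pairs. Neither number is obviously the right measure, which is part of the difficulty described above.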
| From | Jan Ziak <0xe2.0x9a.0x9b@gmail.com> |
|---|---|
| Date | 2022-04-25 06:00 -0700 |
| Message-ID | <22-04-016@comp.compilers> |
| In reply to | #2978 |
On Monday, April 25, 2022 at 4:49:03 AM UTC+2, Derek Jones wrote:

> All,
>
> There has been remarkably little work that tries to measure
> programming language similarity.
>
> Yes, there are many multi-language runtime benchmark comparisons, and
> people extract data from Wikipedia to made dubious claims.
>
> Does anybody know of other kinds of attempts at measuring language
> similarity? ...

Just some "food for thought" on a conceptually similar topic:

Denis Roegel: A brief survey of 20th century logical notations (https://hal.inria.fr/hal-02340520/document)

-atom
| From | Derek Jones <derek@NOSPAM-knosof.co.uk> |
|---|---|
| Date | 2022-04-25 20:51 +0100 |
| Message-ID | <22-04-019@comp.compilers> |
| In reply to | #2982 |
Jan,

> Denis Roegel: A brief survey of 20th century logical notations (https://hal.inria.fr/hal-02340520/document)

This is an interesting collection of decisions made by authors over 120 years.

What makes somebody choose a particular set of symbols? My guess is that their past experience is a major factor, i.e., the use of symbols they had previously been exposed to.

Of course it could be something as mundane as the characters available on their typewriter, or on the printer of the journal the work was published in.

Then again, academics do love to do their own thing. Perhaps the decisions are based on the need to be different.
| From | gah4 <gah4@u.washington.edu> |
|---|---|
| Date | 2022-04-25 14:58 -0700 |
| Message-ID | <22-04-020@comp.compilers> |
| In reply to | #2985 |
On Monday, April 25, 2022 at 1:54:58 PM UTC-7, Derek Jones wrote:

(snip)

> What makes somebody choose a particular set of symbols.
> My guess is that their past experience is a major factor,
> i.e., the use of symbols they had previously been exposed to.

Early Fortran was limited by the number of characters available on the IBM 026 keypunch. They redefined some of the punch codes with different symbols for scientific use, as that was easier than designing a whole new machine.

Much of that was then fixed with EBCDIC in S/360, where an 8 bit code allowed, and pretty much required, that they be separated. In any case, the characters (with new punches) were kept. (And new compilers have an option to accept the old punch codes.)

I do remember punching ALGOL programs on the 026, where you had to use the multipunch key, along with big charts on the wall, to get the needed characters.

In any case, character set limitations stay with us long after the reason for the limitation has gone.
| From | Derek Jones <derek@NOSPAM-knosof.co.uk> |
|---|---|
| Date | 2022-04-26 00:50 +0100 |
| Message-ID | <22-04-022@comp.compilers> |
| In reply to | #2986 |
gah4,

> In any case, character set limitations stay with us long after
> the reason for the limitation has gone.

More than you probably wanted to know about character set history still being with us:
https://archive.org/details/mackenzie-coded-char-sets
| From | Meshach Mitchell <meshach.mitchell@gmail.com> |
|---|---|
| Date | 2022-04-25 12:06 -0400 |
| Message-ID | <22-04-017@comp.compilers> |
| In reply to | #2978 |
I could see how that could be interesting as an academic pursuit, but I think the dearth of exploration here is most likely because pretty much anyone in a position to do that already knows that every Turing-complete language is equivalent. The comparison, therefore, would be a comparison of placement of syntactic sugar. I have trouble visualizing a real-world use for such a comparison, by which I mean, what is the problem that I would be able to solve by knowing which languages are similar? In the current environment, anywhere you would work already has a whole tech stack mapped out.

I have actually thought about this, and vaguely remember looking up articles on the subject. The article you linked is interesting, but I agree with your analysis; semantic similarity has some value but IMO what really matters is "supported patterns", i.e., what a language provides "for free". Now, TINSTAAFL, so there is no real "free", but there is some optimization done by a language [compiler, interpreter] to support statements represented in the grammar.

An example that comes to mind is in JavaScript (I know, I *know*, but I have a family, and we need to eat.) Early implementations of async in JS used the *Promise* object to implement asynchronous execution, but newer versions of the language use *async* and *await* keywords. The former piggy-backs on the existing OO architecture, while the latter, implemented as keywords, is available to lower-level abstraction and optimization.

We've been doing this long enough that a number of "higher level" patterns have emerged. The aforementioned asynchronous (threaded, maybe?) execution is one. *Events* also come to mind, which are generally implemented as good old-fashioned polling under the hood or function registration and hash-lookup. What is actually happening in the machine translates to vastly different computation cost, and seems to me to be non-trivial.
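For readers unfamiliar with the "function registration and hash-lookup" style of event handling mentioned above, a minimal Python sketch follows. The class name `EventBus` and its methods `on` and `emit` are hypothetical and do not reflect any particular language runtime.

```python
from collections import defaultdict


class EventBus:
    """Toy event dispatcher: handlers are registered under an event name."""

    def __init__(self):
        # Hash table mapping event name -> list of registered handlers.
        self._handlers = defaultdict(list)

    def on(self, name, handler):
        """Register a handler function for the named event."""
        self._handlers[name].append(handler)

    def emit(self, name, *args):
        """Look up the handlers by name and call each one in registration order."""
        # .get avoids creating an empty entry for events nobody listens to.
        return [handler(*args) for handler in self._handlers.get(name, [])]


if __name__ == "__main__":
    bus = EventBus()
    bus.on("greet", lambda who: f"hello {who}")
    print(bus.emit("greet", "world"))
```

Here dispatch costs one hash lookup plus a call per handler. A polling implementation would instead check repeatedly for pending events, which is the kind of cost difference the post refers to.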
I think a meaningful categorization could be done based on this idea of language "provisions" over language semantics, and some deeper analysis of how exactly a language [compiler, interpreter] implements what necessarily boils down to syntactic sugar.

To answer your actual question, No, I don't know of other attempts, but I can understand the scarcity.

Hope my thoughts have some value.

-- Meshach Mitchell

On Sun, Apr 24, 2022 at 10:49 PM Derek Jones <derek@nospam-knosof.co.uk> wrote:

> All,
>
> There has been remarkably little work that tries to measure
> programming language similarity.
>
> Yes, there are many multi-language runtime benchmark comparisons, and
> people extract data from Wikipedia to made dubious claims.
>
> Does anybody know of other kinds of attempts at measuring language
> similarity?