Compiler implementation language preference ?

Started by: Michael Justice <nullcompiler@gmail.com>
First post: 2018-05-22 04:58 -0400
Last post: 2018-12-21 04:17 -0800
Articles: 12 (9 participants)


Contents

  Compiler implementation language preference ? Michael Justice <nullcompiler@gmail.com> - 2018-05-22 04:58 -0400
    Re: Compiler implementation language preference ? w.clodius@icloud.com (William Clodius) - 2018-05-23 10:14 -0600
    Re: Compiler implementation language preference ? Walter Banks <walter@bytecraft.com> - 2018-06-07 12:58 -0400
    Re: Compiler implementation language preference ? rockbrentwood@gmail.com - 2018-11-09 14:29 -0800
      Re: Compiler implementation language preference ? Kaz Kylheku <157-073-9834@kylheku.com> - 2018-11-10 04:00 +0000
      Re: Compiler implementation language preference ? Kaz Kylheku <157-073-9834@kylheku.com> - 2018-11-10 04:20 +0000
      Re: Compiler implementation language preference ? Richard <portempa@aon.at> - 2018-11-10 15:06 +0100
      Re: Compiler implementation language preference ? Walter Banks <walter@bytecraft.com> - 2018-11-10 21:46 -0500
    Re: Compiler implementation language preference ? Nick <ibeam2000@gmail.com> - 2018-11-13 02:14 -0800
    Re: Compiler implementation language preference ? Aaron Gray <aaronngray@gmail.com> - 2018-12-19 11:54 -0800
      Re: Compiler implementation language preference ? steve kargl <sgk@REMOVEtroutmask.apl.washington.edu> - 2018-12-19 23:19 +0000
      Re: Compiler implementation language preference ? Aaron Gray <aaronngray@gmail.com> - 2018-12-21 04:17 -0800

#2096 — Compiler implementation language preference ?

From: Michael Justice <nullcompiler@gmail.com>
Date: 2018-05-22 04:58 -0400
Subject: Compiler implementation language preference ?
Message-ID: <18-05-009@comp.compilers>

Is there any preference to writing a compiler in say c instead of say
java, fortran, basic etc? I ask cause i see many of the projects using
either c or c++ instead of other programming languages.

Sincerely,

nullCompiler
[Mostly people use what they're used to, or in languages that are easy
to bootstrap on the machines they want to use.  IBM's Fortran H
compiler was famously written in itself, but I wouldn't write a new
compiler in Fortran because it doesn't have great data structuring or
dynamic storage management.  (Yes, I know that Fortran 2008 is a lot
different from Fortran 66.) -John]



#2098

From: w.clodius@icloud.com (William Clodius)
Date: 2018-05-23 10:14 -0600
Message-ID: <18-05-011@comp.compilers>
In reply to: #2096

Michael Justice <nullcompiler@gmail.com> wrote:

> <snip>
> [Mostly people use what they're used to, or in languages that are easy
> to bootstrap on the machines they want to use.  IBM's Fortran H
> compiler was famously written in itself, but I wouldn't write a new
> compiler in Fortran because it doesn't have great data structuring or
> dynamic storage management.  (Yes, I know that Fortran 2008 is a lot
> different from Fortran 66.) -John]

With pointers and allocatable arrays I don't see Fortran lacking in
allocatable storage management. With derived types and inheritance I
don't see it lacking in data structuring compared to, say, ISO C.

Allocatable length character strings have largely eliminated the
weaknesses of its character type, though in practice it is probably
best to deal with Unicode encodings by mapping them to a standard
encoding using eight bit integers. The ISO_FORTRAN_ENV module allows
it to portably deal with multiple sized integers, so that such things
as UTF-8 files can be handled. Fortran compatible pre-processors
exist, though they are not as well known as the C pre-processor. What
it does lack, compared to ISO C, are type casts and unsigned
integers. What it lacks compared to C++ is templates. More importantly,
it has a user community focussed on numerics that is relatively
unfamiliar with the types of algorithms used in compilers, and is
relatively uninterested in such applications.



#2101

From: Walter Banks <walter@bytecraft.com>
Date: 2018-06-07 12:58 -0400
Message-ID: <18-06-002@comp.compilers>
In reply to: #2096

On 2018-05-22 4:58 AM, Michael Justice wrote:
> Is there any preference to writing a compiler in say c instead of say
> java, fortran, basic etc? I ask cause i see many of the projects using
> either c or c++ instead of other programming languages.
>
> Sincerely,
>
> nullCompiler
> [Mostly people use what they're used to, or in languages that are easy
> to bootstrap on the machines they want to use.  IBM's Fortran H
> compiler was famously written in itself, but I wouldn't write a new
> compiler in Fortran because it doesn't have great data structuring or
> dynamic storage management.  (Yes, I know that Fortran 2008 is a lot
> different from Fortran 66.) -John]

Most of our compilers (including C compilers) are written in Pascal.

There are two reasons. The strong type checking in the Pascal compiler I
use is an important part of development productivity.

The second reason is that Pascal has features that make it well matched
to implementing a compiler. Pascal's built-in support for strings,
sets, and booleans tends to be very useful and natural to use.  We
regularly use expert systems as part of our code creation process. Our
experience has been that they are easier to implement in Pascal than C.

The final point, which is a matter of personal preference, is the
scoping support for local functions.

w..
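
As an illustration of why built-in sets are handy in a compiler, here is a minimal C++ sketch that approximates Pascal's `set of` with `std::bitset`. It is not taken from the poster's compilers; the token kinds and the names `Tok`, `TokSet`, `makeSet`, `contains`, `kPrimaryStart`, and `kAddOps` are all hypothetical, chosen only for this example.

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical token kinds for a small expression grammar.
enum class Tok : std::size_t {
    Ident, Number, LParen, RParen, Plus, Minus, Star, End,
    Count  // number of token kinds; not a real token
};

// One bit per token kind, roughly what Pascal's "set of Tok" provides.
using TokSet = std::bitset<static_cast<std::size_t>(Tok::Count)>;

// Build a set from a list of token kinds, similar to Pascal's [a, b, c].
inline TokSet makeSet(std::initializer_list<Tok> toks) {
    TokSet s;
    for (Tok t : toks) s.set(static_cast<std::size_t>(t));
    return s;
}

// Membership test, similar to Pascal's "t in s".
inline bool contains(const TokSet& s, Tok t) {
    return s.test(static_cast<std::size_t>(t));
}

// Tokens that may start a primary expression, and the additive operators.
const TokSet kPrimaryStart = makeSet({Tok::Ident, Tok::Number, Tok::LParen});
const TokSet kAddOps = makeSet({Tok::Plus, Tok::Minus});
```

A recursive-descent parser could then write `contains(kPrimaryStart, lookahead)` where Pascal code would write `lookahead in PrimaryStart`, and combine sets with `|` and `&`. The helper functions are the extra ceremony that a language with native sets avoids.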



#2113

From: rockbrentwood@gmail.com
Date: 2018-11-09 14:29 -0800
Message-ID: <18-11-001@comp.compilers>
In reply to: #2096

On Tuesday, May 22, 2018 at 12:39:07 PM UTC-5, Michael Justice wrote:
> Is there any preference to writing a compiler in say c instead of say
> java, fortran, basic etc? I ask cause i see many of the projects using
> either c or c++ instead of other programming languages.
>
> nullCompiler
> [Mostly people use what they're used to, or in languages that are easy
> to bootstrap on the machines they want to use.

A test of whether the language, itself, is worth using -- assuming it is a
general purpose language -- is whether you'd be willing to write the compiler,
itself, in it! I put up a branched (and heavily recoded) version of cparse on my
machine, which is in C and has 3 layers of self-bootstrapping. GCC has several
layers of self-bootstrapping, depending on what you implement from it (and
distressingly, it has -- as of version 6 -- acquired *dependencies* on
libraries further upstream! That's a major no no!)

GnuBC has a (largely eliminable) layer of bootstrapping to compile its
predefined libraries into itself.

Knuth's TeX engine is built on top of the (context-sensitive) parser in Web
and/or cweb. The "tangle" and "weave" programs are the core that has to be
bootstrapped. Tangle is Web->Pascal (ctangle cweb->C); weave is Web->TeX,
cweave is cweb->TeX; (and all this is a setup for TeX.web, which has to be
compiled via Web).

Go is also self-built.

A notable gap is that Yacc is not self-compiled; thereby falling short of the
"is it worth using" test!

Code synthesis tools (indent, yacc to some degree, web) are difficult to do
with traditional parsers; since synthesis -- which is an application of the
field of "pragmatics" not "syntax"(!) -- means you have phrase structure
rules, but no start symbol! Instead, you process maximal parsable chunks; and
that is generally what requires a context-sensitive parser. That's because
the source language has macros (in the case of Web, at least, that's the
reason). Translators all fall into this class too, particularly if the
language has macros. Those have to be handled correctly ideally without
breaking open the black box into the translator output.

The self-compile trick could be extended to theorem provers, since proof
algebras themselves are ... algebraic formalisms. I put up a small part of
Lambek-Scott's higher-order categorical logic/type-theory formalism on top of
Prover9-Mace4 (with difficulty), for instance. A bigger challenge might be to
try to bootstrap compile Martin-Loef's type theory on top of Automath; since
it is a (self-admitted) descendant of Automath.

Fortran prakrits (to coin a phrase) could be bootstrapped on top of the old
Sanskrit Fortran (to coin another phrase) by a good compiler-writer like ...

> ...but I wouldn't write a new
> compiler in Fortran because it doesn't have great data structuring or
> dynamic storage management.  (Yes, I know that Fortran 2008 is a lot
> different from Fortran 66.) -John]

... John. (It's still an idea: extend the language you write it in ad hoc, and
use scripts to reduce it to Sanskrit in mid-process.)



#2115

From: Kaz Kylheku <157-073-9834@kylheku.com>
Date: 2018-11-10 04:00 +0000
Message-ID: <18-11-003@comp.compilers>
In reply to: #2113

On 2018-11-09, rockbrentwood@gmail.com <rockbrentwood@gmail.com> wrote:
> A notable gap is that Yacc is not self-compiled; thereby falling short of the
> "is it worth using" test!

GNU Bison's grammar is written in Yacc. Amusingly, it goes to town
with Bison extensions, so you need Bison to rebuild it:

http://git.savannah.gnu.org/cgit/bison.git/tree/src/parse-gram.y

So that the user doesn't have to, they keep the generated C in the repo.

And that's about how far anyone can reasonably go in using a Yacc to build
Yacc.

--
TXR Programming Language: http://nongnu.org/txr
Music DIY Mailing List:  http://www.kylheku.com/diy
ADA MP-1 Mailing List:   http://www.kylheku.com/mp1



#2116

From: Kaz Kylheku <157-073-9834@kylheku.com>
Date: 2018-11-10 04:20 +0000
Message-ID: <18-11-004@comp.compilers>
In reply to: #2113

On 2018-11-09, rockbrentwood@gmail.com <rockbrentwood@gmail.com> wrote:
> On Tuesday, May 22, 2018 at 12:39:07 PM UTC-5, Michael Justice wrote:
>> Is there any preference to writing a compiler in say c instead of say
>> java, fortran, basic etc? I ask cause i see many of the projects using
>> either c or c++ instead of other programming languages.
>>
>> nullCompiler
>> [Mostly people use what they're used to, or in languages that are easy
>> to bootstrap on the machines they want to use.
>
> A test of whether the language, itself, is worth using -- assuming it is a
> general purpose language -- is whether you'd be willing to write the compiler,
> itself, in it! I put up a branch (and heavily recoded) version of cparse on my
> machine, which is in C and has 3 layers of self-bootstrapping. GCC has several
> layers of self-bootstrapping, depending on what you implement from it (and
> distressingly, it has -- as of version 6 -- acquired *dependencies* on
> libraries further upstream! That's a major no no!)

A nice bootstrapping method is to build an interpreter for the language
also in some widely available language (like C). The compiler can be
executed by the interpreter to compile itself, plus any other run-time
support code also written in that language.

If the compiler produces that widely-used systems programming language,
then it can just be redistributed in compiled form and an interpreter
need not be included.

That's a tough way to evolve the language, though. An interpreter gives
you a version of the language that is immune to bootstrapping
chicken-egg problems and provides a reference model for what compiled
code should be doing.

You can always revert to the interpreter when things go horribly wrong.

When you make modifications to the compiler and they are so wrong they
break the compiler, you don't have to revert them. Just blow off all
the compiled materials, try to fix your work in the compiler and just
bootstrap from scratch through the stable interpreter. You never need
a last-known-good copy of the compiler in your workspace.
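
The "reference model" role of the interpreter can be sketched with a toy example. The C++ below is only an illustration under stated assumptions: the expression language, the stack-machine target, and the names `Expr`, `Instr`, `interpret`, `compile`, `run`, and `agrees` are all hypothetical and are not taken from TXR or any other real system.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical toy language: integer constants, addition, multiplication.
struct Expr {
    char op = 'n';                   // 'n' = number, '+' or '*' = binary operator
    long value = 0;                  // used when op == 'n'
    std::unique_ptr<Expr> lhs, rhs;  // used for binary operators
};

std::unique_ptr<Expr> num(long v) {
    auto e = std::make_unique<Expr>();
    e->value = v;
    return e;
}

std::unique_ptr<Expr> bin(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    auto e = std::make_unique<Expr>();
    e->op = op;
    e->lhs = std::move(l);
    e->rhs = std::move(r);
    return e;
}

// Reference model: evaluate the tree directly.
long interpret(const Expr& e) {
    if (e.op == 'n') return e.value;
    long l = interpret(*e.lhs), r = interpret(*e.rhs);
    return e.op == '+' ? l + r : l * r;
}

// "Compiled" form: 'p' pushes value; '+' and '*' pop two operands and push the result.
struct Instr { char op; long value; };

// Compiler: emit postfix stack code for the tree.
void compile(const Expr& e, std::vector<Instr>& out) {
    if (e.op == 'n') { out.push_back({'p', e.value}); return; }
    compile(*e.lhs, out);
    compile(*e.rhs, out);
    out.push_back({e.op, 0});
}

// Execute compiled code on a simple stack machine.
long run(const std::vector<Instr>& code) {
    std::vector<long> stack;
    for (const Instr& i : code) {
        if (i.op == 'p') { stack.push_back(i.value); continue; }
        long r = stack.back(); stack.pop_back();
        long l = stack.back(); stack.pop_back();
        stack.push_back(i.op == '+' ? l + r : l * r);
    }
    return stack.back();
}

// Differential check: compiled code must agree with the interpreter.
bool agrees(const Expr& e) {
    std::vector<Instr> code;
    compile(e, code);
    return run(code) == interpret(e);
}
```

If a change to `compile` breaks it, `agrees` starts failing while `interpret` keeps working, which is the fallback position the post describes.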



#2117

From: Richard <portempa@aon.at>
Date: 2018-11-10 15:06 +0100
Message-ID: <18-11-005@comp.compilers>
In reply to: #2113

On 09.11.18 23:29, rockbrentwood@gmail.com wrote:

> A test of whether the language, itself, is worth using -- assuming it is a
> general purpose language -- is whether you'd be willing to write the compiler,
> itself, in it!

This does not prove anything about applicability of the language for
anything other than writing a similar compiler.

Richard
[Good point.  Compilers use a variety of data structures and recursive algorithms
so if you can write a compiler, it's likely an adequate systems language.  On
the other hand, IBM Fortran H was written in itself which only made sense because
the alternative was assembler. -John]



#2119

From: Walter Banks <walter@bytecraft.com>
Date: 2018-11-10 21:46 -0500
Message-ID: <18-11-007@comp.compilers>
In reply to: #2113

On 2018-11-09 5:29 p.m., rockbrentwood@gmail.com wrote:

>
> A test of whether the language, itself, is worth using -- assuming it
> is a general purpose language -- is whether you'd be willing to write
> the compiler, itself, in it! I put up a branch (and heavily recoded)
> version of cparse on my machine, which is in C and has 3 layers of
> self-bootstrapping. GCC has several layers of self-bootstrapping,
> depending on what you implement from it (and distressingly, it has --
> as of version 6 -- acquired *dependencies* on libraries further
> upstream! That's a major no no!)
>
> GnuBC has a (largely eliminable) layer of bootstrapping to compile
> its predefined libraries into itself.
>
> Knuth's TeX engine is built on top of the (context-sensitive) parser
> in Web and/or cweb. The "tangle" and "weave" programs are the core
> that has to be bootstrapped. Tangle is Web->Pascal (ctangle cweb->C);
> weave is Web->TeX, cweave is cweb->TeX; (and all this is a setup for
> TeX.web, which has to be compiled via Web).
>
> Go is also self-built.

I would argue against that suitability test with the following simple
logic. My choice for implementation language is primarily the most
suitable language to implement the compiler.

Part of what you are suggesting is the compiler bootstrap process, which
is a very different process. Even for that I would argue against using
only the same language: it is one of several possible choices, including
cross compiling on another platform.

w..



#2120

From: Nick <ibeam2000@gmail.com>
Date: 2018-11-13 02:14 -0800
Message-ID: <18-11-008@comp.compilers>
In reply to: #2096

One man's opinion: Have a look at D.  https://dlang.org/ My approach
is to get the right answer first, then factor out the GC, then maybe
refactor more to port to C++.  Simple Java can be cut-and-paste ported
to D.



#2135

From: Aaron Gray <aaronngray@gmail.com>
Date: 2018-12-19 11:54 -0800
Message-ID: <18-12-007@comp.compilers>
In reply to: #2096

On Tuesday, 22 May 2018 18:39:07 UTC+1, Michael Justice  wrote:
> Is there any preference to writing a compiler in say c instead of say
> java, fortran, basic etc? ...

> [Mostly people use what they're used to, or in languages that are easy
> to bootstrap on the machines they want to use.  IBM's Fortran H
> compiler was famously written in itself, but I wouldn't write a new
> compiler in Fortran because it doesn't have great data structuring or
> dynamic storage management.  (Yes, I know that Fortran 2008 is a lot
> different from Fortran 66.) -John]

Pity there are no real compiler-compilers anymore, hint-hint, I am working on one to rule them all ;)

Aaron Gray
---
Independent Open Source Software Engineer, Computer Language Researcher, Information Theorist, and amateur computer scientist.
[Please don't say you've invented another UNCOL. -John]



#2136

From: steve kargl <sgk@REMOVEtroutmask.apl.washington.edu>
Date: 2018-12-19 23:19 +0000
Message-ID: <18-12-008@comp.compilers>
In reply to: #2135

Aaron Gray wrote:
> On Tuesday, 22 May 2018 18:39:07 UTC+1, Michael Justice  wrote:
>> Is there any preference to writing a compiler in say c instead of say
>> java, fortran, basic etc? ...
>
>> [Mostly people use what they're used to, or in languages that are easy
>> to bootstrap on the machines they want to use.  IBM's Fortran H
>> compiler was famously written in itself, but I wouldn't write a new
>> compiler in Fortran because it doesn't have great data structuring or
>> dynamic storage management.  (Yes, I know that Fortran 2008 is a lot
>> different from Fortran 66.) -John]

The latest Fortran standard is informally referred to as F2018.
It became the official standard a week or so ago.
https://wg5-fortran.org/f2018.html
[You're right, but I still wouldn't want to write a compiler in it.  That's
not what it's for.  -John]



#2138

From: Aaron Gray <aaronngray@gmail.com>
Date: 2018-12-21 04:17 -0800
Message-ID: <18-12-010@comp.compilers>
In reply to: #2135

On Wednesday, 19 December 2018 20:12:34 UTC, Aaron Gray  wrote:
> On Tuesday, 22 May 2018 18:39:07 UTC+1, Michael Justice  wrote:
> > [Mostly people use what they're used to, or in languages that are easy
> > to bootstrap on the machines they want to use. ...
>
> Pity there are no real compiler-compilers anymore, hint-hint, I am working
> on one to rule them all ;)
>
> Aaron Gray
> ---
> [Please don't say you've invented another UNCOL. -John]

John,

No I am not the man from UNCOL !

I am back working on my source to source compiler-compiler in the vein of YACC
but a real compiler-compiler not just a parser generator.

I am hopefully going to have all the main parser algorithms, some little-known
ones, and some new ones implemented. I have my Lexical Analyser Generator
LG implemented and am working on the Parser Generator PG and an AST generator
AG; there are a few more tools and components to this. I am using algorithms
that are much simpler, clearer, and cleaner than those in the existing Flex,
Bison, and Byacc. I have literally implemented the algorithms from the Dragon
Book and even simplified them a bit, along with an algorithm for equivalence
classes my friend invented, and am now working on the more complex "meta
machine" algorithms. Hopefully I will be able to parse all major languages.

I am working in C++ using nothing more complex than templates.  It is library
based with tools that use the library.

For example, I am using the Dragon Book's regular-expression-direct-to-DFA
technique. Here's an example of the code:

signed int DFA::GenerateRG2DFA(LexicalContext* context) {
    States states;
    // The start state is the set of positions in firstpos of the regex root.
    State startState = states.newState(context->firstpos());

    this->accept[startState] = -1;
    std::deque<State> UnfinishedStates;  // worklist of states still to be expanded

    UnfinishedStates.push_back(startState);

    while (!UnfinishedStates.empty()) {
        signed int accept = -1;
        State state = UnfinishedStates.front();
        UnfinishedStates.pop_front();
        State nextState;

        for (unsigned int input = 0; input < getNumberOfInputs(); ++input) {
            // Union of followpos(p) over the positions p in this state that move on `input`.
            bitset followpos(context->getNumberOfPositions());

            for (bitset::iterator position = state.positions.begin(),
                                  end = state.positions.end();
                 position != end; ++position) {
                if (position.isElement()) {
                    if (context->move(position, input))
                        followpos |= context->followpos(position);

                    // Keep the lowest-numbered action among this state's positions.
                    signed int action = context->getAction(position);
                    if (action != -1 && (accept == -1 || action < accept))
                        accept = action;
                }
            }

            if (!followpos.isEmpty()) {
                // Reuse the state for this position set, or create and queue a new one.
                if (!(nextState = states.findState(followpos)))
                    UnfinishedStates.push_back(nextState = states.newState(followpos));
            }
            else
                nextState = State::NullState;

            // The entry is negated when isTerminalState reports the target set as terminal.
            (*table)[state.index - 1][input] =
                (isTerminalState(context, followpos) ? -1 : 1) * nextState;
        } // end for inputs
        this->accept[state.index] = accept;
    } // end while (!UnfinishedStates.empty())

    return startState;
}

Happy Christmas,

Aaron
[Oh, that's entirely reasonable.  A lot of the cruft in lex and yacc
and their descendants dates from the era when everything had to fit into
64K on a PDP-11.  I've never seen any reason to use LALR rather than
LR(1) if you have room for the tables. -John]
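
For readers who do not have the Dragon Book to hand, the worklist construction used in the post can be sketched in self-contained C++. This is not the poster's library: it hard-codes the firstpos and followpos sets of the textbook's running example, the augmented regular expression (a|b)*abb#, instead of computing them from a syntax tree, and the names `PosSet`, `Dfa`, `buildDfa`, and `matches` are hypothetical.

```cpp
#include <deque>
#include <map>
#include <set>
#include <string>
#include <vector>

using PosSet = std::set<int>;

// Positions 1..6 of (a|b)*abb#; position 6 is the end marker '#'.
const std::map<int, char> kSymbolAt = {
    {1, 'a'}, {2, 'b'}, {3, 'a'}, {4, 'b'}, {5, 'b'}, {6, '#'}};
// followpos for each position, as given for the textbook example (assumed, not computed).
const std::map<int, PosSet> kFollowpos = {
    {1, PosSet{1, 2, 3}}, {2, PosSet{1, 2, 3}}, {3, PosSet{4}},
    {4, PosSet{5}},       {5, PosSet{6}},       {6, PosSet{}}};
const PosSet kFirstposRoot = {1, 2, 3};
const int kEndPos = 6;

struct Dfa {
    std::vector<PosSet> states;              // state i is the position set states[i]
    std::vector<std::map<char, int>> next;   // next[i][c] = index of the target state
    std::vector<bool> accepting;             // true if the state contains the end marker
};

Dfa buildDfa(const std::string& alphabet) {
    Dfa dfa;
    std::map<PosSet, int> index;   // position set -> state index
    std::deque<int> unfinished;    // states whose transitions are not built yet

    auto addState = [&](const PosSet& s) {
        int id = static_cast<int>(dfa.states.size());
        index[s] = id;
        dfa.states.push_back(s);
        dfa.next.emplace_back();
        dfa.accepting.push_back(s.count(kEndPos) > 0);
        unfinished.push_back(id);
        return id;
    };

    addState(kFirstposRoot);       // the start state is firstpos(root)
    while (!unfinished.empty()) {
        int cur = unfinished.front();
        unfinished.pop_front();
        for (char c : alphabet) {
            // Union of followpos(p) for every position p in cur labelled c.
            PosSet target;
            for (int p : dfa.states[cur]) {
                if (kSymbolAt.at(p) == c) {
                    const PosSet& f = kFollowpos.at(p);
                    target.insert(f.begin(), f.end());
                }
            }
            if (target.empty()) continue;   // no transition: implicit dead state
            auto it = index.find(target);
            int to = (it != index.end()) ? it->second : addState(target);
            dfa.next[cur][c] = to;
        }
    }
    return dfa;
}

// Run the DFA over input; a missing transition means rejection.
bool matches(const Dfa& dfa, const std::string& input) {
    int state = 0;
    for (char c : input) {
        auto it = dfa.next[state].find(c);
        if (it == dfa.next[state].end()) return false;
        state = it->second;
    }
    return dfa.accepting[state];
}
```

Calling `buildDfa("ab")` yields four states for this expression, and `matches` accepts exactly the strings over {a, b} that end in "abb". A real generator would compute nullable, firstpos, and followpos from the parsed expression and would use a denser set representation, as the posted code does with its bitset.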


