Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them?

From	Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Newsgroups	comp.compilers
Subject	Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them?
Date	2022-06-10 12:26 +0200
Organization	Compilers Central
Message-ID	<22-06-032@comp.compilers> (permalink)
References	<22-06-023@comp.compilers>

Show all headers | View raw

On 6/9/22 4:52 PM, Roger L Costello wrote:

> Those seem like compelling reasons for separating the lexical analysis from
> parsing, so why does ANTLR not do so; i.e., why does ANTLR combine them?

I cannot answer the ANTLR question but want to point out why IMO
(traditional) languages based on tokens should have a tokenizer in the
first place.

We encountered a problem with scannerless parsers and traditional
programming languages in Meta§. A tokenizer can use a couple of
whitespace or control characters or tokens as token separators. Without
such a rule it's very hard to flag all places in a grammar where
whitespace is *required* as a token separator.

As an example: in an "else if" sequence whitespace is required between
"else" and "if" while in other context e.g. "else{" whitespace is not
required. This problem may be solved by a regex, but why should a
grammar be inflated by adding a token termination clause to *each* keyword?

Fortran is another example where keyword recognition requires
backtracking or similar means due to ignorance of spaces e.g. in the
well known "DO10I = 1," snippet.

DoDi
[A separate lexer also makes it a lot easier to skip comments.  For Fortran
you had to do a prepass to see whether a statement was an assignment or something
else but after that you could tokenize without backtracking. -John]

Thread

The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? Roger L Costello <costello@mitre.org> - 2022-06-09 14:52 +0000
  Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? "Alexei A. Frounze" <alexfrunews@gmail.com> - 2022-06-09 18:07 -0700
  Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? gah4 <gah4@u.washington.edu> - 2022-06-09 23:01 -0700
  Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2022-06-10 12:26 +0200
  The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2022-06-11 23:45 +0300
  Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? George Neuner <gneuner2@comcast.net> - 2022-06-11 18:15 -0400
    Re: The dragon book says separating lexical analysis and parsing is beneficial, so why doesn't ANTLR separate them? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2022-06-12 14:10 +0000

csiph-web