From: Ev Drikos <drikosev@gmail.com>
Newsgroups: comp.compilers
Subject: Re: Has lexing and parsing theory advanced since the 1970's?
Date: Wed, 29 Sep 2021 05:07:52 -0700 (PDT)
Organization: Compilers Central
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-09-015@comp.compilers>
References: <21-09-008@comp.compilers>
Keywords: parse, history
Posted-Date: 29 Sep 2021 17:10:44 EDT
In-Reply-To: <21-09-008@comp.compilers>

On Thursday, September 16, 2021 at 7:56:25 PM UTC+3, Roger L Costello wrote:
>
> That said, Flex & Bison is old. Has lexing/parsing theory advanced since the
> 1970’s? If yes, are there parser generators available today which are based on
> those advances in lexing/parsing theory? Or does Flex & Bison still represent
> the state-of-the-art in terms of the underlying theory it uses?

Hello,

The routines that recognize tokens are still called scanners, and those
that parse the input are still called parsers. That said, this long
list may hint at the answer to your last question:
https://en.wikipedia.org/wiki/Comparison_of_parser_generators

I haven't used most of them, though, so I'll give you an example with
Syntaxis, a tool I wrote that isn't included in the above list.

If we try to parse this erroneous Fortran line with the command 'fcheck'
(binary available at https://github.com/drikosev/Fortran), we see that
the expected tokens in the error message include both '=' and a name:

program ? ; end

Note that the command 'fcheck' uses a deterministic parser (built by
Syntaxis) and the expected tokens in an error message are pre-computed.
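
As a rough illustration of what "pre-computed expected tokens" can mean,
here is a toy sketch in Python. It is not how Syntaxis or fcheck work
internally; the transition table, the state names, and the functions
classify() and check() are all hypothetical and cover only a tiny
made-up language. The point is only that the expected set for each
state is derived once from the table, before any input is read:

```python
import re

# Hypothetical transition table for a toy language (NOT the fcheck grammar).
# state -> {token kind -> next state}
TRANSITIONS = {
    "start":      {"PROGRAM": "after_word"},
    # After the word 'program' we may see a program name, or '=' if
    # 'program' is being used as an ordinary variable name.
    "after_word": {"NAME": "stmt_end", "=": "rhs"},
    "rhs":        {"NAME": "stmt_end"},
    "stmt_end":   {";": "body"},
    "body":       {"END": "done"},
    "done":       {},
}

# Pre-computed once, before parsing: the token kinds each state accepts.
EXPECTED = {state: sorted(row) for state, row in TRANSITIONS.items()}


def classify(lexeme):
    """Map a lexeme to a token kind (assumption: 2 keywords, names, punctuation)."""
    if lexeme.lower() in ("program", "end"):
        return lexeme.upper()
    if lexeme[0].isalpha() or lexeme[0] == "_":
        return "NAME"
    return lexeme  # punctuation stands for itself


def check(text):
    """Return 'ok' or an error message listing the pre-computed expected tokens."""
    state = "start"
    for lexeme in re.findall(r"[A-Za-z_]\w*|\S", text):
        row = TRANSITIONS[state]
        kind = classify(lexeme)
        if kind not in row:
            expected = ", ".join(EXPECTED[state]) or "end of input"
            return f"unexpected {lexeme!r}; expected one of: {expected}"
        state = row[kind]
    if state != "done":
        return "unexpected end of input; expected one of: " + ", ".join(EXPECTED[state])
    return "ok"


print(check("program ? ; end"))
print(check("program demo ; end"))
```

With this toy table, the error for the first line lists both '=' and
NAME, because both are keys of the row for the state reached after the
word 'program'.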

To my knowledge, the ability of a parser to shift two distinct terminals
simultaneously in one transition isn't an advance in theory, but I guess
several tools in the Wikipedia list above provide similar or better
goodies (e.g., Chris Clark described an ANTLR4 feature that advances
theory directly).

Regards,
Ev. Drikos