Simple Lexer and Simple Parser [ was RE: Flex is the most powerful lexical analysis language in the world. True or False? ]

From Roger L Costello <costello@mitre.org>
Newsgroups comp.compilers
Subject Simple Lexer and Simple Parser [ was RE: Flex is the most powerful lexical analysis language in the world. True or False? ]
Date 2022-05-08 13:34 +0000
Organization Compilers Central
Message-ID <22-05-022@comp.compilers>
References <22-05-003@comp.compilers> <22-05-007@comp.compilers> <22-05-009@comp.compilers> <22-05-018@comp.compilers>


Thank you again Chris. Terrific information.

Another question if I may. You wrote:

> And that goes to an important point.  Your lexer *should be* almost
> trivially simple (i.e. regular expressions only and not complicated
> ones).  You rarely want to solve problems at the lexical level.  You
> are much less likely to get good error reporting if you do.  In most
> cases, your parser should be simple also.

For a while now I have been (for fun) building a parser for XML
documents. I have experimented with making the lexer simple and with
making the parser simple. If I make the lexer simple, then the parser
is complex. If I make the lexer complex (using lots of start
conditions and making heavy use of Flex's start condition stack) then
the parser is simple. It doesn't seem possible to make both the lexer
and parser simple.

There are lots of "conditional rules" in XML. For example, in XML,
&amp; is an entity reference (amp is one of the predefined entities).
Since & is a reserved character, XML documents must use &amp; instead
of a literal &. An XML parser converts &amp; to &. However, in certain
contexts -- within a comment or within a CDATA section -- the &amp; is
not converted. Thus, there is conditional processing:

IF (&amp; is in a comment or in a CDATA section) THEN
    OUTPUT(&amp;)
ELSE
    OUTPUT(&)
END IF
Flex's states/stack mechanism is ideally suited for conditional
processing like this. From the section on Start Conditions in the Flex
manual: "flex provides a mechanism for conditionally activating
rules."
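
To illustrate the idea, here is a minimal sketch that emulates start
conditions in Python rather than in Flex itself, purely so it can be
run standalone. The state names (INITIAL, COMMENT, CDATA), the RULES
table, and the scan function are all hypothetical names chosen for
this example. It also takes the first matching rule in each state,
whereas Flex prefers the longest match, so treat it as an
approximation of the mechanism and not as what Flex generates.

```python
import re

# Hypothetical state names, loosely modeled on Flex start conditions.
INITIAL, COMMENT, CDATA = "INITIAL", "COMMENT", "CDATA"

# Each state has its own ordered rule list: (pattern, action, next_state).
# A next_state of None means "stay in the current state".
# Assumption: first matching rule wins (Flex would pick the longest match).
RULES = {
    INITIAL: [
        (re.compile(r"<!--"), "copy", COMMENT),
        (re.compile(r"<!\[CDATA\["), "copy", CDATA),
        (re.compile(r"&amp;"), "amp", None),
        (re.compile(r"(?s)."), "copy", None),   # catch-all: any one character
    ],
    COMMENT: [
        (re.compile(r"-->"), "copy", INITIAL),
        (re.compile(r"(?s)."), "copy", None),
    ],
    CDATA: [
        (re.compile(r"\]\]>"), "copy", INITIAL),
        (re.compile(r"(?s)."), "copy", None),
    ],
}


def scan(text):
    """Copy text to the output, converting &amp; to & only in INITIAL."""
    state = INITIAL
    pos = 0
    out = []
    while pos < len(text):
        # Only the rules of the active state are tried, which is the
        # "conditionally activating rules" part.
        for pattern, action, next_state in RULES[state]:
            m = pattern.match(text, pos)
            if m:
                out.append("&" if action == "amp" else m.group(0))
                pos = m.end()
                if next_state is not None:
                    state = next_state
                break
    return "".join(out)


if __name__ == "__main__":
    print(scan("a &amp; b <!-- c &amp; d --> <![CDATA[e &amp; f]]>"))
```

The &amp; rule exists only in the INITIAL rule list, so inside a
comment or a CDATA section there is simply nothing that could convert
it. That is the appeal of doing this in the lexer.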

So while it would be great to have a simple lexer, I am leaning
towards dealing with the conditional rules in XML using the Flex
states/stack mechanism rather than dealing with the conditional rules
in Bison. In other words, I am leaning towards a complex lexer.
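
For comparison, the other option (a simple lexer with the context
handled downstream) might be sketched roughly as follows. This is
Python rather than Flex/Bison, purely so it runs standalone; the token
names and the functions tokenize and resolve are hypothetical, and
resolve only stands in for the context tracking a Bison grammar would
have to carry.

```python
import re

# One flat token pattern with no lexer states (token names are hypothetical).
TOKEN = re.compile(
    r"(?P<COMMENT_OPEN><!--)|(?P<COMMENT_CLOSE>-->)"
    r"|(?P<CDATA_OPEN><!\[CDATA\[)|(?P<CDATA_CLOSE>\]\]>)"
    r"|(?P<AMP>&amp;)|(?P<CHAR>.)",
    re.DOTALL,
)


def tokenize(text):
    """Stateless lexer: yields (token kind, lexeme) pairs."""
    for m in TOKEN.finditer(text):
        yield m.lastgroup, m.group(0)


def resolve(tokens):
    """Consumer side: tracks comment/CDATA context and converts &amp;."""
    context = None  # None, "COMMENT_OPEN", or "CDATA_OPEN"
    out = []
    for kind, lexeme in tokens:
        if context is None and kind == "AMP":
            out.append("&")
            continue
        out.append(lexeme)
        if context is None and kind in ("COMMENT_OPEN", "CDATA_OPEN"):
            context = kind
        elif context == "COMMENT_OPEN" and kind == "COMMENT_CLOSE":
            context = None
        elif context == "CDATA_OPEN" and kind == "CDATA_CLOSE":
            context = None
    return "".join(out)


if __name__ == "__main__":
    print(resolve(tokenize("a &amp; b <!-- c &amp; d -->")))
```

Here the lexer is trivial, but every token that is special in one
context and ordinary in another has to be re-examined by the consumer,
which is where the complexity ends up.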

I am interested in hearing your thoughts on this.

> You don't need a flamethrower

My apologies. It wasn't my intent to throw a flame. But in hindsight I
can see that I should have worded things much better. I will do better
in the future.

/Roger
