


"Bootstrapping yacc in yacc" -> "Bootstrapping yacc in lex"!

From Rock Brentwood <rockbrentwood@gmail.com>
Newsgroups comp.compilers
Subject "Bootstrapping yacc in yacc" -> "Bootstrapping yacc in lex"!
Date Sun, 14 Mar 2021 17:54:23 -0700 (PDT)
Organization Compilers Central
Lines 36
Sender news@iecc.com
Approved comp.compilers@iecc.com
Message-ID <21-03-004@comp.compilers> (permalink)
Keywords yacc, comment
Posted-Date 14 Mar 2021 21:10:31 EDT
X-submission-address compilers@iecc.com
X-moderator-address compilers-request@iecc.com
X-FAQ-and-archives http://compilers.iecc.com
Xref csiph.com comp.compilers:2637



A recurring question in other forums is "can yacc be bootstrapped in
yacc?" Now I'm adding a twist.

I'll repeat one of my recent replies here. In the syntax for yacc files laid
out by the POSIX standard, there is no mandatory semicolon at the end of a
rule, so an extra token of look-ahead is required to determine whether an
identifier is followed by a colon. If it is, the identifier is the left-hand
side of a new rule.
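
A minimal sketch of that look-ahead, in Python rather than lex and not taken
from the POSIX text. The names `classify_identifiers`, `C_IDENT` and `IDENT`
are made up for illustration, and the sketch assumes only whitespace (no
comments) can sit between an identifier and its colon:

```python
import re

# Hypothetical token names: C_IDENT is an identifier that starts a new rule.
# Assumption: only whitespace may separate the identifier from the ':'
# (comments, literals and actions are left out of this sketch).
TOKEN = re.compile(r"""
    (?P<C_IDENT>[A-Za-z_.][A-Za-z_.0-9]*(?=\s*:))  # identifier with a colon ahead
  | (?P<IDENT>[A-Za-z_.][A-Za-z_.0-9]*)            # any other identifier
  | (?P<PUNCT>[:;|])                               # rule punctuation
  | (?P<SKIP>\s+)                                  # whitespace
""", re.VERBOSE)

def classify_identifiers(text):
    """Return (kind, lexeme) pairs for the text, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

In `expr : expr term` followed on the next line by `term : ID ;`, the first
`term` is an ordinary symbol and the second one opens a new rule; the only
thing that tells them apart is the colon that comes next.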

A grammar rule has the form

left-hand-side ":" stuff on the right optional ";"'s.

If you see a ":" in the middle of the stuff on the right, then you've actually
sneaked on over into the *next* rule.

Bison hacks the syntax by making left-hand-side + ":" into a single token.

It may, in fact, be possible to parse yacc *even with* this issue, without
having to hack the yacc grammar like bison did: simply don't use yacc at all,
and use only lex!

The grammar specified by POSIX is actually a *regular* grammar that specifies
a finite state transducer. The transducer is deterministic, precisely because
the identifier-colon ambiguity can be resolved with that one extra look-ahead.
Therefore, if you set up the right kind of finite state machine in lex, making
use of lex's start condition facility in the right way, then you should be
able to parse a yacc grammar with lex, and so bootstrap an implementation of
yacc with just lex.
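
To make the state-machine idea concrete, here is a rough sketch in Python
rather than lex. It is not the specification linked further down. The state
names stand in for lex start conditions, `parse_rules` and the state names are
invented for this example, and the input is assumed to be a stripped-down
rules section containing only identifiers, ":", "|" and ";" (no literals,
actions, %prec or comments):

```python
import re

# Assumption: only identifiers and the punctuation ':', ';', '|' matter;
# findall() silently skips everything else (whitespace in particular).
LEXEME = re.compile(r"[A-Za-z_.][A-Za-z_.0-9]*|[:;|]")

def parse_rules(text):
    """Return [(lhs, [alternative, ...])], each alternative a list of symbols."""
    rules = []
    state = "LHS"   # LHS, COLON, RHS, HELD: stand-ins for lex start conditions
    held = None     # identifier whose role (symbol or new lhs) is still unknown
    for tok in LEXEME.findall(text):
        is_name = tok not in (":", ";", "|")
        if state == "LHS":
            if is_name:
                held, state = tok, "COLON"
            elif tok != ";":                  # stray ';' between rules is tolerated
                raise SyntaxError(f"expected a rule name, got {tok!r}")
        elif state == "COLON":
            if tok != ":":
                raise SyntaxError(f"expected ':' after {held!r}, got {tok!r}")
            rules.append((held, [[]]))
            held, state = None, "RHS"
        elif state == "RHS":
            if is_name:
                held, state = tok, "HELD"     # hold it until the next token
            elif tok == "|":
                rules[-1][1].append([])
            elif tok == ";":
                state = "LHS"
            else:
                raise SyntaxError("':' with no rule name in front of it")
        else:  # HELD: the next token decides what the held identifier was
            if tok == ":":
                rules.append((held, [[]]))    # it was the lhs of the next rule
                held, state = None, "RHS"
            else:
                rules[-1][1][-1].append(held) # it was an ordinary rhs symbol
                if is_name:
                    held = tok
                elif tok == "|":
                    rules[-1][1].append([])
                    held, state = None, "RHS"
                else:                         # ';'
                    held, state = None, "LHS"
    if state == "COLON":
        raise SyntaxError(f"expected ':' after {held!r} at end of input")
    if state == "HELD":
        rules[-1][1][-1].append(held)
    return rules
```

The HELD state is the whole trick: an identifier on the right-hand side is
kept back for exactly one token, and a following ":" turns it into the start
of the next rule, with no semicolon needed in between.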

A specification possibly suitable for utilities like lex (using Unicode, in
UTF-8 format), working off the POSIX syntax, can be found in comp.theory
here:
https://groups.google.com/g/comp.theory/c/jSkl9ey7iM8

If comp.compilers can handle UTF-8, I can repeat it here as well.
[Yes, messages are in UTF-8. -John]


