Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.compilers > #189 > unrolled thread
| Started by | "Armel" <armelasselin@hotmail.com> |
|---|---|
| First post | 2011-07-05 01:02 +0200 |
| Last post | 2011-09-17 10:38 -0700 |
| Articles | 16 — 6 participants |
Back to article view | Back to comp.compilers
coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-07-05 01:02 +0200
Re: coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-07-07 10:28 +0200
coupling LALR with a scanner? "Karsten Nyblad" <uu3kw29sb7@snkmail.com> - 2011-07-07 10:46 +0200
Re: coupling LALR with a scanner? "Karsten Nyblad" <uu3kw29sb7@snkmail.com> - 2011-07-08 14:39 +0200
Re: coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-08-04 11:17 +0200
Re: coupling LALR with a scanner? Paul B Mann <paul@paulbmann.com> - 2011-09-13 13:38 -0700
Re: coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-09-16 10:47 +0200
Re: coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-09-19 13:52 +0200
Re: coupling LALR with a scanner? Paul B Mann <paul@paulbmann.com> - 2011-09-19 12:12 -0700
Re: coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-09-20 09:40 +0200
Re: coupling LALR with a scanner? Chris Dodd <cdodd@acm.org> - 2011-09-23 23:59 +0100
Re: coupling LALR with a scanner? Chris F Clark <cfc@shell01.TheWorld.com> - 2011-09-29 00:00 -0400
Re: coupling LALR with a scanner? "Armel" <armelasselin@hotmail.com> - 2011-10-02 16:41 +0200
Re: coupling LALR with a scanner? Chris F Clark <cfc@shell01.TheWorld.com> - 2011-10-03 11:59 -0400
Re: coupling LALR with a scanner? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-10-03 21:08 +0000
Re: coupling LALR with a scanner? Paul B Mann <paul@paulbmann.com> - 2011-09-17 10:38 -0700
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-07-05 01:02 +0200 |
| Subject | coupling LALR with a scanner? |
| Message-ID | <11-07-013@comp.compilers> |
Hi, I am exploring the possibility of choosing between several lexers depending on the current state of an LALR parser (i.e. if a state can accept productions such as A => A . 'a' b, B => . 'c', then the selected lexer will accept (at least) 'a' and 'c', for rightmost positions the lookahead symbols would be acceptable as well) Is there any literature about this? or example? I searched around for scannerless/lexerless parsing but could not make my mind. Regards Armel Asselin [The usual approach is to set flex start states in your yacc or bison parser. -John]
[toc] | [next] | [standalone]
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-07-07 10:28 +0200 |
| Message-ID | <11-07-015@comp.compilers> |
| In reply to | #189 |
>[The usual approach is to set flex start states in your yacc or bison >parser. >-John] the LALR generator is one of mine and the idea here would be to select automatically the right lexer from the currently accepted tokens. I have the feeling that it must be doable. The target is to be able to write grammars with such dependencies naturally without any (user level) grammar actions and very minimal lexer meta-information (such as which lexer produces which tokens). Armel [I think you will find that users hate a compiler where in each state the lexer only recognizes the tokens valid in that state, since the only error message it could produce is "invalid token." Also remember that the LA in LALR stands for Look Ahead, and the lexer is often one token ahead of the parser. -John]
[toc] | [prev] | [next] | [standalone]
| From | "Karsten Nyblad" <uu3kw29sb7@snkmail.com> |
|---|---|
| Date | 2011-07-07 10:46 +0200 |
| Message-ID | <11-07-016@comp.compilers> |
| In reply to | #191 |
>I am exploring the possibility of choosing between several lexers
>depending on the current state of an LALR parser (i.e. if a state can
>accept productions such as A => A . 'a' b, B => . 'c', then the
>selected lexer will accept (at least) 'a' and 'c', for rightmost
>positions the lookahead symbols would be acceptable as well)
The problem with using an LALR parser for that is that the state is not
enough information to know which token are accepted. An example:
A -> a B a
A -> b B b
B -> a
( A and B being non terminals and a and b being terminals.)
In an LALR parser you will get a state with just one item [ B -> a . ]
with a lookahead set of { a, b }. However, just one of a and b is
acceptable.
There are three ways to get around that problem:
First you can use an LR parser instead. Some parser generators
generate an LALR parser and use state splitting for solving conflicts
that come from LALR parsers being less powerful than LR parser. Such
generators can easily be modified to split states in this case too, in
practise generating an LR parser.
Secondly you can change the parser driver program, i.e., the program
that is generated by the parser generator. If it looks up information
in tables, it is easy to let the driver program, such that it finds out
which tokens are acceptable before asking the scanner for a token. You
test whether or not a token is acceptable by copying the parse stack,
place the token being tested in the window, and then parse until the
token is pushed on the stack or an error occurs. (Of course the first
indicates that the token is acceptable, and the second that it is not.)
Thirdly you can use a parser generator like Bison, that implements
general LR parsing. General LR parser generators work on multiple
stacks. You change the lexer such that it returns a set of tokens, a
token for each way the next tokens may be lexed. Say the lexed
returns three tokens. Then the driver program makes three copies of
the stacks, and parses on with each token being applied to its own
stack.
At last: You might get better answers if you are more detailed with
which problem you are trying to solve. I have been interested in such
techniques for parsing languages like PL/I where keywords may be used
as identifiers, Ada where a single quote may be a token itself or the
start of a character constant, and BCPL where line ends may end
statements. The best solution is not necessarily the same in all the
cases.
[toc] | [prev] | [next] | [standalone]
| From | "Karsten Nyblad" <uu3kw29sb7@snkmail.com> |
|---|---|
| Date | 2011-07-08 14:39 +0200 |
| Message-ID | <11-07-018@comp.compilers> |
| In reply to | #191 |
>the LALR generator is one of mine and the idea here would be to select >automatically the right lexer from the currently accepted tokens. I >have the feeling that it must be doable. The target is to be able to >write grammars with such dependencies naturally without any (user >level) grammar actions and very minimal lexer meta-information (such as >which lexer produces which tokens). You have not written if your generator supports LALR(k) or just LALR(1). In the later case my suggestion is that you change your generator to an LR(1) generator. Not only will you get an easy implementation of what you want. You will also catch syntax errors right after the offending token is put in the window, such that you can get better error reporting and recovery. Do not be afraid the parsers will be too big. That would have been the case 25 years ago, but not for the last 10 years. If you don't to go that way, then do as I suggested in my previous posting: Say you have a parse stack W and a terminal been pushed on the stack. Now you want to know if a terminal t is acceptable. Then you copy W onto a new stack W' and use W' to parse with t in the window. If t is stacked, then t is acceptable. My guess is that you will find this approach will make the parsers a bit slow. The later approach is so simple to implement, that I think somebody would have put it into some parser generator if it was a facility, that would scanning and parsing significantly easier. To me it looks like you have an idea for a solution with no problem.
[toc] | [prev] | [next] | [standalone]
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-08-04 11:17 +0200 |
| Message-ID | <11-08-004@comp.compilers> |
| In reply to | #194 |
"Karsten Nyblad" a icrit dans le message de groupe de discussion : > You have not written if your generator supports LALR(k) or just LALR(1). just LALR(1) or GLALR(1) >In the later case my suggestion is that you change your generator to an >LR(1) generator. I will soon do LR(1) or something in-between (e.g. IELR(1)) From what I could read in the work a Joel Denny about PSLR (pseudo scannerless LR) the approach is valid. It does not seem really hard to make something similar. Regards Armel
[toc] | [prev] | [next] | [standalone]
| From | Paul B Mann <paul@paulbmann.com> |
|---|---|
| Date | 2011-09-13 13:38 -0700 |
| Message-ID | <11-09-016@comp.compilers> |
| In reply to | #221 |
/* Test1 grammar (like the one above). */
G -> A <eof>
A -> a B c
-> a B d
B -> b
Test1 grammar is LR(0) and requires NO lookahead symbols.
By default, b always reduces to B.
/* Test2 grammar. */
G -> A <eof>
A -> a B1 c
-> a B2 d
B1 -> b
B2 -> b
Test2 grammar is LALR(1) and probably SLR(1) also. The parser
requires one symbol of lookahead to know which reduction to make (B1
or B2).
/* Test3 grammar. */
G -> A <eof>
A -> a B1 c
-> a B2 d
-> a B1 d
-> a B2 c
B1 -> b
B2 -> b
Test3 grammar is NOT LALR(1), but it is LR(1). The parser requires
one symbol of lookahead to know which reduction to make (B1 or B2).
Test3 is a highly contrived situation, not usually seen in real-world
computer languages (as far as I know). Test3 could be written as
Test2.
It seems that LALR(1) parsers are more powerful than some people
realize. LR(1) parser tables can be huge. A COBOL-85 grammar created
an LR(1) parser with over 2,000,000 states, whereas a IELR(1) parser
for that grammar created about 1668 states. Bison can create IELR(1)
parsers. HYACC can also create IELR(1) parsers.
I don't understand why you want to have a different scanner for each
state. The parser can easily make the decision, whether a token is
valid. In fact, an LALR parser already has this information in the
parser tables, so why make a simple situation complicated?
Also in case of a syntax error, the parser can tell the user what
tokens were expected at the error point. So there is no need for
different lexers, unless the syntax of the tokens changes from one
state to the next and this could be confusing to the the user of your
language.
Paul B Mann
[toc] | [prev] | [next] | [standalone]
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-09-16 10:47 +0200 |
| Message-ID | <11-09-017@comp.compilers> |
| In reply to | #266 |
>I don't understand why you want to have a different scanner for each >state. The parser can easily make the decision, whether a token is >valid. In fact, an LALR parser already has this information in the >parser tables, so why make a simple situation complicated? IELR was exactly made for that reason, as a first step to PSLR: some grammars have no 'tokens' and 'grammar rules', they just have a 'grammar' where mutually exclusive tokens are present, e.g. you cannot make a Javascript single lexer as there are state where / (slash) means 'start of regular expression' (of course the content of the regular expression follows totally different lexing rules than the rest of the text) whereas in other states it means 'division' operator. If your parser cannot tell which of the two lexers to use, you are off. > this could be confusing to the the user of your language. people like Perl and Javascript, so we have to make parsers for those languages :) by the way I'd be open to use IELR, but I have to read the paper again because it's far from being easy to understand and implement... Armel [This seems like an awfully complicated way to reinvent scanner start states. -John]
[toc] | [prev] | [next] | [standalone]
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-09-19 13:52 +0200 |
| Message-ID | <11-09-021@comp.compilers> |
| In reply to | #267 |
>[This seems like an awfully complicated way to reinvent scanner start
>states. -John]
In Fact, This Is The Same. there is a big difference in usage with start
states guided through Bison/Yacc explicit actions: the scanner knows by
itself which start state to use AND can tell you up front that what you
expect to do is impossible. when two mutually exclusive tokens happen in the
same parser state e.g.
A => expr "/" expr
| expr REGEXPR expr // imagining its possible in your language
B => expr "/" expr
expr => number | REGEXPR
REGEXPR = /[^/]*/
In case A the parser generator would issue by itself an error, in case B, it
would set automatically the lexer start states appropriately.
In both case it has a tremendous advantage: any developer can enter the
grammar, compile and have something working (or know why it won't work),
s/he does not need any more to be a scanner/parser guru.
By the way, nothing prevents start states with exclusive tokens from
being stuffed with all the other non-problematic tokens, so as to give
meaningful messages when an unexpected token happens.
Regards
Armel
[toc] | [prev] | [next] | [standalone]
| From | Paul B Mann <paul@paulbmann.com> |
|---|---|
| Date | 2011-09-19 12:12 -0700 |
| Message-ID | <11-09-022@comp.compilers> |
| In reply to | #267 |
> IELR was exactly made for that reason, as a first step to PSLR: some
> grammars have no 'tokens' and 'grammar rules', they just have a
> 'grammar' where mutually exclusive tokens are present, e.g. you
> cannot make a Javascript single lexer as there are state where /
> (slash) means 'start of regular expression' (of course the content
> of the regular expression follows totally different lexing rules
> than the rest of the text) whereas in other states it means
> 'division' operator. If your parser cannot tell which of the two
> lexers to use, you are off.
IELR(1) does not solve this problem. It solves the problem of reduce-
reduce conflicts in an LR(0) state machine where a state has multiple
"lookback nonterminal transitions" which cause conflicts. IELR(1)
does state splitting (or duplicating) in order to remove the conflicts
(if possible). It has nothing to do with how many scanners you may
need. It only pertains to the language defined by the grammar. It
looks like you have two languages or two lexical languages (and need
two scanners).
To show that IELR(1) solves this problem you would have to have an
LALR(1) grammar that has reduce-reduce conflicts caused by the need of
two different lexers (scanners) and show that IELR(1) removes the
conflicts.
What you described with Javascript is a little language inside of
another language. Why not switch parsers when a slash ('/') is
encountered if necessary?
Parsing expression grammars are more appropriate for this Javascript
problem, if you don't want to hand-code a "fix" within an LALR parser.
A similar situation occurs when parsing C. You encounter a "#if ..."
while parsing C. At the '#' you must leave the C parser and enter the
C- preprocessor parser. After the conditional statement has been
parsed, you must switch back to the C parser.
/ Paul Mann
[toc] | [prev] | [next] | [standalone]
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-09-20 09:40 +0200 |
| Message-ID | <11-09-023@comp.compilers> |
| In reply to | #272 |
"Paul B Mann" a icrit dans le message de groupe de discussion :
>> IELR was exactly made for that reason, as a first step to PSLR: some
>IELR(1) does not solve this problem. It solves the problem of reduce-
>reduce conflicts in an LR(0) [...]
OK I did not sentenced that right, this is PSLR which does all the
lexer/parser integration (hence Pseudo Scannerless LR).
>What you described with Javascript is a little language inside of
>another language. Why not switch parsers when a slash ('/') is
>encountered if necessary?
The problem is that / is used as a normal token AND as the start for regular
expression.
For example these are two valid sequences:
var regexpr = / b + a /
2;
var numeric_var = 1 / b + a /
2;
The 2; on the following lines will have two different meanings, in case
'regexpr', it is an instruction setting the expression value (as used in
eval() ), in the case second case, 2 is the second operand to a / operator.
I insist a bit on this to show there is no 'right side' heuristic to
determine / b + a / is a regular expression. On the other hand the '1'
expression could be as complex as you can imagine.
Of course in the presence of something like expr / expr the parser knows the
/ is for a division, and if the / is found where expr is expected it knows
there is a regular expression coming.
I don't see how I could hack this into a scanner only. maybe some specific
preceding tokens. I have pass again some time on that question.
Armel
[toc] | [prev] | [next] | [standalone]
| From | Chris Dodd <cdodd@acm.org> |
|---|---|
| Date | 2011-09-23 23:59 +0100 |
| Message-ID | <11-09-029@comp.compilers> |
| In reply to | #273 |
"Armel" <armelasselin@hotmail.com> wrote in news:11-09-023@comp.compilers:
> Of course in the presence of something like expr / expr the parser knows
> the / is for a division, and if the / is found where expr is expected it
> knows there is a regular expression coming.
> I don't see how I could hack this into a scanner only. maybe some
> specific preceding tokens. I have pass again some time on that question.
The usual way to do this in a flex+bison parser is with a production like:
expr: { beginREGEXP(); } '/' regexp { endREGEXP(); } '/'
where beginREGEXP and endREGEXP are functions in the flex file that set
the start-state into and out of regexp mode. Note that the actions need
to be BEFORE the parser shifts the '/' token, as an LALR(1) parser may look
ahead one token. Which means that a '/' must be a '/' regardless of the
scanner state.
Note that if you allow additional lookahead in the parser (LR(k) or LR(*) or
any kind of backtracking) this will break badly.
Chris Dodd
cdodd@acm.org
[toc] | [prev] | [next] | [standalone]
| From | Chris F Clark <cfc@shell01.TheWorld.com> |
|---|---|
| Date | 2011-09-29 00:00 -0400 |
| Message-ID | <11-10-003@comp.compilers> |
| In reply to | #273 |
I have come to this discussion a little late and may have missed some of the conversation, so excuse me if I say some things which are obvious and have been discussed before. There are classes of languages where you can use the "parser" context to sufficiently disambiguate which characters should be grouped into tokens. Thus, the class of scannerless LR parsers make sense and there are numerous papers discussing them. It does in general make the front end writer's life simpler in that the programmer doesn't have to manually distinguish the lexical grammar from the parsing grammar. Moreover, several conventions like automatically dealing with whitespace and comments can make this set much larger and much more useful. It might even be possible to argue that the scannerless technology might result in better error messages because a lexical error is promoted to a parsing error and more context can be used to display the correct possibilites (if one assumes that the parser can generate intelligible error messages to begin with, which is often a dubious claim for most automated parsing systems). However, and here I think we come to the crux of the discussion, can a scannerless system disambiguate the potentially ambiguous use of / as a division operator and the start of a regular expression? I would argue that if the system is LR(k) for some finite k and the use of regular expressions is not restricted to some narrow (left, preceding) context (e.g. as in Perl where regular expressions are part of only specific operations like m/regex/ and ~=/regex/ (forgive me if my Perl is a bit rusty)), then the answer is no. The reason why not is simple, at some point one has to decide whether to tokenize a / as a division operator or to use it as introducing a regular expression. With an LR(k) parser, one must make the decision after at most k tokens, but a regular expression can be arbitrarily long, thus one can "easily" formulate a regular expression which is ambiguous with only k tokens of lookahead. Note, I'm positing that tokenization is equivalent to a reduction. However, I think you will find that that is generally true. Note a scannerless system based upon a more general language class (e.g. Earley parsing) could use an arbitrary amount of right, succeeding context to disambiguate the tokenization and avoid this problem. However, the runtime cost of doing so could easily become prohibitive. Alternately, one could design the language to restrict regular expressions to certain contexts, as I believe Perl does, and thus make the cases where one is looking for them unambiguous or atleast much simpler. (To me this would be the hallmark of a good language design, as ambiguity often confuses the programmer as well as the compilers, but that is a digression.) If the language design has the characteristics that regular expressions are restricted to certain well-defined contexts, the ones where lexer start states in a grammar would work, then there is no specific reason why a scannerless system could not find and insert those start states automatically. In theory, then a non-compiler writing guru could use such a tool to work on the grammar without being aware of what is going on under the covers. However, my experience suggests the result will be far less sanguine. Not only would such a tool that further abstracts the grammar writer from the details of the process not really enable non-gurus to successfully write grammars for such subtle cases, but it will lull them into falsely believing that they can write grammars without learning the underlying theory and then when the tool produces an error message they will be so much more confused as to what works and what doesn't that the result will become even more of a black-art with magic incantations and spells that are recited without understanding on the belief that such superstitions enable one to write correct grammars. Despite having written compilers for ~40 years now, I still find myself falling back on overly simplified interpretations on why certain things work and others don't. To wit the hand-waving explanation I offered for context sensitivity above. Although my intuition about context sensitivity is slightly more subtle than what I expressed, it is hard to explain rigorously or precisely. Of course, I don't expect my annecdotal evidence to be convincing so I submit how well C++ template errors are handled. How many people do you know who can write and really understand the creation of non-trivial C++ templates? I would argue that grammar writing is of roughly the same complexity as template writing. Note, I think this also explains the popularity of LL parsing, as the theory of what works in LL is much simpler and manageable because it is so much more restrictive than LR parsing while still being adequate (once one accepts the if-then-else trick) to handle most if not all programming languages. It does take more effort to write such grammars, but novices can quickly write simple grammars in them. In contrast, many users simply give up when trying to resolve reduce- reduce conflicts (or even shift-reduce ones). That is not to say that scannerless parser generators are a bad idea. It is simply to disabuse the notion that it will instantly enable less skilled writers to more successfully develop subtle and complicated grammars, such as those needed to handle ambiguous tokenizations. Over time as we get experienced using them, possibly yes, but out-of-the-box no. Moreover, I also think that understanding the value of a separate lexer and parser is a good thing. The implementation gap from divorcing those two aspects actually has utility. In fact, as I have been studying "high speed regular expression processing", I have determined that even within a regular expression processor (e.g. a lexer) there is an argument for introducing such a gap to "segment" tokens into smaller chunks for both performance and expressibility reasons. At the same time, I think that the segmentation process can be automated, at least in many cases. Hope this helps, -Chris ****************************************************************************** Chris Clark email: christopher.f.clark@compiler-resources.com Compiler Resources, Inc. Web Site: http://world.std.com/~compres 23 Bailey Rd voice: (508) 435-5016 Berlin, MA 01503 USA twitter: @intel_chris
[toc] | [prev] | [next] | [standalone]
| From | "Armel" <armelasselin@hotmail.com> |
|---|---|
| Date | 2011-10-02 16:41 +0200 |
| Message-ID | <11-10-005@comp.compilers> |
| In reply to | #285 |
> there is an argument for introducing such a gap to "segment" > tokens into smaller chunks for both performance and expressibility > reasons. could you elaborate on this segmentation mechanism? In my lexer generator, the developer can introduce start states by himself and 'cut' complex expressions into smaller expressions which still respect AFD capabilities and introduce dynamic regular expressions where absolutely necessary, for languages allowing dynamic string delimiters for example (e.g. doc-strings like << END_OF_STR_MARKER, some lines then a line with END_OF_STR_MARKER only on the line), languages such as ruby are very funny from that point of view if I remember well. Regards Armel
[toc] | [prev] | [next] | [standalone]
| From | Chris F Clark <cfc@shell01.TheWorld.com> |
|---|---|
| Date | 2011-10-03 11:59 -0400 |
| Message-ID | <11-10-006@comp.compilers> |
| In reply to | #287 |
"Armel" <armelasselin@hotmail.com> writes: >> there is an argument for introducing such a gap to "segment" >> tokens into smaller chunks for both performance and expressibility >> reasons. > > could you elaborate on this segmentation mechanism? I would like to, but the idea is going through the patent process, as it has hardware design implications. Once I know that the patent has been filed, or get other ok from Intel legal, I will write the idea up. > In my lexer generator, the developer can introduce start states by himself > and 'cut' complex expressions into smaller expressions which still respect > AFD capabilities and introduce dynamic regular expressions where absolutely > necessary, for languages allowing dynamic string delimiters for example > (e.g. doc-strings like << END_OF_STR_MARKER, some lines then a line with > END_OF_STR_MARKER only on the line), languages such as ruby are very funny > from that point of view if I remember well. Those sound like useful extensions. In the hands of an experienced parser writer, that could make the process much easier. Let the system figure out what it can, then adjust where you must.
[toc] | [prev] | [next] | [standalone]
| From | glen herrmannsfeldt <gah@ugcs.caltech.edu> |
|---|---|
| Date | 2011-10-03 21:08 +0000 |
| Message-ID | <11-10-008@comp.compilers> |
| In reply to | #285 |
Chris F Clark <cfc@shell01.theworld.com> wrote:
> I have come to this discussion a little late and may have missed some
> of the conversation, so excuse me if I say some things which are
> obvious and have been discussed before.
(snip)
> However, and here I think we come to the crux of the discussion, can a
> scannerless system disambiguate the potentially ambiguous use of / as
> a division operator and the start of a regular expression? I would
> argue that if the system is LR(k) for some finite k and the use of
> regular expressions is not restricted to some narrow (left, preceding)
> context (e.g. as in Perl where regular expressions are part of only
> specific operations like m/regex/ and ~=/regex/ (forgive me if my Perl
> is a bit rusty)), then the answer is no.
I could be completely missing the point, but it seems to me that you
don't have to restrict such RE to a narrow context as long as the
context isn't one where a division operator is allowed.
Division usually requires a preceding operand. In the contexts I would
expect a regular expression it would be either preceded by an operator
or be the beginning of an expression.
In writing AWK programs, where there are functions that take RE as
operands, I normally put then in quotes (") instead of slashes, as it
seems more obvious in the context. (A variable with the appropriate
value also works.)
-- glen
[toc] | [prev] | [next] | [standalone]
| From | Paul B Mann <paul@paulbmann.com> |
|---|---|
| Date | 2011-09-17 10:38 -0700 |
| Message-ID | <11-09-019@comp.compilers> |
| In reply to | #266 |
> /* Test3 grammar. */
>
> G -> A <eof>
> A -> a B1 c
> -> a B2 d
> -> a B1 d
> -> a B2 c
> B1 -> b
> B2 -> b
>
I made a mistake, sorry. This grammar should be:
/* Test3 grammar. */
G -> A <eof>
A -> a B1 c
-> a B2 d
-> x B1 d
-> x B2 c
B1 -> b
B2 -> b
Paul B Mann
[toc] | [prev] | [standalone]
Back to top | Article view | comp.compilers
csiph-web