Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]
Groups > comp.compilers > #343 > unrolled thread
| Started by | Borneq <a.moderacja@gmail.com> |
|---|---|
| First post | 2011-11-20 08:48 -0800 |
| Last post | 2011-12-07 06:29 -0800 |
| Articles | 14 — 7 participants |
Back to article view | Back to comp.compilers
How detect cycle in grammar ? Borneq <a.moderacja@gmail.com> - 2011-11-20 08:48 -0800
Re: How detect cycle in grammar ? Hans Aberg <haberg-news@telia.com> - 2011-11-21 18:14 +0100
Re: How detect cycle in grammar ? Gene <gene.ressler@gmail.com> - 2011-11-21 10:20 -0800
Re: How detect cycle in grammar ? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-11-22 15:20 +0000
Re: How detect cycle in grammar ? Quinn Tyler Jackson <quinn_jackson2004@yahoo.ca> - 2011-11-25 22:28 -0800
Re: How detect cycle in grammar ? Gene <gene.ressler@gmail.com> - 2011-11-27 10:18 -0800
Re: How detect cycle in grammar ? Borneq <a.moderacja@gmail.com> - 2011-11-23 12:56 -0800
Re: How detect cycle in grammar ? Borneq <a.moderacja@gmail.com> - 2011-11-24 09:24 -0800
Re: How detect cycle in grammar ? Gene <gene.ressler@gmail.com> - 2011-11-27 10:27 -0800
Re: How detect cycle in grammar ? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2011-11-28 16:46 +0000
Re: How detect cycle in grammar ? glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2011-11-29 07:31 +0000
Re: How detect cycle in grammar ? Paul B Mann <paul@paulbmann.com> - 2011-12-01 02:46 -0800
Re: How detect cycle in grammar ? Quinn Tyler Jackson <quinn_jackson2004@yahoo.ca> - 2011-12-02 09:20 -0800
Re: How detect cycle in grammar ? Gene <gene.ressler@gmail.com> - 2011-12-07 06:29 -0800
| From | Borneq <a.moderacja@gmail.com> |
|---|---|
| Date | 2011-11-20 08:48 -0800 |
| Subject | How detect cycle in grammar ? |
| Message-ID | <11-11-041@comp.compilers> |
A->B B->B This grammar is not correct, B is looped. A->B B->C C->A This another grammar, cycle can be arbitrarily long. Is cycle when First(Nonterminal) not contain any terminal, even not epsilon? A->B B-> (=B->epsilon) in first A and B is epsilon
[toc] | [next] | [standalone]
| From | Hans Aberg <haberg-news@telia.com> |
|---|---|
| Date | 2011-11-21 18:14 +0100 |
| Message-ID | <11-11-043@comp.compilers> |
| In reply to | #343 |
On 2011/11/20 17:48, Borneq wrote: > A->B > B->B > This grammar is not correct, B is looped. > > A->B > B->C > C->A > This another grammar, cycle can be arbitrarily long. > Is cycle when First(Nonterminal) not contain any terminal, even not > epsilon? If it is only cycles in the graph you are out for, perhaps some algorithm mentioned here might help: https://en.wikipedia.org/wiki/Strongly_connected_component https://en.wikipedia.org/wiki/Cycle_detection Hans
[toc] | [prev] | [next] | [standalone]
| From | Gene <gene.ressler@gmail.com> |
|---|---|
| Date | 2011-11-21 10:20 -0800 |
| Message-ID | <11-11-045@comp.compilers> |
| In reply to | #343 |
On Nov 20, 5:48 pm, Borneq <a.modera...@gmail.com> wrote: > A->B > B->B > This grammar is not correct, B is looped. > > A->B > B->C > C->A > This another grammar, cycle can be arbitrarily long. > Is cycle when First(Nonterminal) not contain any terminal, even not > epsilon? > > A->B > B-> (=B->epsilon) > in first A and B is epsilon Cycles aren't the problem. You'll need cycles to represent lists and nestings. Nonterminals that can never derive a terminal string are the problem. You should look up useless nonterminals in the Dragon Book or similar reference. This includes the algorithm for removing them from grammars. As I recall the algorithm starts with the lhs's of rules that expand entirely to terminals (including epsilon) and recursively marks nonterminals that have a rule where the entirely right hand side is either marked or terminal. When you're done marking in this manner, the unmarked nonterminals are useless. All rules involving them can be deleted without changing the represented language. If the start symbol is unmarked, the language is empty.
[toc] | [prev] | [next] | [standalone]
| From | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
|---|---|
| Date | 2011-11-22 15:20 +0000 |
| Message-ID | <11-11-046@comp.compilers> |
| In reply to | #347 |
Gene <gene.ressler@gmail.com> writes: >Nonterminals that can never derive a terminal string are the >problem. Is it really? Since they cannot derive a terminal, they have no influence on the language described by the grammar. They might just as well not be there. Are they really a problem (except for certain implementation techniques)? - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/
[toc] | [prev] | [next] | [standalone]
| From | Quinn Tyler Jackson <quinn_jackson2004@yahoo.ca> |
|---|---|
| Date | 2011-11-25 22:28 -0800 |
| Message-ID | <11-11-052@comp.compilers> |
| In reply to | #348 |
On Tue, Nov 22, 2011 at 7:20 AM, Anton Ertl
<anton@mips.complang.tuwien.ac.at> wrote:
>
> Gene <gene.ressler@gmail.com> writes:
> >Nonterminals that can never derive a terminal string are the
> >problem.
>
> Is it really? Since they cannot derive a terminal, they have no
> influence on the language described by the grammar. They might just
> as well not be there. Are they really a problem (except for certain
> implementation techniques)?
Imagine a world of parsing where the common wisdom (nonterminals that
do not derive a terminal have no influence on the grammar) does not
apply, and then try to imagine writing a parser generator for such a
language. Welcome to adaptive grammars. In an adaptive grammar, a rule
that (eventually or directly) derives a terminal one instant may or
may not derive one the next, and if it does, it may not derive the
same terminal(s) it used to, and this may vary over time (global
state) or space (local state).
S ::= b ":" y<a>;
b ::= <</ z x />> #super; // match the longest of z or x
a ::= '[a-z]+';
x ::= $n|=(a);
y ::= x.n;
z ::= "orange";
The above grammar (which varies over space, not time -- and is
therefore declarative rather than imperative) generates the language L
= {w:w | w=a string of 1 or more letters in length, and w != the
string orange}.
That is, in one very special case y does not derive a terminal, and is
"useless" but in every other case, it does derive a terminal.
Quinn
[toc] | [prev] | [next] | [standalone]
| From | Gene <gene.ressler@gmail.com> |
|---|---|
| Date | 2011-11-27 10:18 -0800 |
| Message-ID | <11-11-056@comp.compilers> |
| In reply to | #348 |
On Nov 22, 10:20 am, an...@mips.complang.tuwien.ac.at (Anton Ertl) wrote: > Gene <gene.ress...@gmail.com> writes: > >Nonterminals that can never derive a terminal string are the > >problem. > > Is it really? Since they cannot derive a terminal, they have no > influence on the language described by the grammar. They might just > as well not be there. Are they really a problem (except for certain > implementation techniques)? The algorithm is commonly used to print warnings in parser generators. Check recent releases of bison, for example.
[toc] | [prev] | [next] | [standalone]
| From | Borneq <a.moderacja@gmail.com> |
|---|---|
| Date | 2011-11-23 12:56 -0800 |
| Message-ID | <11-11-049@comp.compilers> |
| In reply to | #347 |
On 21 Lis, 19:20, Gene <gene.ress...@gmail.com> wrote: > As I recall the algorithm starts with the lhs's of rules that expand > entirely to terminals (including epsilon) and recursively marks > nonterminals that have a rule where the entirely right hand side is > either marked or terminal. When you're done marking in this manner, > the unmarked nonterminals are useless. All rules involving them can > be deleted without changing the represented language. If the start > symbol is unmarked, the language is empty. How works it algorithm, as I see: - We have two marks, one for rules one for nonterminals - Mark rule when all on the right is termianal or marked nontermianal - Mark nonterminal only when all its rules is marked? (what when we remove rule?) A->A A->B B->b We start from last rule. Mark B because all rhs ale terminal (b) and mark rule "B->b" as accessible. A->B - right side all are marked, A not marked yet because has more than one rule A->B - rule not marked
[toc] | [prev] | [next] | [standalone]
| From | Borneq <a.moderacja@gmail.com> |
|---|---|
| Date | 2011-11-24 09:24 -0800 |
| Message-ID | <11-11-050@comp.compilers> |
| In reply to | #347 |
On 21 Lis, 19:20, Gene <gene.ress...@gmail.com> wrote: > As I recall the algorithm starts with the lhs's of rules that expand > entirely to terminals (including epsilon) and recursively marks > nonterminals that have a rule where the entirely right hand side is > either marked or terminal. When you're done marking in this manner, > the unmarked nonterminals are useless. All rules involving them can > be deleted without changing the represented language. If the start > symbol is unmarked, the language is empty. I found http://programming4.us/desktop/408.aspx section "4 Reduction of Grammar" I interpret this: we mark nonterminal as usable if any its production contains only terminals or marked previously nonterminals. But what if nonterminal has one using production and one useless? For example A->A A->B B->b How eliminate A->A ?
[toc] | [prev] | [next] | [standalone]
| From | Gene <gene.ressler@gmail.com> |
|---|---|
| Date | 2011-11-27 10:27 -0800 |
| Message-ID | <11-11-057@comp.compilers> |
| In reply to | #352 |
On Nov 24, 12:24 pm, Borneq <a.modera...@gmail.com> wrote: > On 21 Lis, 19:20, Gene <gene.ress...@gmail.com> wrote: > > > As I recall the algorithm starts with the lhs's of rules that expand > > entirely to terminals (including epsilon) and recursively marks > > nonterminals that have a rule where the entirely right hand side is > > either marked or terminal. When you're done marking in this manner, > > the unmarked nonterminals are useless. All rules involving them can > > be deleted without changing the represented language. If the start > > symbol is unmarked, the language is empty. > > I foundhttp://programming4.us/desktop/408.aspxsection "4 Reduction > of Grammar" > I interpret this: we mark nonterminal as usable if any its production > contains only terminals or marked previously nonterminals. > But what if nonterminal has one using production and one useless? For > example > A->A > A->B > B->b > How eliminate A->A ? Once you know the useless nonterminals, you can delete any rule with a right hand side consisting entirely of them. It's also worth noting that any grammar with A->A in it is ambiguous in a way that should cause LR table generation to fail. How would a parser know how many times to reduce A->A?
[toc] | [prev] | [next] | [standalone]
| From | anton@mips.complang.tuwien.ac.at (Anton Ertl) |
|---|---|
| Date | 2011-11-28 16:46 +0000 |
| Message-ID | <11-11-066@comp.compilers> |
| In reply to | #359 |
Gene <gene.ressler@gmail.com> writes: >It's also worth noting that any grammar with A->A in it is ambiguous >in a way that should cause LR table generation to fail. How would a >parser know how many times to reduce A->A? S -> A|B A -> A B -> t This grammar can derive only the word "t", and only in one way. I see no ambiguity; no word can be derived in two ways in this grammar. As for warning about these useless cycles, sure, they may indicate a programming mistake, so better warn about them. I still don't see that they are a problem as far as their meaning is concerned, nor wrt. implementability in general. Apparently the Bison implementors thought so, too, or Bison would produce an error, not a warning. - anton -- M. Anton Ertl anton@mips.complang.tuwien.ac.at http://www.complang.tuwien.ac.at/anton/
[toc] | [prev] | [next] | [standalone]
| From | glen herrmannsfeldt <gah@ugcs.caltech.edu> |
|---|---|
| Date | 2011-11-29 07:31 +0000 |
| Message-ID | <11-11-068@comp.compilers> |
| In reply to | #368 |
Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote: > Gene <gene.ressler@gmail.com> writes: >>It's also worth noting that any grammar with A->A in it is ambiguous >>in a way that should cause LR table generation to fail. How would a >>parser know how many times to reduce A->A? > S -> A|B > A -> A > B -> t > This grammar can derive only the word "t", and only in one way. I see > no ambiguity; no word can be derived in two ways in this grammar. > As for warning about these useless cycles, sure, they may indicate a > programming mistake, so better warn about them. For this case, it would seem easy for any program to figure out that it was a useless recursion. If you make a longer cycle of useless recursion, then it might be harder to detect, but it still shouldn't be so hard. Now, you could also have: D -> D t or E -> t E which are also useless, in that they require an infinite number of t's to match, so again I would hope that they could be detected. For complicated grammars, it isn't easy for people to track all these down, but it is easy for programs to do it. -- glen h
[toc] | [prev] | [next] | [standalone]
| From | Paul B Mann <paul@paulbmann.com> |
|---|---|
| Date | 2011-12-01 02:46 -0800 |
| Message-ID | <11-12-004@comp.compilers> |
| In reply to | #370 |
The LRSTAR parser generator reports two errors in this grammar: S -> A|B A -> A B -> t LRSTAR Basic 3.0.137 Copyright 2011 Compilerware. Input silly.grm silly.grm(6) : A -> A silly.grm(6) : ---^ silly.grm(6) : Useless production. silly.grm(6) : A silly.grm(6) : ---^ silly.grm(6) : Nonterminal symbol in cycle, cannot derive anything. 0 min 0.000 sec, 0.728 MB, 0 warnings, 2 errors.
[toc] | [prev] | [next] | [standalone]
| From | Quinn Tyler Jackson <quinn_jackson2004@yahoo.ca> |
|---|---|
| Date | 2011-12-02 09:20 -0800 |
| Message-ID | <11-12-008@comp.compilers> |
| In reply to | #376 |
On Thu, Dec 1, 2011 at 2:46 AM, Paul B Mann <paul@paulbmann.com> wrote: > The LRSTAR parser generator reports two errors in this grammar: > > S -> A|B > A -> A > B -> t > > > LRSTAR Basic 3.0.137 Copyright 2011 Compilerware. > > silly.grm(6) : Useless production. > > silly.grm(6) : Nonterminal symbol in cycle, cannot derive anything. > > 0 min 0.000 sec, 0.728 MB, 0 warnings, 2 errors. The Meta-S parser generator reports two errors: eV005: Cyclic Recursion (on A->A since Meta-S is top-down, A->A is illegal cyclic recursion) eV008: Class expected as parameter (on S->A|B, since A did not compile, and therefore A|B does not resolve A to any valid rule) A variation of the above grammar that generates the same errors is: > S -> A|B > A -> [B] A > B -> t Because A obviously expands to: A -> B A A -> A Which ends up in the same left-recursive hole. A -> C A C -> [B] of course generates the same error because effectively it's just another way to say A -> [B] A and can lead to potential recursion. I suppose the reason I'm bringing this up is that I mentioned earlier that in adaptive grammars it is possible to write grammars that have rules that are not known until parse-time to derive terminals on any given rule. Because the parse is top down, however, A -> A is a case of a rule that generates an error because it is malformed per the Adaptive-K algorithm. That said: S -> A|B A -> C A C -> x.y B -> "t" Is a case where x.y *may* contain the empty string at parse time, or may not (depending on input), and so is not thrown away with an error. This is where writing adaptive grammars becomes a creative exercise. Quinn
[toc] | [prev] | [next] | [standalone]
| From | Gene <gene.ressler@gmail.com> |
|---|---|
| Date | 2011-12-07 06:29 -0800 |
| Message-ID | <11-12-014@comp.compilers> |
| In reply to | #368 |
On Nov 28, 11:46 am, an...@mips.complang.tuwien.ac.at (Anton Ertl) wrote: > Gene <gene.ress...@gmail.com> writes: > >It's also worth noting that any grammar with A->A in it is ambiguous > >in a way that should cause LR table generation to fail. How would a > >parser know how many times to reduce A->A? > > S -> A|B > A -> A > B -> t > > This grammar can derive only the word "t", and only in one way. I see > no ambiguity; no word can be derived in two ways in this grammar. > > As for warning about these useless cycles, sure, they may indicate a > programming mistake, so better warn about them. I still don't see > that they are a problem as far as their meaning is concerned, nor > wrt. implementability in general. Apparently the Bison implementors > thought so, too, or Bison would produce an error, not a warning. We are really down in the weeds. I should have said any reduced grammar (with useless non-terminals already removed).
[toc] | [prev] | [standalone]
Back to top | Article view | comp.compilers
csiph-web