| Started by | Geovani de Souza <geovanisouza92@gmail.com> |
|---|---|
| First post | 2012-02-11 06:56 -0800 |
| Last post | 2012-02-14 00:25 +0000 |
| Articles | 12 — 12 participants |
Ignore break line sometimes Geovani de Souza <geovanisouza92@gmail.com> - 2012-02-11 06:56 -0800
Re: Ignore break line sometimes Hans-Peter Diettrich <DrDiettrich1@aol.com> - 2012-02-11 17:28 +0100
Re: Ignore break line sometimes George Neuner <gneuner2@comcast.net> - 2012-02-11 12:59 -0500
RE: Ignore break line sometimes "Karsten Nyblad" <uu3kw29sb7@snkmail.com> - 2012-02-12 09:21 +0100
Re: Ignore break line sometimes Kaz Kylheku <kaz@kylheku.com> - 2012-02-13 00:16 +0000
Re: Ignore break line sometimes Stefan Monnier <monnier@iro.umontreal.ca> - 2012-02-12 10:48 -0500
Re: Ignore break line sometimes Joshua Cranmer <Pidgeot18@verizon.invalid> - 2012-02-12 12:03 -0600
Re: Ignore break line sometimes Gene Wirchenko <genew@ocis.net> - 2012-02-19 20:57 -0800
Re: Ignore break line sometimes glen herrmannsfeldt <gah@ugcs.caltech.edu> - 2012-02-20 08:09 +0000
Re: Ignore break line sometimes arnold@skeeve.com (Aharon Robbins) - 2012-02-23 21:51 +0000
Re: Ignore break line sometimes "Jonathan Thornburg" <jthorn@astro.indiana.edu> - 2012-02-27 03:49 +0000
Re: Ignore break line sometimes "BartC" <bc@freeuk.com> - 2012-02-14 00:25 +0000
| From | Geovani de Souza <geovanisouza92@gmail.com> |
|---|---|
| Date | 2012-02-11 06:56 -0800 |
| Subject | Ignore break line sometimes |
| Message-ID | <12-02-010@comp.compilers> |
Hi all!

I'm trying to write a parser for my compiler, and I'd like to ignore the line break (\n) in some places. E.g.:

    if true then [\n]
        foo(); [\n]
    end; [\n]

In the first line, the '\n' after 'then' isn't significant, but in the second line the '\n' after "foo();" could replace the semicolon that concludes the statement, and likewise after 'end'.

I'd also like to ignore '\n' on blank lines.

How can I do this?
| From | Hans-Peter Diettrich <DrDiettrich1@aol.com> |
|---|---|
| Date | 2012-02-11 17:28 +0100 |
| Message-ID | <12-02-011@comp.compilers> |
| In reply to | #450 |
Geovani de Souza wrote:
> I'm trying write an parser to my compiler, and I'm interessed to
> ignore the break line (\n) sometimes. E.g:
>
> if true then [\n] foo(); [\n] end; [\n]
>
> So, in the first line, the '\n' after 'then' isn't important, but in
> the second "foo();" could replace the need of the semicolon to
> conclude the statement, or still, in the 'end'.

That's why many (compiled) languages ignore line ends and other whitespace, and require explicit statement termination, e.g. by a semicolon. Interpreters instead often prefer the "one statement per line" approach, with the option to concatenate statements by e.g. a colon.

IMO you should make a decision about the meaning of whitespace in general, and of line endings in particular, in your language. Please give an example that would compile differently when linefeeds are removed, and then ask yourself whether that really makes sense.

DoDi
| From | George Neuner <gneuner2@comcast.net> |
|---|---|
| Date | 2012-02-11 12:59 -0500 |
| Message-ID | <12-02-012@comp.compilers> |
| In reply to | #450 |
On Sat, 11 Feb 2012 06:56:17 -0800 (PST), Geovani de Souza <geovanisouza92@gmail.com> wrote:

>I'm trying write an parser to my compiler, and I'm interessed
> to ignore the break line (\n) sometimes. E.g:
>
>if true then [\n]
>    foo(); [\n]
>end; [\n]
>
>So, in the first line, the '\n' after 'then' isn't important, but in
>the second "foo();" could replace the need of the semicolon
>to conclude the statement, or still, in the 'end'.
>
>Too ignore '\n' in the white lines.
>
>How can I do this?

IMO making the newlines significant is a really bad idea ... but leaving that aside I believe the most effective way would be to have your lexer return a special "end-of-line" code for either semicolon or newline and make the end-of-line code optional where it need not be.

You don't say whether your parser is handwritten or tool generated (or which tools) ... so I can't really give an example.

George
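For a handwritten lexer, the suggestion above might look roughly like the following Python sketch. The token kinds (`EOL`, `WORD`, `PUNCT`) and the function name `tokenize` are made up for illustration, and the token set is only large enough for the example from the original post.

```python
import re

# One pattern per token kind; ';' and '\n' deliberately share the EOL kind.
TOKEN_RE = re.compile(r"""
    (?P<EOL>[;\n])            # semicolon or newline: same token kind
  | (?P<WS>[ \t\r]+)          # other whitespace is skipped
  | (?P<WORD>[A-Za-z_]\w*)    # keywords and identifiers
  | (?P<PUNCT>[()])           # just enough punctuation for the example
""", re.VERBOSE)

def tokenize(src):
    """Return (kind, text) pairs; newline and ';' both become ('EOL', ';')."""
    tokens = []
    pos = 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r} at {pos}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "WS":
            continue
        tokens.append(("EOL", ";") if kind == "EOL" else (kind, m.group()))
    return tokens
```

The parser then only ever sees `EOL`, and the grammar decides where an `EOL` is required, optional, or ignored (for example after `then`).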
| From | "Karsten Nyblad" <uu3kw29sb7@snkmail.com> |
|---|---|
| Date | 2012-02-12 09:21 +0100 |
| Message-ID | <12-02-015@comp.compilers> |
| In reply to | #450 |
> I'm trying write an parser to my compiler, and I'm interessed to ignore the break line (\n) sometimes. E.g:
>
> if true then [\n]
>     foo(); [\n]
> end; [\n]

One option is to write a recursive descent parser, and have two ways of calling the lexer: one that returns line ends and one that does not.

Another option is to base your parsing on a parser generator like bison, and modify the code that drives the automaton. That code is modified such that when the lexer returns a line feed token, you copy the stack of states, and on the copy you simulate the actions that the parser would have taken. When the simulation stacks the line feed, you throw away the copy and resume parsing on the real stack with the line feed in the window. When the simulation encounters an error, you throw away the simulation AND the line feed and call the lexer again.

If you choose the second option, it is important that you choose the right parser generator, because some parser generators already generate code that can help you. Many LR parser generators, e.g., bison, include facilities for generalised LR parsing, and many LL parser generators include facilities for backtracking. That might help you.

Karsten Nyblad
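The first option (a recursive descent parser with two ways of calling the lexer) could be sketched as below. This is only an illustration under assumptions: the `Lexer` class, its `keep_newlines` flag, and the tiny grammar (calls and `if ... then ... end`) are invented here, not taken from the post.

```python
import re

TOKEN = re.compile(r"[A-Za-z_]\w*|[();]|\n")

class Lexer:
    def __init__(self, src):
        self.src, self.pos = src, 0

    def _scan(self, keep_newlines):
        # Skip blanks; skip newlines too unless the caller wants to see them.
        pos = self.pos
        skip = " \t\r" if keep_newlines else " \t\r\n"
        while pos < len(self.src) and self.src[pos] in skip:
            pos += 1
        if pos == len(self.src):
            return "<eof>", pos
        m = TOKEN.match(self.src, pos)
        if m is None:
            raise SyntaxError(f"bad character {self.src[pos]!r}")
        return m.group(), m.end()

    def peek(self, keep_newlines=False):
        return self._scan(keep_newlines)[0]

    def next(self, keep_newlines=False):
        tok, self.pos = self._scan(keep_newlines)
        return tok

def expect(lx, text):
    tok = lx.next()
    if tok != text:
        raise SyntaxError(f"expected {text!r}, got {tok!r}")

def end_statement(lx):
    # Only here does the parser ask the lexer to report newlines.
    tok = lx.peek(keep_newlines=True)
    if tok in (";", "\n"):
        lx.next(keep_newlines=True)
    elif tok not in ("<eof>", "end"):
        raise SyntaxError(f"expected ';' or newline, got {tok!r}")

def parse_statement(lx):
    tok = lx.next()
    if tok == "if":
        cond = lx.next()
        expect(lx, "then")              # the newline after 'then' is never seen
        body = parse_block(lx, {"end"})
        expect(lx, "end")
        node = ("if", cond, body)
    else:
        expect(lx, "(")
        expect(lx, ")")
        node = ("call", tok)
    end_statement(lx)
    return node

def parse_block(lx, closers):
    stmts = []
    while lx.peek() not in closers:     # blank lines are skipped here
        stmts.append(parse_statement(lx))
    return stmts

def parse(src):
    return parse_block(Lexer(src), {"<eof>"})
```

In this sketch the newline is visible only at the one point where it can end a statement, so `foo() bar()` on one line is rejected while `foo()` followed by a line break is accepted.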
| From | Kaz Kylheku <kaz@kylheku.com> |
|---|---|
| Date | 2012-02-13 00:16 +0000 |
| Message-ID | <12-02-018@comp.compilers> |
| In reply to | #455 |
On 2012-02-12, Karsten Nyblad <uu3kw29sb7@snkmail.com> wrote:
> that can help you. Many LR parser generators, e.g., bison, include
> facilities for generalised LR parsing, and many LL parser generators
> include facilities for backtracking. That might help you.

General LR and backtracking, just to make semicolons optional when there are newlines? LOL.
| From | Stefan Monnier <monnier@iro.umontreal.ca> |
|---|---|
| Date | 2012-02-12 10:48 -0500 |
| Message-ID | <12-02-016@comp.compilers> |
| In reply to | #450 |
> So, in the first line, the '\n' after 'then' isn't important, but in the
> second "foo();" could replace the need of the semicolon to conclude the
> statement, or still, in the 'end'.
A simple approach is to treat every newline as a semi-colon, and then to
adapt your grammar so as to accept (and ignore) extra semi-colons.
I.e. accept "if true then; foo(); ; end; ;"
Stefan
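A minimal Python sketch of this approach, assuming a toy grammar with only calls and `if ... then ... end` (the helper names `tokens` and `parse` are hypothetical). Note that `re.findall` silently drops characters it does not recognise, which a real lexer should report instead.

```python
import re

def tokens(src):
    # Every newline is turned into a ';' token before parsing starts.
    raw = re.findall(r"[A-Za-z_]\w*|[();]|\n", src)
    return [";" if t == "\n" else t for t in raw]

def parse(src):
    toks = tokens(src) + ["<eof>"]
    pos = 0

    def take(expected=None):
        nonlocal pos
        tok = toks[pos]
        if tok == "<eof>" or (expected is not None and tok != expected):
            raise SyntaxError(f"unexpected {tok!r}")
        pos += 1
        return tok

    def skip_semis():
        # Extra semicolons (including those made from newlines) are ignored.
        nonlocal pos
        while toks[pos] == ";":
            pos += 1

    def statement():
        if toks[pos] == "if":
            take("if")
            cond = take()
            take("then")
            body = block("end")
            take("end")
            return ("if", cond, body)
        name = take()
        take("(")
        take(")")
        return ("call", name)

    def block(closer):
        stmts = []
        skip_semis()
        while toks[pos] != closer:
            stmts.append(statement())
            if toks[pos] != closer:
                take(";")       # at least one separator between statements
            skip_semis()
        return stmts

    return block("<eof>")
```

With this, the newline-separated source and the semicolon-laden form quoted above parse to the same tree.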
| From | Joshua Cranmer <Pidgeot18@verizon.invalid> |
|---|---|
| Date | 2012-02-12 12:03 -0600 |
| Message-ID | <12-02-017@comp.compilers> |
| In reply to | #450 |
On 2/11/2012 8:56 AM, Geovani de Souza wrote:
> Hi all!
>
> I'm trying write an parser to my compiler, and I'm interessed to
> ignore the break line (\n) sometimes. E.g:
>
> if true then [\n] foo(); [\n] end; [\n]
>
> So, in the first line, the '\n' after 'then' isn't important, but in
> the second "foo();" could replace the need of the semicolon to
> conclude the statement, or still, in the 'end'.

It sounds like you want something like ECMAScript's magic you-don't-always-need-a-semicolon feature. <http://bclary.com/2004/11/07/#a-7.9> describes how it works in detail. The thrust of it is that "if you see an invalid token, but you saw a newline before, automatically insert a semicolon to fix things."

There are more than a few people who believe that this feature should not have been implemented.

--
Beware of bugs in the above code; I have only proved it correct, not tried it. -- Donald E. Knuth
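The quoted rule can be illustrated with a deliberately simplified Python sketch. It is not the full ECMAScript algorithm (which has further cases, such as a closing brace and restricted productions); the toy grammar accepts only a sequence of calls like `foo()`, and the names `Tok`, `lex` and `parse_calls` are invented for the example.

```python
import re
from typing import NamedTuple

class Tok(NamedTuple):
    text: str
    newline_before: bool    # was there a line break since the previous token?

def lex(src):
    toks, saw_newline = [], False
    for m in re.finditer(r"(\n)|([A-Za-z_]\w*|[();])", src):
        if m.group(1):
            saw_newline = True
        else:
            toks.append(Tok(m.group(2), saw_newline))
            saw_newline = False
    toks.append(Tok("<eof>", saw_newline))
    return toks

def parse_calls(src):
    toks, pos, calls = lex(src), 0, []

    def take(expected=None):
        nonlocal pos
        tok = toks[pos]
        if tok.text == "<eof>" or (expected is not None and tok.text != expected):
            raise SyntaxError(f"unexpected {tok.text!r}")
        pos += 1
        return tok.text

    while toks[pos].text != "<eof>":
        name = take()
        if not name.isidentifier():
            raise SyntaxError(f"expected a name, got {name!r}")
        take("(")
        take(")")
        calls.append(name)
        nxt = toks[pos]
        if nxt.text == ";":
            pos += 1        # explicit terminator
        elif nxt.newline_before or nxt.text == "<eof>":
            pass            # offending token after a newline: act as if ';' were there
        else:
            raise SyntaxError(f"missing ';' before {nxt.text!r}")
    return calls
```

The lexer does not emit newline tokens at all; it only records on each token whether a line break preceded it, and the parser consults that flag when it would otherwise report an error.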
| From | Gene Wirchenko <genew@ocis.net> |
|---|---|
| Date | 2012-02-19 20:57 -0800 |
| Message-ID | <12-02-023@comp.compilers> |
| In reply to | #457 |
On Sun, 12 Feb 2012 12:03:13 -0600, Joshua Cranmer
<Pidgeot18@verizon.invalid> wrote:
[snip]
>It sounds like you want something like ECMAScript's magic
>you-don't-always-need-a-semicolon feature.
But please do not go there.
><http://bclary.com/2004/11/07/#a-7.9> describes how it works in detail.
>The thrust of it is that "if you see an invalid token, but you saw a
>newline before, automatically insert a semicolon to fix things."
>
>There are more than a few people who believe that this feature should
>not have been implemented.
There is a bit more to this. As a result of this kludge, it is
illegal to have newlines at certain points in some statements. For
example:
return
<expression which I decided to put all on its own line>;
is not legal. It is not permitted to have a newline immediately after
"return".
Sincerely,
Gene Wirchenko
| From | glen herrmannsfeldt <gah@ugcs.caltech.edu> |
|---|---|
| Date | 2012-02-20 08:09 +0000 |
| Message-ID | <12-02-024@comp.compilers> |
| In reply to | #463 |
Gene Wirchenko <genew@ocis.net> wrote:

(snip, someone wrote)

>> There are more than a few people who believe that this
>> feature should not have been implemented.

> There is a bit more to this. As a result of this kludge, it is
> illegal to have newlines at certain points in some statements.
> For example:
> return
> <expression which I decided to put all on its own line>;
> is not legal. It is not permitted to have a newline immediately after
> "return".

Sounds about like the way IBM's JCL from OS/360 and successors works. You can split a statement after a comma in most cases, and continue it on the next line, after the usual // and some spaces.

I believe the original (early) versions had a more usual system with a continuation character in column 72, and then start the next statement in column 16. I presume it was found hard to get right so they changed it.

I believe that there are a few other languages with a similar continuation method. That is, if you end a statement in a legal end, no continuation is needed.

-- glen
| From | arnold@skeeve.com (Aharon Robbins) |
|---|---|
| Date | 2012-02-23 21:51 +0000 |
| Message-ID | <12-02-025@comp.compilers> |
| In reply to | #464 |
glen herrmannsfeldt <gah@ugcs.caltech.edu> wrote:
>I believe that there are a few other languages with a similar
>continuation method. That is, if you end a statement in a legal
>end, no continuation is needed.

Awk is like this. You can continue after a comma, && or ||. Possibly in other places too. You can supply semicolons to separate statements on the same line, if you want.

It tends to work fairly naturally in awk, I rarely use \ to continue onto the next line. :-)
--
Aharon (Arnold) Robbins    arnold AT skeeve DOT com
P.O. Box 354               Home Phone: +972 8 979-0381
Nof Ayalon                 Cell Phone: +972 50 729-7545
D.N. Shimshon 99785        ISRAEL
| From | "Jonathan Thornburg" <jthorn@astro.indiana.edu> |
|---|---|
| Date | 2012-02-27 03:49 +0000 |
| Message-ID | <12-02-027@comp.compilers> |
| In reply to | #465 |
Aharon Robbins <arnold@skeeve.com> wrote:
> Awk is like this. You can continue after a comma, && or ||. Possibly
> in other places too. You can supply semicolons to separate statements
> on the same line, if you want.
>
> It tends to work fairly naturally in awk, I rarely use \ to continue
> onto the next line. :-)
On the other hand pic (Kernighan's picture-drawing "little language")
is very finicky about where it accepts \ line-continuations, allowing
them in some places but forbidding them in others. For example, the
pic code
for j = 2 to 6 by 2 do { \
for i = 3 to 7 by 2 do { \
fine_space_interp_point at grid_point(j,i) } }
does NOT allow a \ line-continuation between either "for" and the
following "{". (Or more precisely, all my attempts to make such
produced the usual unhelpful pic syntax-error messages.) :(
--
-- "Jonathan Thornburg
Dept of Astronomy & IUCSS, Indiana University, Bloomington, Indiana, USA
"Washing one's hands of the conflict between the powerful and the
powerless means to side with the powerful, not to be neutral."
-- quote by Freire / poster by Oxfam
| From | "BartC" <bc@freeuk.com> |
|---|---|
| Date | 2012-02-14 00:25 +0000 |
| Message-ID | <12-02-020@comp.compilers> |
| In reply to | #450 |
"Geovani de Souza" <geovanisouza92@gmail.com> wrote
> I'm trying write an parser to my compiler, and I'm interessed to ignore
> the break line (\n) sometimes. E.g:
>
> if true then [\n]
>     foo(); [\n]
> end; [\n]
>
> So, in the first line, the '\n' after 'then' isn't important, but in the
> second "foo();" could replace the need of the semicolon to conclude the
> statement, or still, in the 'end'.
>
> To ignore '\n' in the white lines.

I've tried a few schemes. One just converts a newline to a semicolon, *unless* the last symbol was (for example) a comma.

This requires some sort of continuation symbol for when a semicolon would be inappropriate. And it helps if the grammar is tolerant of extra semicolons, otherwise the source code could be full of continuation symbols! (After 'then' for example.)

Whatever scheme you choose, you'll know it works well when you have thousands of lines of code without a single semicolon, and hardly any continuations. And that is perfectly clear to read.

--
Bartc
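The newline-to-semicolon scheme with continuations, as described in the last post and in the awk remarks earlier in the thread, might be sketched as a token filter like the one below. The set of continuing symbols and the use of a trailing backslash as the explicit continuation symbol are assumptions chosen for illustration, as is the function name `newline_to_semicolon`.

```python
import re

# Assumed set: a newline right after one of these does not end the statement.
CONTINUERS = {",", "&&", "||"}

def newline_to_semicolon(src):
    """Turn newlines into ';' tokens except where the line clearly continues."""
    out = []
    pattern = r"\n|&&|\|\||\\|[A-Za-z_]\w*|\d+|[^\s\w]"
    for tok in re.findall(pattern, src):
        if tok != "\n":
            out.append(tok)
        elif out and out[-1] == "\\":
            out.pop()           # explicit continuation: drop '\' and the newline
        elif out and out[-1] not in CONTINUERS and out[-1] != ";":
            out.append(";")     # ordinary end of line ends the statement
        # otherwise: blank line, leading newline, or implicit continuation
    return out
```

For example, a call split after a comma stays one statement, a blank line adds nothing, and a line ending in a backslash is joined with the next one. A parser consuming this stream still benefits from tolerating stray semicolons, as noted above for the position after 'then'.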