| Started by | "Ev. Drikos" <drikosev@gmail.com> |
|---|---|
| First post | 2022-03-18 07:25 +0200 |
| Last post | 2022-03-19 19:58 +0200 |
| Articles | 4 — 3 participants |
Improved accuracy in diagnostics. Is it worthwhile? "Ev. Drikos" <drikosev@gmail.com> - 2022-03-18 07:25 +0200
Re: Improved accuracy in diagnostics. Is it worthwhile? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-03-18 16:47 +0000
Re: Improved accuracy in diagnostics. Is it worthwhile? Thomas Koenig <tkoenig@netcologne.de> - 2022-03-18 18:12 +0000
Re: Improved accuracy in diagnostics. Is it worthwhile? "Ev. Drikos" <drikosev@gmail.com> - 2022-03-19 19:58 +0200
| From | "Ev. Drikos" <drikosev@gmail.com> |
|---|---|
| Date | 2022-03-18 07:25 +0200 |
| Subject | Improved accuracy in diagnostics. Is it worthwhile? |
| Message-ID | <22-03-035@comp.compilers> |
Hello,
This is mainly a parsing question, but it's also Fortran related.
When I run a syntax check with the command 'fcheck' on the code below,
the error message doesn't contain a '(' in the expected tokens. This
happens due to default actions, although the parser is basically LALR. A
pure LALR parser wouldn't make reductions without examining the lookahead.
Default actions are useful because they save a lot of space in parsing
tables, at the cost of missing expected tokens in the error messages
printed by the command 'fcheck'. This is the relevant BNF rule for the
example given at the end of this message:
implicit-stmt ::=
      IMPLICIT implicit-spec-list
    | IMPLICIT NONE [ ( [ implicit-none-spec-list ] ) ]
Disabling default actions for the command 'fcheck' is fairly simple,
just a button click in Syntaxis, but at the moment I can't estimate
how many error messages would be improved, whereas a parsing table
increase (50%) would be guaranteed. The command 'fcheck' can be found at
https://github.com/drikosev/Fortran
So far, my approach has been that improved diagnostics shouldn't slow
down the processing of correct programs. Is it worthwhile to improve
diagnostics by disabling default actions in a LALR parser?
Thanks,
Ev. Drikos
----------------------------------------------------------------------
$ cat default-actions.f90 && fcheck default-actions.f90
IMPLICIT NONE ? (type, external)
PRINT *, "Only ';', not a '(', in the expected tokens in diagnostics."
END
default-actions.f90:1: error: syntax:Unexpected: '?'. Expected: ";".
Parsed with Errors: default-actions.f90
$
[When yacc was new and everything had to fit in 64K, small parse tables
were important. Today when people include a megabyte library to get
a four line routine, not so much. -John]
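The effect described in this post can be reproduced with a toy table-driven parser. The sketch below is only an illustration under stated assumptions: the six-state table, the simplified grammar (IMPLICIT NONE, optionally followed by an empty pair of parentheses, then ';') and the names FULL, compress and expected_at_error are all hypothetical and are not taken from fcheck or Syntaxis. It shows how replacing a state's reduce entries with a default reduction makes the parser reduce on a bad token and report the error from a later state, where '(' is no longer among the expected tokens.

```python
SHIFT, REDUCE = "shift", "reduce"
ACCEPT = 6  # hypothetical state reached after the closing ';'

# Full action table (hypothetical): state -> {token: (kind, argument)}.
# For REDUCE the argument is the number of symbols popped from the stack.
FULL = {
    0: {"IMPLICIT": (SHIFT, 1)},
    1: {"NONE": (SHIFT, 2)},
    2: {"(": (SHIFT, 3), ";": (REDUCE, 2)},  # after IMPLICIT NONE
    3: {")": (SHIFT, 5)},
    4: {";": (SHIFT, ACCEPT)},               # after a complete implicit-stmt
    5: {";": (REDUCE, 4)},                   # after IMPLICIT NONE ( )
}
GOTO_STMT = {0: 4}  # state entered after reducing to implicit-stmt


def compress(table):
    """Turn the single reduction of a state into a default reduction."""
    compact, defaults = {}, {}
    for state, row in table.items():
        reduces = {action for action in row.values() if action[0] == REDUCE}
        if len(reduces) == 1:
            defaults[state] = reduces.pop()
            compact[state] = {t: a for t, a in row.items() if a[0] == SHIFT}
        else:
            compact[state] = dict(row)
    return compact, defaults


def expected_at_error(tokens, table, defaults=None):
    """Return the expected tokens where the error is detected, or None."""
    defaults = defaults or {}
    stack, pos = [0], 0
    while True:
        state = stack[-1]
        if state == ACCEPT:
            return None
        token = tokens[pos] if pos < len(tokens) else "<eof>"
        action = table[state].get(token) or defaults.get(state)
        if action is None:
            return sorted(table[state])  # only explicit entries are known
        kind, arg = action
        if kind == SHIFT:
            stack.append(arg)
            pos += 1
        else:
            del stack[-arg:]
            stack.append(GOTO_STMT[stack[-1]])


tokens = ["IMPLICIT", "NONE", "?", "(", ")", ";"]
compact, defaults = compress(FULL)
print(expected_at_error(tokens, FULL))               # ['(', ';']
print(expected_at_error(tokens, compact, defaults))  # [';']
```

With the full table the error is detected in the state after IMPLICIT NONE, so both '(' and ';' are reported. With the compressed table the default reduction fires first, and only ';' remains, which matches the message printed by 'fcheck' above.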
| From | Kaz Kylheku <480-992-1380@kylheku.com> |
|---|---|
| Date | 2022-03-18 16:47 +0000 |
| Message-ID | <22-03-036@comp.compilers> |
| In reply to | #2937 |
On 2022-03-18, Ev. Drikos <drikosev@gmail.com> wrote:
> Hello,
>
> This is mainly a parsing question but it's also Fortran related as well.
>
> When I make syntax checking with the command 'fcheck' in the code below,
> the error message doesn't contain a '(' in the expected tokens. This
> happens due to default actions, although the parser is basically LALR. A
> pure LALR parser wouldn't make reductions without examining the lookahead.
I think you mean default reductions?
In the case of Yacc, the action is the body { $$ = $1; }
:)
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
| From | Thomas Koenig <tkoenig@netcologne.de> |
|---|---|
| Date | 2022-03-18 18:12 +0000 |
| Message-ID | <22-03-038@comp.compilers> |
| In reply to | #2937 |
Ev. Drikos <drikosev@gmail.com> wrote:
> This is mainly a parsing question but it's also Fortran related as well.
[...]
> So far, my approach has been that improved diagnostics shouldn't slow
> down the processing of correct programs.
With today's computer speeds, this is likely not a very important
consideration any more.
If you are compiling, it is usually a small fraction of time that
is spent in the parsing, and much more in optimization and code
generation. An example: Compiling a 50 k line Fortran program with
"gfortran -O2" takes 17.4 seconds on the computer I type this on.
Checking with "gfortran -fsyntax-only" takes 4.2 seconds. (For
those who want to reproduce: aermod.f90 from the Polyhedron suite).
50k lines for a single source file is already quite a lot (much
longer than most source files for modular programs are likely to be)
and throwing a bit more CPU time at the problem to reduce user
confusion by emitting better error messages is extremely likely to
be a win for the user.
Just be careful to avoid anything worse than O(n log n) for code size,
or somebody will come along with a test case that takes _really_ long.
(Take the above with a grain of salt for C++ headers.)
> Is it worthwhile to improve
> diagnostics by disabling default actions in a LALR parser?
I would presume so. Run a few benchmarks and find out.
[In my experience, lexing and optimization take most of the time, and
parsing is insignificant. -John]
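The "run a few benchmarks" advice can be followed with a small timing harness. The sketch below is one possible way to do it, not a method taken from the thread: the function name time_command is hypothetical, and it simply reports the median wall-clock time of a command over several runs, so that two builds of a checker (with and without default reductions) can be compared on the same input.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, repeats=5):
    """Return the median wall-clock seconds over `repeats` runs of argv."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded and a non-zero exit status is tolerated,
        # because a syntax checker may legitimately report errors.
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Example calls, assuming the tools and the file are available locally:
#   time_command(["gfortran", "-fsyntax-only", "aermod.f90"])
#   time_command(["fcheck", "aermod.f90"])
baseline = time_command([sys.executable, "-c", "pass"], repeats=3)
print(f"interpreter start-up baseline: {baseline:.3f} s")
```

The median is used rather than the mean so that a single slow run (cold file cache, background load) does not dominate the result.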
| From | "Ev. Drikos" <drikosev@gmail.com> |
|---|---|
| Date | 2022-03-19 19:58 +0200 |
| Message-ID | <22-03-041@comp.compilers> |
| In reply to | #2940 |
On 18/03/2022 20:12, Thomas Koenig wrote:
> ...
> If you are compiling, it is usually a small fraction of time that
> is spent in the parsing, and much more in optimization and code
> generation. An example: Compiling a 50 k line Fortran program with
> "gfortran -O2" takes 17.4 seconds on the computer I type this on.
> Checking with "gfortran -fsyntax-only" takes 4.2 seconds. (For
> those who want to reproduce: aermod.f90 from the Polyhedron suite).
> ...
Thanks. Just tested this large file and the runtime overhead seems
to be negligible.
I'll likely try the change, but it took me a while to find another case,
this one with enumerators (which also lack error recovery at present).
Although my trial changes added messages for 43 states, some of them are
useless, so this approach seems to be useful mainly for BNF rules with an
optional tail.
Unavoidably, a parser/front-end has to do some guessing on error,
and this doesn't change easily. So, any improvement without default
state reductions (hello Kaz) will be limited, as in the code below:
-----------------------------------------------------------------------
miniserver:errors suser$ cat enum-1.f90 && fcheck enum-1.f90
ENUM, BIND(C)
ENUMERATOR :: RED => 4, BLUE => 9
ENUMERATOR YELLOW
END ENUM
END
enum-1.f90:2: error: syntax:Unexpected: '=>'. Expected: ",", ";", or "=".
Parsed with Errors: enum-1.f90
miniserver:errors suser$