Improved accuracy in diagnostics. Is it worthwhile?

Started by: "Ev. Drikos" <drikosev@gmail.com>
First post: 2022-03-18 07:25 +0200
Last post: 2022-03-19 19:58 +0200
Articles: 4 (3 participants)

Contents

  Improved accuracy in diagnostics. Is it worthwhile? "Ev. Drikos" <drikosev@gmail.com> - 2022-03-18 07:25 +0200
    Re: Improved accuracy in diagnostics. Is it worthwhile? Kaz Kylheku <480-992-1380@kylheku.com> - 2022-03-18 16:47 +0000
    Re: Improved accuracy in diagnostics. Is it worthwhile? Thomas Koenig <tkoenig@netcologne.de> - 2022-03-18 18:12 +0000
      Re: Improved accuracy in diagnostics. Is it worthwhile? "Ev. Drikos" <drikosev@gmail.com> - 2022-03-19 19:58 +0200

#2937 — Improved accuracy in diagnostics. Is it worthwhile?

From: "Ev. Drikos" <drikosev@gmail.com>
Date: 2022-03-18 07:25 +0200
Subject: Improved accuracy in diagnostics. Is it worthwhile?
Message-ID: <22-03-035@comp.compilers>

Hello,

This is mainly a parsing question but it's also Fortran related.

When I run a syntax check with the command 'fcheck' on the code below,
the error message doesn't contain a '(' in the expected tokens. This
happens due to default actions, although the parser is basically LALR. A
pure LALR parser wouldn't make reductions without examining the lookahead.

Default actions are useful because they save a lot of space in parsing
tables, at the cost of missing expected tokens in the error messages
printed by the command 'fcheck'. This is the relevant BNF rule for the
example given at the end of this message:

implicit-stmt ::=
   IMPLICIT implicit-spec-list
| IMPLICIT NONE [ ( [ implicit-none-spec-list ] ) ]
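
A minimal sketch of the effect, assuming a hand-built table for a cut-down version of the rule above (stmt ::= IMPLICIT NONE tail ';' with tail ::= empty | '(' ')'). The state numbers, the names ACTION, GOTO, DEFAULT_REDUCTION and check are made up for illustration; this is not the table that Syntaxis generates for 'fcheck'.

```python
# Hypothetical LR table for:  stmt ::= IMPLICIT NONE tail ';'
#                             tail ::= <empty> | '(' ')'
ACTION = {
    0: {"IMPLICIT": ("shift", 1)},
    1: {"NONE": ("shift", 2)},
    # After IMPLICIT NONE: shift '(' or reduce the empty tail when ';' follows.
    2: {"(": ("shift", 3), ";": ("reduce", "tail", 0)},
    3: {")": ("shift", 4)},
    4: {";": ("reduce", "tail", 2)},
    5: {";": ("shift", 6)},
    6: {"$end": ("accept",)},
}
GOTO = {(2, "tail"): 5}

# With default reductions, a state's reduce fires on ANY token that has no
# explicit entry, so the lookahead is not checked before reducing.
DEFAULT_REDUCTION = {2: ("reduce", "tail", 0), 4: ("reduce", "tail", 2)}


def check(tokens, use_default_reductions):
    """Return None if the input parses, else (bad_token, expected_tokens)."""
    stack = [0]
    stream = list(tokens) + ["$end"]
    pos = 0
    while True:
        state, tok = stack[-1], stream[pos]
        action = ACTION[state].get(tok)
        if action is None and use_default_reductions:
            action = DEFAULT_REDUCTION.get(state)
        if action is None:
            # The error is reported in the state where the parser got stuck.
            return tok, sorted(ACTION[state])
        if action[0] == "shift":
            stack.append(action[1])
            pos += 1
        elif action[0] == "reduce":
            _, lhs, length = action
            if length:
                del stack[-length:]
            stack.append(GOTO[(stack[-1], lhs)])
        else:  # accept
            return None
```

For the token sequence IMPLICIT NONE ? ( ) ; the call check(tokens, True) reduces the empty tail first and then gets stuck in state 5, where only ';' is expected. The call check(tokens, False) stops in state 2 and reports both '(' and ';'.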


Disabling default actions for the command 'fcheck' is fairly simple,
just a button click in Syntaxis, but at the moment I can't tell
how many error messages would improve, whereas a parsing table
increase (50%) would be certain. The command 'fcheck' can be found at
https://github.com/drikosev/Fortran
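
To illustrate where the space saving comes from, here is a rough sketch of yacc-style row compression: the most frequent reduce in a state becomes its default, and only the remaining entries are stored explicitly. The function name compress_row and the sample row are hypothetical, and the real table encoding in Syntaxis may differ.

```python
from collections import Counter


def compress_row(row):
    """Split one state's action row into (explicit_entries, default_reduce).

    `row` maps each lookahead token to an action tuple such as
    ("shift", 7) or ("reduce", "expr", 3).
    """
    reduces = Counter(a for a in row.values() if a[0] == "reduce")
    if not reduces:
        # Nothing to fold away; the row is stored as it is.
        return dict(row), None
    default, _count = reduces.most_common(1)[0]
    explicit = {tok: act for tok, act in row.items() if act != default}
    return explicit, default


# Hypothetical state that reduces on four lookaheads and shifts on one.
R = ("reduce", "expr", 3)
row = {"+": R, "-": R, ")": R, ";": R, "*": ("shift", 7)}
explicit, default = compress_row(row)
print(len(row), "entries before,", len(explicit), "explicit entry plus a default after")
```

The price is the one discussed in this message: once the reduce entries are folded into a default, the row no longer records which lookaheads were valid for it.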

So far, my approach has been that improved diagnostics shouldn't slow
down the processing of correct programs. Is it worthwhile to improve
diagnostics by disabling default actions in a LALR parser?


Thanks,
Ev. Drikos

----------------------------------------------------------------------
$ cat default-actions.f90 && fcheck default-actions.f90
  IMPLICIT NONE ? (type, external)
  PRINT *, "Only ';', not a '(', in the expected tokens in diagnostics."
  END

default-actions.f90:1: error: syntax:Unexpected: '?'. Expected: ";".

Parsed with Errors: default-actions.f90
$
[When yacc was new and everything had to fit in 64K, small parse tables
were important.  Today when people include a megabyte library to get
a four line routine, not so much. -John]



#2938

From: Kaz Kylheku <480-992-1380@kylheku.com>
Date: 2022-03-18 16:47 +0000
Message-ID: <22-03-036@comp.compilers>
In reply to: #2937

On 2022-03-18, Ev. Drikos <drikosev@gmail.com> wrote:
> Hello,
>
> This is mainly a parsing question but it's also Fortran related.
>
> When I run a syntax check with the command 'fcheck' on the code below,
> the error message doesn't contain a '(' in the expected tokens. This
> happens due to default actions, although the parser is basically LALR. A
> pure LALR parser wouldn't make reductions without examining the lookahead.

I think you mean default reductions?

In the case of Yacc, the default action is the body { $$ = $1; }

:)

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal



#2940

From: Thomas Koenig <tkoenig@netcologne.de>
Date: 2022-03-18 18:12 +0000
Message-ID: <22-03-038@comp.compilers>
In reply to: #2937

Ev. Drikos <drikosev@gmail.com> schrieb:

> This is mainly a parsing question but it's also Fortran related.

[...]

> So far, my approach has been that improved diagnostics shouldn't slow
> down the processing of correct programs.

With today's computer speeds, this is likely not a very important
consideration any more.

If you are compiling, it is usually a small fraction of time that
is spent in the parsing, and much more in optimization and code
generation.  An example: Compiling a 50 k line Fortran program with
"gfortran -O2" takes 17.4 seconds on the computer I type this on.
Checking with "gfortran -fsyntax-only" takes 4.2 seconds.  (For
those who want to reproduce: aermod.f90 from the Polyhedron suite).

50k lines for a single source file is already quite a lot (much
longer than most source files for modular programs are likely to
be) and throwing a bit more CPU time at the problem to reduce user
confusion by emitting better error messages is extremely likely
to be a win for the user.  Just be careful to avoid anything
worse than O(n log n) for code size, or somebody will come
along with a test case that takes _really_ long.

(Take the above with a grain of salt for C++ headers.)


> Is it worthwhile to improve
> diagnostics by disabling default actions in a LALR parser?

I would presume so.  Run a few benchmarks and find out.
[In my experience, lexing and optimization take most of the
time, and parsing is insignificant. -John]
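
A minimal sketch of such a benchmark, assuming two builds of the checker are available under the hypothetical names fcheck-default and fcheck-nodefault (neither name comes from this thread). It takes the best wall-clock time over a few runs to reduce noise from other processes.

```python
import subprocess
import time


def best_wall_time(argv, repeats=5):
    """Return the smallest wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded so that terminal I/O does not distort the timing.
        subprocess.run(argv, check=False,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        best = min(best, time.perf_counter() - start)
    return best


# Hypothetical usage, with assumed binary names:
#   t_default = best_wall_time(["fcheck-default", "aermod.f90"])
#   t_nodefault = best_wall_time(["fcheck-nodefault", "aermod.f90"])
#   print(f"with defaults: {t_default:.3f}s  without: {t_nodefault:.3f}s")
```

Process startup is included in each measurement, so a large input such as the file mentioned above gives a more meaningful comparison than a tiny one.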



#2943

From: "Ev. Drikos" <drikosev@gmail.com>
Date: 2022-03-19 19:58 +0200
Message-ID: <22-03-041@comp.compilers>
In reply to: #2940

On 18/03/2022 20:12, Thomas Koenig wrote:
> ...
> If you are compiling, it is usually a small fraction of time that
> is spent in the parsing, and much more in optimization and code
> generation.  An example: Compiling a 50 k line Fortran program with
> "gfortran -O2" takes 17.4 seconds on the computer I type this on.
> Checking with "gfortran -fsyntax-only" takes 4.2 seconds.  (For
> those who want to reproduce: aermod.f90 from the Polyhedron suite).
> ...

Thanks. Just tested this large file and the runtime overhead seems
to be negligible.

I'll likely try the change, but it took me a while to find another case,
with enumerators (which also lack error recovery now). Although my trial
changes added messages for 43 states, some of them are useless, so
this approach seems to be useful mainly for BNF rules with an optional tail.

Unavoidably, a parser/front-end has to do some guessing on error
and this doesn't change easily. So, any improvement without default
state reductions (hello Kaz) will be limited, as in the code below:


-----------------------------------------------------------------------

miniserver:errors suser$ cat enum-1.f90 && fcheck enum-1.f90
         ENUM, BIND(C)
          ENUMERATOR :: RED => 4, BLUE => 9
          ENUMERATOR YELLOW
         END ENUM
         END
enum-1.f90:2: error: syntax:Unexpected: '=>'. Expected: ",", ";", or "=".

Parsed with Errors: enum-1.f90
miniserver:errors suser$
