Is it the job of a parser to validate the input data?

Started by: Roger L Costello <costello@mitre.org>
First post: 2021-08-11 22:24 +0000
Last post: 2021-08-12 09:34 -0400
Articles: 2 (2 participants)

Contents

  Is it the job of a parser to validate the input data? Roger L Costello <costello@mitre.org> - 2021-08-11 22:24 +0000
    Re: Is it the job of a parser to validate the input data? George Neuner <gneuner2@comcast.net> - 2021-08-12 09:34 -0400

#2695 — Is it the job of a parser to validate the input data?

From: Roger L Costello <costello@mitre.org>
Date: 2021-08-11 22:24 +0000
Subject: Is it the job of a parser to validate the input data?
Message-ID: <21-08-010@comp.compilers>

Hello Compiler Experts!

There are many data formats which contain things like this:

A number, N
N occurrences of something

For example, 3 followed by the names of three students:

3
John Doe
Sally Smith
Judy Jones

I have a question about parsing such data. Is it the job of a parser to ensure
that the number of student names matches the number? Or, is it the job of the
parser to merely tokenize whatever is in the input and then create an abstract
syntax tree containing the tokens?

I imagine you will tell me, "it depends". But what is typically the case?

/Roger
[You can indeed do it either way.  I prefer to do the counting in the AST creation
so it can produce errors like "too few names" rather than a generic "syntax error",
although putting it all in the parser makes it more likely that the language you
parse is actually the language you think you're parsing. -John]
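
A minimal sketch of the "count during AST creation" option from the note
above, assuming the input arrives as a list of lines. The names
build_student_list and CountError are hypothetical and do not come from any
parser library; the point is only that the count check can report a specific
message instead of a generic syntax error.

```python
class CountError(ValueError):
    """Raised when the declared count and the number of names disagree."""


def build_student_list(lines):
    """Build a small AST (a dict) from a count line followed by name lines."""
    # Drop blank lines so a trailing newline is not counted as a name.
    items = [line.strip() for line in lines if line.strip()]
    if not items or not items[0].isdecimal():
        raise CountError("expected a count on the first line")

    declared = int(items[0])
    names = items[1:]

    # The check lives in the tree-building step, so the message can say
    # exactly what is wrong.
    if len(names) < declared:
        raise CountError(
            f"too few names: expected {declared}, found {len(names)}")
    if len(names) > declared:
        raise CountError(
            f"too many names: expected {declared}, found {len(names)}")

    return {"count": declared, "names": names}
```

Calling build_student_list(["3", "John Doe", "Sally Smith"]) raises
CountError with the message "too few names: expected 3, found 2".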


#2697

From: George Neuner <gneuner2@comcast.net>
Date: 2021-08-12 09:34 -0400
Message-ID: <21-08-012@comp.compilers>
In reply to: #2695

On Wed, 11 Aug 2021 22:24:49 +0000, Roger L Costello
<costello@mitre.org> wrote:
>There are many data formats which contain things like this:
>
>A number, N
>N occurrences of something
>
>For example, 3 followed by the names of three students:
>
>3
>John Doe
>Sally Smith
>Judy Jones
>
>I have a question about parsing such data. Is it the job of a parser to ensure
>that the number of student names matches the number? Or, is it the job of the
>parser to merely tokenize whatever is in the input and then create an abstract
>syntax tree containing the tokens?
>
>I imagine you will tell me, "it depends". But what is typically the case?

It's the job of a parser to ensure that the input's syntax is correct.
What that means exactly is up to the developer.

If you consider that in your 'language' a list consists of a number
followed by exactly that many strings ... well then you could argue
that the parser should enforce that.

However, as John mentioned, it is often difficult to generate really
meaningful error messages during parsing.  I would contend that in
your example the /syntax/ of lists really is a number followed by
zero or more strings (number string*), and that verifying the string
count is semantics, not syntax.  I believe that, whenever possible,
semantics are best left until after parsing is finished.

YMMV,
George
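
One way to picture the split described in this reply, as a rough sketch
rather than anyone's actual implementation: the parser accepts the grammar
"number string*" and builds a tree, and a separate pass checks the count
afterwards. The names CountedList, parse, and check are hypothetical, and
the one-item-per-line input format is an assumption taken from the example
in the original post.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CountedList:
    """Parse result: the declared count and whatever names followed it."""
    declared: int
    names: List[str]


def parse(text: str) -> CountedList:
    """Syntax only: a number line followed by zero or more string lines."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or not lines[0].isdecimal():
        raise SyntaxError("expected a number on the first line")
    # No count check here; any number of names is syntactically valid.
    return CountedList(declared=int(lines[0]), names=lines[1:])


def check(tree: CountedList) -> List[str]:
    """Semantics: report a count mismatch after parsing has finished."""
    found = len(tree.names)
    if found != tree.declared:
        return [f"declared {tree.declared} names but found {found}"]
    return []
```

With this split, parse("3\nJohn Doe\nSally Smith\n") succeeds, and check on
the result returns a single diagnostic about the mismatch. Returning a list
of messages rather than raising lets a later pass collect several problems
in one run.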
