| From | Christopher F Clark <christopher.f.clark@compiler-resources.com> |
|---|---|
| Newsgroups | comp.compilers |
| Subject | Is it the job of a parser to validate the input data? |
| Date | 2021-08-12 15:10 +0300 |
| Organization | Compilers Central |
| Message-ID | <21-08-011@comp.compilers> |
Roger L Costello <costello@mitre.org> asked:

> There are many data formats which contain things like this:
>
> A number, N
> N occurrences of something
>
> For example, 3 followed by the names of three students:
>
> 3
> John Doe
> Sally Smith
> Judy Jones
>
> I have a question about parsing such data. Is it the job of a parser to ensure
> that the number of student names matches the number? Or, is it the job of the
> parser to merely tokenize whatever is in the input and then create an abstract
> syntax tree containing the tokens?

It is almost always done in the AST creation routines. Not only do you, as our insightful moderator mentioned, generally get better error messages that way, but, curiously, the features involved (extracting a number, turning it into a count, and applying that count to determine how many items a list contains; and yes, those might be 3 distinct operations) have not been implemented in any parser generator or lexer generator that I have ever seen. That's a bizarre omission, particularly since counted lists are a common feature in many languages, such as networking protocols. Doing fixed counts isn't rare, but doing a count held in a "register" or "variable" seems not to be done.

The conversion step should generally be deferred to semantic (aka action) code or a predicate, as the process is messy and best handled by some well-tuned code, not by something a lexer/parser generator just outputs in the hope that it is semantically correct.

I have it (all 3 steps) on my near-endless to-do list to fix that for Yacc++, but it isn't near the top of it.

By the way, when I was working with Michela Becchi on a hardware regular expression engine at Intel, she studied the problem of counted regular expressions and proposed some interesting implementation details for handling them. Anyone interested in high-speed regular expression implementations would be well advised to look up her papers on the topic.
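As a rough illustration of deferring the count check to the AST creation step, here is a minimal Python sketch for the student-list example quoted above. Everything in it is an assumption made for illustration: the names `parse_lines`, `build_student_list`, `StudentList`, and `CountMismatch` are hypothetical and do not come from Yacc++ or any other tool, and the "parser" is just a line splitter standing in for a generated one.

```python
from dataclasses import dataclass


class CountMismatch(ValueError):
    """Hypothetical error type: the declared count and the list disagree."""


@dataclass
class StudentList:
    """Hypothetical AST node for a counted list of student names."""
    declared_count: int
    names: list[str]


def parse_lines(text: str) -> tuple[str, list[str]]:
    """Stand-in for the parser: split the input into a count token and
    name tokens without checking that they agree."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty input: expected a count line")
    return lines[0], lines[1:]


def build_student_list(count_token: str, name_tokens: list[str]) -> StudentList:
    """AST creation routine: this is where the three steps happen."""
    # Steps 1 and 2: extract the number and turn it into a count.
    if not (count_token.isascii() and count_token.isdigit()):
        raise CountMismatch(
            f"expected a non-negative integer count, got {count_token!r}"
        )
    count = int(count_token)
    # Step 3: apply the count to the list that follows it.
    if count != len(name_tokens):
        raise CountMismatch(
            f"header declares {count} name(s) but {len(name_tokens)} follow"
        )
    return StudentList(declared_count=count, names=name_tokens)


if __name__ == "__main__":
    sample = "3\nJohn Doe\nSally Smith\nJudy Jones\n"
    print(build_student_list(*parse_lines(sample)))
```

Because the check lives in ordinary code rather than in the grammar, the error message can name both the declared count and the number of items actually found, which is the kind of diagnostic a purely syntactic failure would not normally give.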
--
******************************************************************************
Chris Clark                  email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc.     Web Site: http://world.std.com/~compres
23 Bailey Rd                 voice: (508) 435-5016
Berlin, MA 01503 USA         twitter: @intel_chris
------------------------------------------------------------------------------