| Date | 2013-08-02 13:54 +0100 |
|---|---|
| Subject | Scanning versus Parsing |
| From | Ian van Breda <igvb@btopenworld.com> |
| Newsgroups | comp.lang.forth |
| Message-ID | <CE216AB0.36B5%igvb@btopenworld.com> |
The terms 'parse' and 'parsing' are widely used in the documentation of both the original ANS Standard of 1994 [1] and the draft Proposal [2]. However, the term 'parsing', in both the treatment of natural language and the theory of computer languages, implies analysis of the structure of a sentence or language construct. In computing it refers to checking that the tokens extracted from the source code of a program follow the grammatical rules of the language [3], [4].

There are many examples in the literature. For example, parsing applies to the 'productions' that form the grammar of a language, such as the one for a while statement in Pascal:

    <while_statement> -> while <boolean_expression> do <statement> ;

This follows the rule that a while statement must begin with a 'while' token, followed by a 'boolean expression'. This in turn is followed by a 'do' token, followed by a 'statement', which itself may be a compound statement.

The universally accepted term for extracting language tokens from the input source code, as elsewhere in computing, is 'scanning'. In Forth this is done by extracting words delimited by spaces, tabs and line endings.

This suggests that the vast majority of the descriptive text in the proposed Standard needs to be changed from variants of 'parse' to their 'scan' equivalents. We cannot change PARSE itself, as it is now cast in stone (the penalty for inaccuracy in setting up the ANS Standard). However, it is possible to adopt SCAN-NAME in place of PARSE-NAME. It would also be useful to have a multi-line SCAN which bypasses comments.

The argument that it has always been done this way is surely not valid in the face of the fundamental usage of the term 'parsing' elsewhere: Forth cannot ignore the outside world and, at least in this case, it is irrefutably common practice to use 'scanning' for this type of process. Forth seems to be unique in using the term 'parsing' instead of 'scanning'.
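To make the distinction concrete, here is a minimal sketch in Python (chosen only for brevity; it is not taken from either Standard or from the cited papers). The function `scan` extracts whitespace-delimited tokens in the Forth manner, while `parse_while` checks that those tokens follow the while production as described in the prose. All function names are hypothetical, and the grammar is deliberately reduced to a toy: a boolean expression or a simple statement is assumed to be any single non-keyword token, and the trailing ';' is left out.

```python
# Toy illustration of scanning versus parsing.
# Assumption: a <boolean_expression> or simple <statement> is one
# non-keyword token; real Pascal is far richer than this.

KEYWORDS = {"while", "do"}


def scan(source):
    """Scanning: extract tokens delimited by spaces, tabs and line endings."""
    return source.split()


def expect(tokens, pos, keyword):
    """Consume one specific keyword token, or raise SyntaxError."""
    if pos >= len(tokens) or tokens[pos] != keyword:
        raise SyntaxError(f"expected '{keyword}' at token {pos}")
    return pos + 1


def expect_simple(tokens, pos, what):
    """Consume one non-keyword token standing in for an expression or statement."""
    if pos >= len(tokens) or tokens[pos] in KEYWORDS:
        raise SyntaxError(f"expected {what} at token {pos}")
    return pos + 1


def parse_statement(tokens, pos):
    """<statement> -> <while_statement> | simple token (toy assumption)."""
    if pos < len(tokens) and tokens[pos] == "while":
        return parse_while(tokens, pos)
    return expect_simple(tokens, pos, "statement")


def parse_while(tokens, pos):
    """Parsing: <while_statement> -> while <boolean_expression> do <statement>"""
    pos = expect(tokens, pos, "while")
    pos = expect_simple(tokens, pos, "boolean expression")
    pos = expect(tokens, pos, "do")
    return parse_statement(tokens, pos)


def is_while_statement(source):
    """True if the whole source is exactly one well-formed while statement."""
    tokens = scan(source)
    try:
        return parse_while(tokens, 0) == len(tokens)
    except SyntaxError:
        return False
```

Here `scan` succeeds on any text at all, including `do x while y`, because it knows nothing about grammar; only `parse_while` can reject that token sequence. In this sketch, what Forth's PARSE-NAME does corresponds to `scan`, not to `parse_while`.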
This terminology problem is highlighted when we try to use Forth to generate parsers for other languages, a task for which Forth works extremely well.

[1] ANS Forth (ANS X3.215-1994 Information Systems Programming Languages FORTH), 1994.
[2] Forth Standards Committee. Forth 200x Draft 11.1. 29th February 2012.
[3] C. N. Fischer, R. K. Cytron and R. J. LeBlanc. Crafting a Compiler. Pearson Education, Inc, 2010.
[4] I. G. van Breda. Building an LR parser for Pascal using Forth. EuroForth 2012.

Ian van Breda