| Date | 2013-08-06 09:24 +0100 |
|---|---|
| Subject | Re: Scanning versus Parsing |
| From | Ian van Breda <igvb@btopenworld.com> |
| Newsgroups | comp.lang.forth |
| Message-ID | <CE267166.3731%igvb@btopenworld.com> |
| References | <CE216AB0.36B5%igvb@btopenworld.com> <8IWdnZJHcowtI2bMnZ2dnUVZ_h2dnZ2d@supernews.com> |
Andrew Haley wrote on 02/08/2013 15:14:
> Ian van Breda <igvb@btopenworld.com> wrote:
>
>> The universally accepted term for extracting language tokens from
>> the input source code, as elsewhere in computing, is 'scanning'. In
>> Forth this is done by extracting words delimited by spaces, tabs and
>> line endings.
>>
>> This suggests that the vast majority of the descriptive text in the
>> proposed Standard needs to be changed from variants of 'parse' to
>> their 'scan' equivalents
>
> No, it doesn't. The term is defined in Section 2.1, Definitions of
> terms, and is consistent with historic Forth usage. There are many
> terms used in Forth which are not consistent with the rest of computer
> science: "word" is one such.

Irrespective of what the standard may say, my dictionary describes parsing as: 'Describe (word) grammatically, stating inflexion, relation to sentence, etc.; resolve (sentence) into component parts of speech and describe them grammatically'.

'Word' does indeed have many meanings in common use, so is not relevant in this context.

>> The argument that it has always been done this way is surely not
>> valid in the face of the fundamental usage of the term, parsing,
>> elsewhere: Forth cannot ignore the outside world and, at least in
>> this case, it is irrefutably common practice to use scanning for
>> this type of process.
>
> I take your point, but it's hardly the job of the committee to
> determine what language Forthers use. We could change to match
> commonplace CS use, at the cost of being inconsistent with all the
> Forth literature. That's not a good trade-off.
>
>> Forth seems to be unique in using the term parsing instead of
>> scanning.
>
> That's true, but Forth uses its own terminology for many things, and I
> don't think we should break with forty years of usage. Vive la
> difference!

Unfortunately, the term 'parsing', as used in Forth (no grammatical context), conflicts with the term as used in other languages and also in the theory of natural languages. I have a number of books on compiling, all of which start with scanning, followed by parsing of the grammar as a separate activity: Gries, Fischer et al., Wait et al., Appel and Jensen et al. typically use a BNF form of grammar, or something similar. This is the only conflict of this type that I can see in either the ANS Standard or the proposed standard.

It was clear to me, in the context of large telescope systems and in the laboratory, that it was necessary to be able to accommodate a variety of languages in the Forth environment, particularly in respect of imaging systems, where a variety of libraries and numerical algorithms is available. One of the main complaints made by astronomers was that Forth cannot be easily integrated with other languages.

However, there is a problem with *other* languages in that they generally use a single stack for both data and return addresses. Forth has a considerable advantage in using separate data and return stacks, particularly in allowing compound definitions to be implemented. However, this is not a problem *if* the other language uses a separate data stack, in which case we can integrate the two approaches. This is a failing in the *implementation* of other languages.

Another advantage of Forth over other operating systems is its use of multi-tasking by its very nature: the original Forth came with multi-tasking as part and parcel of the system, for both terminal and background tasks. In this context, it was easy to implement time-slicing. This allows the system to respond to interrupts very efficiently for both low-level and high-level events. Using a task-specific user table means that the response to interrupts is as good as it gets.

The problem here was to build a compiler for other languages which could be integrated into Forth.

The cornerstones in generating such a compiler are scanning and parsing. The first part, scanning, is particularly easy in Forth, where it is done (mostly) by splitting the source text into 'words', but in languages such as C or Pascal a rather more complicated process is needed: e.g. x=y+z; has to be separated into six tokens, all of them juxtaposed in this case.

The second part is to treat the grammar as a BNF file that can simply be INCLUDEd. The 'productions' are similar to a series of named colon-style definitions, one for each type of 'nonterminal' (left-hand side). It is a bit more complicated than a series of colon definitions in that there may be more than one 'production' for a given definition, and two or more productions may begin with the same phrase.

This can be used to generate parse tables in Forth for a given language that uses a BNF style of grammar. The resulting tables can be used both in Forth itself and on any other platform, and can be used to generate compilers for different languages. The result is both LL and LR compatible: usually a simplified version of LR compatibility is used, but Forth generates fully LR compatible tables.

The problem comes with the use of parsing instead of scanning where the Forth standard meets the other languages head-on. Of course, it sounds better if you use 'parsing' instead of the rather more mundane 'scanning'.
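
To make the scanning contrast above concrete, here is a minimal sketch in Python rather than Forth, purely for illustration. It is not taken from any Forth system or from the compiler described in this post: the names `forth_scan`, `c_scan` and `C_TOKEN` are hypothetical, and the token pattern is a deliberately tiny assumption covering only identifiers, unsigned integers and a few single-character operators.

```python
import re


def forth_scan(source):
    """Forth-style scanning: tokens are runs of non-whitespace characters,
    delimited by spaces, tabs and line endings."""
    return source.split()


# Hypothetical, deliberately small token pattern for a C-like language:
# identifiers, unsigned integers, and single-character operators/punctuation.
C_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|[-+*/=;(){}<>]")


def c_scan(source):
    """C-style scanning: tokens may be juxtaposed with no delimiter between
    them, so the scanner has to recognise where each token ends."""
    tokens = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1  # whitespace separates tokens but is not itself a token
            continue
        match = C_TOKEN.match(source, pos)
        if match is None:
            raise ValueError(
                f"unexpected character {source[pos]!r} at offset {pos}"
            )
        tokens.append(match.group())
        pos = match.end()
    return tokens


print(forth_scan("x=y+z;"))  # a whitespace scanner sees one 'word'
print(c_scan("x=y+z;"))      # a C-style scanner separates six tokens
```

Neither function knows anything about grammar: both only cut the source text into tokens, which is the step the compiler books call scanning. Deciding that the six tokens form an assignment statement is the separate, later step those books call parsing.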