Re: What stage should entities be resolved? Lexical analysis stage? Syntax analysis stage? Semantic analysis stage?

From Hans-Peter Diettrich <DrDiettrich1@netscape.net>
Newsgroups comp.compilers
Subject Re: What stage should entities be resolved? Lexical analysis stage? Syntax analysis stage? Semantic analysis stage?
Date 2022-03-10 09:48 +0100
Organization Compilers Central
Message-ID <22-03-025@comp.compilers>
References <22-03-019@comp.compilers>

On 3/9/22 6:22 PM, Roger L Costello wrote:

> Okay, back to XML. Consider this non-well-formed XML:
> <Publisher>Harper&amp;Row</Publsher>
> (The end-tag is misspelled)
> The &amp; is called an "XML entity." An XML parser will convert it to &. The
> other XML entities are: &lt; ... &gt; ... &quot; ... &apos;
> What stage should the entity &amp; be converted to &?

In other languages digraphs and trigraphs are used as replacements for
special characters. All such character replacements are handled at the
beginning of the character input stage (lexer). In XML they could also be
handled by a preprocessor, which would extend your stages:

      0.  Preprocessor
>    1.  Lexical analysis stage
>    2.  Syntax analysis stage
>    3.  Semantic analysis stage

I prefer to describe/clarify the stages by their inputs and outputs:

A preprocessor reads a stream of characters and outputs a stream of characters.
A lexer reads a character stream and outputs a stream of terminal tokens.
A parser accepts a stream of terminals, adds non-terminals from the
grammar, and outputs e.g. a tree structure.
Semantic analysis can be done during syntax analysis or later.
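
To make the "characters in, tokens out" view concrete, here is a minimal sketch of a lexer for the quoted example that resolves the five predefined entities while it builds text tokens. It is only an illustration under simplifying assumptions: no attributes, comments, or CDATA, and the names `Token`, `lex`, and `resolve_entities` are made up for this post, not taken from any XML library.

```python
import re
from typing import Iterator, NamedTuple

# The five predefined XML entities listed in the quoted post.
ENTITIES = {"amp": "&", "lt": "<", "gt": ">", "quot": '"', "apos": "'"}


class Token(NamedTuple):
    kind: str   # "START", "END" or "TEXT"
    value: str  # tag name, or text with entities already resolved


# Hypothetical, deliberately tiny token grammar: <name>, </name>, or text.
TOKEN_RE = re.compile(
    r"</(?P<end>[^<>\s]+)\s*>|<(?P<start>[^<>/\s]+)\s*>|(?P<text>[^<]+)"
)


def resolve_entities(text: str) -> str:
    """Replace &name; by its character; unknown names are left untouched."""
    return re.sub(r"&(\w+);", lambda m: ENTITIES.get(m.group(1), m.group(0)), text)


def lex(chars: str) -> Iterator[Token]:
    """Lexer: reads a character stream, yields a stream of terminal tokens."""
    pos = 0
    while pos < len(chars):
        m = TOKEN_RE.match(chars, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        if m.group("end"):
            yield Token("END", m.group("end"))
        elif m.group("start"):
            yield Token("START", m.group("start"))
        else:
            # Entities are resolved only after the text token is delimited,
            # so a decoded "<" can no longer be mistaken for markup.
            yield Token("TEXT", resolve_entities(m.group("text")))
        pos = m.end()


if __name__ == "__main__":
    for token in lex("<Publisher>Harper&amp;Row</Publsher>"):
        print(token)
```

Note that the lexer happily emits an END token for the misspelled tag; whether the names match is not its business. Resolving the entity after the token boundaries are known also sidesteps one pitfall of a pure character-level substitution pass, where `&lt;` would turn into a `<` that the lexer then sees as markup.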

> What stage should detect that the <Publisher> start-tag does not have a
> matching end-tag?

As appropriate <g>. What should be the consequence of that mismatch?
It may be a quite harmless typo that can be fixed by auto-correction.
Or, if it matches some previous opening tag, it may indicate a missing
closing tag.
Where in your implementation can you know enough about the possible reasons
for the mismatch? Error handling and helpful error messages are a wide
and stony field <sigh>.
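
One place where enough is known to tell these cases apart is the stack of open elements that a parser keeps anyway. The following is a rough sketch, not a recommendation: `check_nesting` is a hypothetical helper that takes (kind, name) pairs as a lexer might produce them, and the similarity cutoff of 0.8 is an arbitrary assumption.

```python
import difflib


def check_nesting(tokens):
    """tokens: iterable of (kind, name) pairs; only "START" and "END" matter.

    Returns a list of diagnostic strings; an empty list means the tags nest.
    """
    stack = []      # names of the currently open elements
    problems = []
    for kind, name in tokens:
        if kind == "START":
            stack.append(name)
        elif kind == "END":
            if stack and stack[-1] == name:
                stack.pop()
            elif name in stack:
                # Matches an earlier opening tag: inner elements were not closed.
                while stack[-1] != name:
                    problems.append(f"missing </{stack.pop()}> before </{name}>")
                stack.pop()
            elif stack and difflib.get_close_matches(
                name, [stack[-1]], n=1, cutoff=0.8
            ):
                # Close enough to the open element to call it a typo.
                problems.append(f"</{name}> looks like a typo for </{stack[-1]}>")
                stack.pop()
            else:
                problems.append(f"unexpected </{name}>")
    problems.extend(f"<{name}> is never closed" for name in reversed(stack))
    return problems


if __name__ == "__main__":
    example = [("START", "Publisher"), ("TEXT", "Harper&Row"), ("END", "Publsher")]
    print(check_nesting(example))
```

For the quoted example this reports the end-tag as a probable typo, while a sequence like `<a><b></a>` is reported as a missing `</b>`. Real documents will produce cases where both guesses are wrong, which is the stony part.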

IMO it's up to the compiler writer to match the expectations of his
audience for such problems: warning, error, re-sync or abort processing?
Or you leave the handling to some user-controlled compiler flags.
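
As a small sketch of what such a user-controlled choice could look like, the policy can be a value that the end-tag handling consults. Everything here is hypothetical: the `OnMismatch` names, the `handle_end_tag` helper, and the particular meaning given to each policy are just one possible reading of "warning, error, re-sync or abort".

```python
import enum


class OnMismatch(enum.Enum):
    WARN = "warn"    # report, then accept the end-tag as if it had matched
    ERROR = "error"  # report, ignore the stray end-tag and carry on (re-sync)
    ABORT = "abort"  # stop processing at the first mismatch


def handle_end_tag(stack, name, policy, log):
    """Close the innermost open element, applying `policy` on a mismatch.

    stack: list of open element names; log: list collecting (level, message).
    """
    expected = stack[-1] if stack else None
    if name == expected:
        stack.pop()
        return
    message = f"end-tag </{name}> does not match open element {expected!r}"
    if policy is OnMismatch.ABORT:
        raise SyntaxError(message)
    log.append((policy.value, message))
    if policy is OnMismatch.WARN and stack:
        stack.pop()  # auto-correct: treat it as closing the open element
    # ERROR: the stack is left alone, so the element is still open afterwards


if __name__ == "__main__":
    open_elements, log = ["Publisher"], []
    handle_end_tag(open_elements, "Publsher", OnMismatch.WARN, log)
    print(open_elements, log)
```

A command-line flag value such as "warn" can be mapped to the policy with `OnMismatch("warn")`.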


Don't take too seriously what you read about the one and only way to
classify or handle something. For XML (HTML...) you have a choice of DOM
or SAX parsing. Feel free to do it your way, after you have studied the
various approaches and pitfalls, and as long as you can be sure that the
results are correct and acceptable to your boss or users.
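
For a quick feel of the two styles, here is a sketch using the parsers from the Python standard library (`xml.dom.minidom` and `xml.sax`). The helper names `text_via_dom`, `text_via_sax`, and `TextCollector` are invented for this example. In both styles the application already receives the text with `&amp;` resolved, and both reject the misspelled end-tag.

```python
import xml.sax
from xml.dom import minidom
from xml.parsers.expat import ExpatError

GOOD = b"<Publisher>Harper&amp;Row</Publisher>"
BAD = b"<Publisher>Harper&amp;Row</Publsher>"  # misspelled end-tag


def text_via_dom(data):
    """DOM style: build the whole tree first, then walk it."""
    doc = minidom.parseString(data)
    root = doc.documentElement
    return "".join(n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE)


class TextCollector(xml.sax.ContentHandler):
    """SAX style: receive events while the input is being read."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # May be called more than once for one run of text, so collect pieces.
        self.chunks.append(content)


def text_via_sax(data):
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return "".join(handler.chunks)


if __name__ == "__main__":
    print(text_via_dom(GOOD), text_via_sax(GOOD))
    for parse in (text_via_dom, text_via_sax):
        try:
            parse(BAD)
        except (ExpatError, xml.sax.SAXParseException) as exc:
            print(parse.__name__, "rejected the input:", exc)
```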

DoDi
