
Re: XML validation / exception.

From dieter <dieter@handshake.de>
Subject Re: XML validation / exception.
Date Fri, 25 Jan 2013 09:20:05 +0100
Newsgroups comp.lang.python
Message-ID <mailman.1032.1359102020.2939.python-list@python.org>

Andrew Robinson <andrew3@r3dsolutions.com> writes:
> On xml.etree,
> When I scan in a handwritten XML file, and there are mismatched tags -- 
> it will throw an exception.
> and the exception will contain a line number of the closing tag which
> does not have a mate of the same kind.
>
> Is there a way to get the line number of the earlier tag which caused
> the XML parser to know the closing tag was mismatched, so I can narrow
> down the location of the mismatches for a manual repair?

This depends on the parser, and the standard parsers most likely
do not provide it.

To check that opening and closing tags correspond, the parser
must maintain a stack of open tags. Your request can be fulfilled
only if the parser also keeps the associated line numbers in this
stack. I expect that most parsers do not do that.
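
As an illustration of that idea, here is a minimal sketch that keeps
such a stack outside the parser, using the low-level
"xml.parsers.expat" module from the standard library. The function
name "find_unclosed_tag" and the shape of its return value are my own
assumptions for this example. When expat reports an error, the top of
the stack is the innermost tag that was still open, which is usually
the one to inspect first.

```python
import xml.parsers.expat as expat


def find_unclosed_tag(xml_text):
    """Return None if xml_text parses cleanly.

    Otherwise return a dict with the parser's error message, the line
    of the error, and the (name, line) of the innermost tag that was
    still open. Hypothetical helper, not part of any library.
    """
    parser = expat.ParserCreate()
    open_tags = []  # stack of (tag name, line number of the opening tag)

    def on_start(name, attrs):
        # CurrentLineNumber refers to the event being reported.
        open_tags.append((name, parser.CurrentLineNumber))

    def on_end(name):
        # Only called for correctly matched closing tags.
        open_tags.pop()

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end

    try:
        parser.Parse(xml_text, True)
    except expat.ExpatError as err:
        return {
            "message": str(err),
            "error_line": err.lineno,
            "open_tag": open_tags[-1] if open_tags else None,
        }
    return None


if __name__ == "__main__":
    broken = "<root>\n<item>\n<name>x</name>\n</root>\n"
    print(find_unclosed_tag(broken))
```

For the sample document the mismatch is reported at the closing
"root" tag, while the recorded stack points back to the "item" tag
that was never closed.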

Python's "xml" framework is highly modularized, with each component
having only a minimal task. In particular, the parser is responsible
for parsing only: it parses and generates events for what it sees
(opening tag, closing tag, text, comment, entity, error, ...).
The events are consumed by a separate component (if I remember
correctly, a so-called "handler"). That component is responsible
for creating the "ETree" during parsing.
You might be able to provide a replacement for this component
which additionally captures line information for opening tags.
Alternatively, you could provide an alternative parser.
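
A rough sketch of such a replacement component, under my own
assumptions: it uses the standard "xml.sax" interface rather than
the internals of "xml.etree", forwards the events to
"xml.etree.ElementTree.TreeBuilder" to build the tree, and asks the
SAX locator for the line of each opening tag. The names
"LineTrackingBuilder" and "parse_with_lines" are made up for this
example.

```python
import xml.sax
import xml.etree.ElementTree as ET


class LineTrackingBuilder(xml.sax.ContentHandler):
    """Hypothetical handler: builds an ElementTree and records the
    line number of every opening tag."""

    def __init__(self):
        super().__init__()
        self.builder = ET.TreeBuilder()
        self.locator = None
        self.open_tags = []   # stack of (tag name, line of opening tag)
        self.line_of = {}     # Element -> line of its opening tag
        self.root = None

    def setDocumentLocator(self, locator):
        # The SAX parser hands over an object that knows the position.
        self.locator = locator

    def startElement(self, name, attrs):
        line = self.locator.getLineNumber()
        elem = self.builder.start(name, dict(attrs.items()))
        self.line_of[elem] = line
        self.open_tags.append((name, line))

    def characters(self, content):
        self.builder.data(content)

    def endElement(self, name):
        self.builder.end(name)
        self.open_tags.pop()

    def endDocument(self):
        self.root = self.builder.close()


def parse_with_lines(xml_bytes):
    """Return (root, line_of); raise ValueError naming the innermost
    open tag if the document is not well formed."""
    handler = LineTrackingBuilder()
    try:
        xml.sax.parseString(xml_bytes, handler)
    except xml.sax.SAXParseException as err:
        if handler.open_tags:
            name, line = handler.open_tags[-1]
            hint = "innermost open tag <%s> opened at line %d" % (name, line)
        else:
            hint = "no tag was open"
        raise ValueError(
            "error at line %d; %s" % (err.getLineNumber(), hint)
        ) from err
    return handler.root, handler.line_of
```

Calling parse_with_lines(b"<root>\n<item>\n</root>") should then raise
a ValueError that names both the line of the failing closing tag and
the line where "item" was opened. The line numbers are kept in a
separate dictionary because the tree elements themselves are left
unchanged.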
