


Re: xml data or other?

Date 2012-11-13 13:01 -0500
From Dave Angel <d@davea.name>
Subject Re: xml data or other?
References <509CFD13.9080206@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.3638.1352829700.27098.python-list@python.org>



On 11/09/2012 07:54 AM, Artie Ziff wrote:
> Hello,
>
> I want to process XML-like data like this:
>
> <testname=ltpacpi.sh>
>     <description>
>         ACPI (Advanced Control Power & Integration) testscript for 2.5
> kernels.
>
>     <\description>
>     <test_location>
>         ltp/testcases/kernel/device-drivers/acpi/ltpacpi.sh
>     <\test_location>
> <\testname>
> <snip...>
>
>
> Is there a name for the format above (perhaps xhtml)?

The only word I can think of is "broken."  XML, HTML, and XHTML all
use forward slashes in closing tags, not backslashes.

> I'd like to find a python module that can translate it to proper xml.
> Does one exist? etree?
>

I think you've already figured it out.  Just take your description and
turn it into Python.  In other words, replace all "<\" with "</" and
perhaps " \>" with " />", although your example doesn't happen to have
any of the latter.  Tack an XML header on, and try to parse it with etree.
If that fails, have someone fix the data manually.
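
A rough sketch of that approach, using only the standard library. The
function name try_repair and the choice of a UTF-8 header are my own
assumptions for illustration, not something from your data; adjust the
encoding to whatever the real files use.

```python
import xml.etree.ElementTree as ET

# Assumed header; the real encoding of the input is unknown.
XML_HEADER = '<?xml version="1.0" encoding="UTF-8"?>\n'


def try_repair(text):
    """Return the parsed root element, or None if the text still isn't XML."""
    # Turn backslash-style closing and self-closing tags into real ones.
    fixed = text.replace("<\\", "</").replace("\\>", "/>")
    try:
        # Parse bytes so the encoding named in the header is honored.
        return ET.fromstring((XML_HEADER + fixed).encode("utf-8"))
    except ET.ParseError:
        # Still broken: hand it to a human, or fix the generator upstream.
        return None


if __name__ == "__main__":
    root = try_repair("<test>\n  <name>ltpacpi.sh<\\name>\n<\\test>")
    print(root.find("name").text)
```

Note that the slash fix alone will not rescue the sample as posted: a
tag like <testname=ltpacpi.sh> and a bare "&" in the text are both
rejected by the parser, so try_repair returns None for it.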

Or better, fix the program upstream that's creating this mess.  There
isn't a reliable way to "fix" all the possible broken xml it might be
creating, without reverse engineering it.



-- 

DaveA
