

Groups > comp.lang.python > #100718 > unrolled thread

Newbie XML problem

Started by: KP <kai.peters@gmail.com>
First post: 2015-12-21 20:29 -0800
Last post: 2015-12-22 11:03 -0800
Articles: 4 (3 participants)



Contents

  Newbie XML problem KP <kai.peters@gmail.com> - 2015-12-21 20:29 -0800
    Re: Newbie XML problem Miki Tebeka <miki.tebeka@gmail.com> - 2015-12-21 21:49 -0800
    Re: Newbie XML problem jmp <jeanmichel@sequans.com> - 2015-12-22 13:27 +0100
    Re: Newbie XML problem KP <kai.peters@gmail.com> - 2015-12-22 11:03 -0800

#100718 — Newbie XML problem

From: KP <kai.peters@gmail.com>
Date: 2015-12-21 20:29 -0800
Subject: Newbie XML problem
Message-ID: <3d2e5064-9cc0-43de-a708-faf528a795ca@googlegroups.com>

From my first foray into XML with Python:

I would like to retrieve this list from the XML upon searching for the 'config' with id attribute = 'B'


config = {id: 1, canvas: (3840, 1024), comment: "a comment",
             {id: 4, gate: 3, (0,    0, 1280, 1024)},   
             {id: 5, gate: 2, (1280, 0, 2560, 1024)},
             {id: 6, gate: 1, (2560, 0, 3840, 1024)}}

I have started to use this code - but this is beginning to feel very non-elegant; not the cool Python code I usually see...

import xml.etree.ElementTree as ET

tree   = ET.parse('master.xml')
master = tree.getroot()

for config in master:
    if config.attrib['id'] == 'B':
        ...

Thanks for any help!


<?xml version="1.0" encoding="UTF-8"?>

<master>
	<config id="A">
		<canvas>3840,1024</canvas>
		<comment>"bla"</comment>
		<panel>
			<id>1</id>
			<gate>6</gate>
			<coordinates>0,0,1280,1024</coordinates>
		</panel>
		<panel>
			<id>2</id>
			<gate>5</gate>
			<coordinates>1280,0,2560,1024</coordinates>
		</panel>
		<panel>
			<id>3</id>
			<gate>4</gate>
			<coordinates>2560,0,3840,1024</coordinates>
		</panel>
	</config>
	<config id="B">
		<canvas>3840,1024</canvas>
		<comment>"a comment"</comment>
		<panel>
			<id>4</id>
			<gate>3</gate>
			<coordinates>0,0,1280,1024</coordinates>
		</panel>
		<panel>
			<id>5</id>
			<gate>2</gate>
			<coordinates>1280,0,2560,1024</coordinates>
		</panel>
		<panel>
			<id>6</id>
			<gate>1</gate>
			<coordinates>2560,0,3840,1024</coordinates>
		</panel>
	</config>
</master>



#100719

From: Miki Tebeka <miki.tebeka@gmail.com>
Date: 2015-12-21 21:49 -0800
Message-ID: <68e4a162-235a-4b84-9ea6-a3df001facbb@googlegroups.com>
In reply to: #100718

Hi,

> config = {id: 1, canvas: (3840, 1024), comment: "a comment",
>              {id: 4, gate: 3, (0,    0, 1280, 1024)},   
>              {id: 5, gate: 2, (1280, 0, 2560, 1024)},
>              {id: 6, gate: 1, (2560, 0, 3840, 1024)}}
This is not valid Python. Are you trying to have a list of dicts?
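
For reference, one valid way to write such a structure is a dict that holds a list of per-panel dicts. This is only a guess at the intended shape: the key names "panels" and "coordinates" and the use of 'B' as the config id are assumptions, not something stated in the original post.

```python
# Hypothetical target structure; the key names "panels" and "coordinates"
# and the id value "B" are assumed for illustration.
config = {
    "id": "B",
    "canvas": (3840, 1024),
    "comment": "a comment",
    "panels": [
        {"id": 4, "gate": 3, "coordinates": (0, 0, 1280, 1024)},
        {"id": 5, "gate": 2, "coordinates": (1280, 0, 2560, 1024)},
        {"id": 6, "gate": 1, "coordinates": (2560, 0, 3840, 1024)},
    ],
}
```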

> I have started to use this code - but this is beginning to feel very non-elegant; not the cool Python code I usually see...
ElementTree supports a subset of XPath; using this and a list comprehension, you can do things a little more easily. For example:

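    import xml.etree.ElementTree as ET
    root = ET.parse('master.xml').getroot()
    cfg = root.find('config[@id="B"]')  # the <config> whose id attribute is "B"
    panels = []
    for panel in cfg.findall('panel'):
        panels.append([{elem.tag: elem.text} for elem in panel])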

You'll need to do some parsing for the coordinates and handle canvas and comments separately.
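
A minimal sketch of that remaining parsing, assuming the desired result is one dict per config with a list of panel dicts. The helper names `int_tuple` and `load_config` and the key names "panels" and "coordinates" are made up for illustration, and the XML is a trimmed inline copy of the posted file so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted XML (config "B" only) so the sketch is self-contained.
XML = """<master>
  <config id="B">
    <canvas>3840,1024</canvas>
    <comment>"a comment"</comment>
    <panel><id>4</id><gate>3</gate><coordinates>0,0,1280,1024</coordinates></panel>
    <panel><id>5</id><gate>2</gate><coordinates>1280,0,2560,1024</coordinates></panel>
    <panel><id>6</id><gate>1</gate><coordinates>2560,0,3840,1024</coordinates></panel>
  </config>
</master>"""


def int_tuple(text):
    """Turn a string such as '0,0,1280,1024' into (0, 0, 1280, 1024)."""
    return tuple(int(part) for part in text.split(","))


def load_config(root, config_id):
    """Hypothetical helper: build a plain dict for one <config> element."""
    cfg = root.find('config[@id="%s"]' % config_id)
    if cfg is None:
        raise KeyError(config_id)
    return {
        "id": cfg.get("id"),
        "canvas": int_tuple(cfg.findtext("canvas")),
        # The posted comment text contains literal double quotes; strip them.
        "comment": cfg.findtext("comment").strip('"'),
        "panels": [
            {
                "id": int(panel.findtext("id")),
                "gate": int(panel.findtext("gate")),
                "coordinates": int_tuple(panel.findtext("coordinates")),
            }
            for panel in cfg.findall("panel")
        ],
    }


config_b = load_config(ET.fromstring(XML), "B")
```

With the real file, `ET.parse('master.xml').getroot()` would take the place of `ET.fromstring(XML)`.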
      
HTH,
Miki



#100726

From: jmp <jeanmichel@sequans.com>
Date: 2015-12-22 13:27 +0100
Message-ID: <mailman.57.1450787261.2237.python-list@python.org>
In reply to: #100718

On 12/22/2015 05:29 AM, KP wrote:
>
>  From my first foray into XML with Python:
>
> I would like to retrieve this list from the XML upon searching for the 'config' with id attribute = 'B'
>
>
> config = {id: 1, canvas: (3840, 1024), comment: "a comment",
>               {id: 4, gate: 3, (0,    0, 1280, 1024)},
>               {id: 5, gate: 2, (1280, 0, 2560, 1024)},
>               {id: 6, gate: 1, (2560, 0, 3840, 1024)}}
>
> I have started to use this code - but this is beginning to feel very non-elegant; not the cool Python code I usually see...
>
> import xml.etree.ElementTree as ET
>
> tree   = ET.parse('master.xml')
> master = tree.getroot()
>
> for config in master:
>      if config.attrib['id'] == 'B':
>      ...

It depends a lot on

1/ the XML parser you use.
2/ the XML structure.

1/ I'm happily using BeautifulSoup. It is really simple to use and
yields simple code.

2/ Whenever the code gets complicated, it is because the XML is not
properly structured. For instance, in your example 'id' is an attribute
of 'config' nodes, which is fine, but for 'panel' nodes it's a child node.
There's no point in using a node when only one 'id' can be specified.
Filtering by attributes is much easier than by child nodes.
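
To make the attribute versus child-node distinction concrete, here is a small sketch using the standard library's ElementTree rather than BeautifulSoup. Its limited XPath support has an `[@attrib='value']` predicate for attributes and a `[tag='text']` predicate for child text. The inline XML is a trimmed copy of the posted file, cut down so the example runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the posted XML, reduced to what the two lookups need.
XML = """<master>
  <config id="B">
    <panel><id>5</id><coordinates>1280,0,2560,1024</coordinates></panel>
    <panel><id>6</id><coordinates>2560,0,3840,1024</coordinates></panel>
  </config>
</master>"""
root = ET.fromstring(XML)

# Attribute filter: 'id' is an attribute of <config>.
cfg = root.find("config[@id='B']")

# Child-text filter: 'id' is a child node of <panel>.
panel = cfg.find("panel[id='6']")
coords = panel.findtext("coordinates")
```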

Anyway, here's an example using BeautifulSoup (the 'xml' parser needs
lxml to be installed). print is written as a function call so the code
runs on both Python 2.7 and Python 3.

import bs4

with open('test.xml') as f:
    xmlp = bs4.BeautifulSoup(f, 'xml')

# print the canvas of every config
for cfg in xmlp.find_all('config'):
    print(cfg.canvas.text)

# find the coordinates of panel 6 in config B ('id' is a child node here)
xmlp.find('config', id='B').find(
    lambda node: node.name == 'panel' and node.id.text == '6').coordinates.text

# if the panel ids were attributes (this does not match the XML as posted):
# xmlp.find('config', id='B').find('panel', id='6').coordinates.text

If you can change the layout of the XML file, you should; put every
value in an attribute whenever you can:


<config id="A" canvas="3840,1024">
   <comment> comments can span
      on multiple lines, you probably need a node
   </comment>
   <panel id="1" gate="6" coordinates="0,0,1280,1024"/>
</config>


Properly structured xml will yield proper python code.
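
As an illustration, here is a minimal sketch of reading an attribute-based layout with the standard library's ElementTree. The `<master>` wrapper and the shortened comment text are assumptions, and the panel values are those of panel 1 in config A from the original post.

```python
import xml.etree.ElementTree as ET

# Attribute-based layout; the <master> root element is assumed.
XML = """<master>
  <config id="A" canvas="3840,1024">
    <comment>comments can span
      multiple lines</comment>
    <panel id="1" gate="6" coordinates="0,0,1280,1024"/>
  </config>
</master>"""
root = ET.fromstring(XML)

# One path expression selects the config and the panel by attribute.
panel = root.find("config[@id='A']/panel[@id='1']")
coordinates = tuple(int(n) for n in panel.get("coordinates").split(","))
gate = int(panel.get("gate"))
```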

cheers,

JM







#100741

From: KP <kai.peters@gmail.com>
Date: 2015-12-22 11:03 -0800
Message-ID: <ff10cb6b-88f0-4f11-836a-a73cfbaed115@googlegroups.com>
In reply to: #100718

Thank you both - your help is much appreciated!




