| Started by | Tharanga Abeyseela <tharanga.abeyseela@gmail.com> |
|---|---|
| First post | 2012-10-17 16:47 +1100 |
| Last post | 2012-10-17 09:01 +0200 |
| Articles | 3 — 3 participants |
ElementTree Issue - Search and remove elements Tharanga Abeyseela <tharanga.abeyseela@gmail.com> - 2012-10-17 16:47 +1100
Re: ElementTree Issue - Search and remove elements Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-10-17 08:25 +0200
Re: ElementTree Issue - Search and remove elements Stefan Behnel <stefan_ml@behnel.de> - 2012-10-17 09:01 +0200
| From | Tharanga Abeyseela <tharanga.abeyseela@gmail.com> |
|---|---|
| Date | 2012-10-17 16:47 +1100 |
| Subject | ElementTree Issue - Search and remove elements |
| Message-ID | <mailman.2323.1350452831.27098.python-list@python.org> |
Hi Guys,
I need to remove the parent node if a particular match is found.
For example:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
<TVEpisode>
<Provider>0x5</Provider>
<ItemId>http://fxxxxxxl</ItemId>
<Title>WWE</Title>
<SortTitle>WWE </SortTitle>
<Description>WWE</Description>
<IsUserGenerated>false</IsUserGenerated>
<Images>
<Image>
<ImagePurpose>BoxArt</ImagePurpose>
<Url>https://xxxxxx.xx/@006548-thumb.jpg</Url>
</Image>
</Images>
<LastModifiedDate>2012-10-16T00:00:19.814+11:00</LastModifiedDate>
<Genres>
<Genre>xxxxx</Genre>
</Genres>
<ParentalControl>
<System>xxxx</System>
<Rating>M</Rating>
If I find <Rating>NC</Rating>, I need to remove the enclosing <TVEpisode> from
the XML. The feed also contains TV series, movies, and several other item types
(each of which has a Rating element). I need to remove every item whose
<Rating> contains NC.
I'm using the code below.
When I run the following in the Python shell, I can see the result (NC, M, etc.):
>>> x[1].text
'NC'
But when I do this inside the script, I get the following error:
Traceback (most recent call last):
  File "./test.py", line 10, in ?
    x = child.find('Rating').text
AttributeError: 'NoneType' object has no attribute 'text'
How can I use Python to remove the parent node when the string "NC" is
found? I need to do this for all item types (TVEpisode, Movies, TVshow,
etc.), not only for TVEpisode.
#!/usr/bin/env python
import elementtree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for child in root.findall(".//{http://schemas.CCC.com/CCC/2011/06/13/CC}Rating"):
    x = child.find('Rating').text
    if child[1].text == 'NC':
        print "found"
        root.remove('TVEpisode') ?????
tree.write('output.xml')
Really appreciate your thoughts on this.
Thanks in advance,
Tharanga
| From | Alain Ketterlin <alain@dpt-info.u-strasbg.fr> |
|---|---|
| Date | 2012-10-17 08:25 +0200 |
| Message-ID | <87pq4hbonj.fsf@dpt-info.u-strasbg.fr> |
| In reply to | #31469 |
Tharanga Abeyseela <tharanga.abeyseela@gmail.com> writes:
> I need to remove the parent node, if a particular match found.
It looks like you can't get the parent of an Element with elementtree (I
would love to be proven wrong on this).
The solution is to find all nodes that have a Rating (grand-) child, and
then test explicitly for the value you're looking for.
> <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
> <Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
> <TVEpisode>
[...]
> <ParentalControl>
> <System>xxxx</System>
> <Rating>M</Rating>
> for child in root.findall(".//{http://schemas.CCC.com/CCC/2011/06/13/CC}Rating"):
>     x = child.find('Rating').text
>     if child[1].text == 'NC':
>         print "found"
>         root.remove('TVEpisode') ?????
Your code doesn't work because findall() already returns Rating
elements, and these have no Rating child (so your first call to find()
fails, i.e., returns None). Also, list indexes start at 0, btw.
Also, Rating is not a child of TVEpisode, it is a child of
ParentalControl.
Here is my suggestion:
# Find nodes having a ParentalControl child
for child in root.findall(".//*[ParentalControl]"):
    x = child.find("ParentalControl/Rating").text
    if x == "NC":
        ...
Note that a complete XPath implementation would make that simpler: your
query is basically //*[ParentalControl/Rating="NC"]
-- Alain.
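For readers who want to see the suggestion above carried through to the actual removal, here is a minimal sketch. It makes several assumptions that are not in the thread: it targets Python 3 and the standard-library `xml.etree.ElementTree` (the original script uses Python 2 and the standalone `elementtree` package), it assumes the items to delete are direct children of the root `Feed` element as in the posted sample, and it qualifies every tag with the masked placeholder namespace from the question. `SAMPLE` and `remove_rated` are illustrative names only.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace copied from the masked sample in the question.
NS = "http://schemas.xxxx.xx/xx/2011/06/13/xx"

# Hypothetical, shortened feed used only for illustration.
SAMPLE = """<Feed xmlns="http://schemas.xxxx.xx/xx/2011/06/13/xx">
  <TVEpisode>
    <Title>Keep me</Title>
    <ParentalControl><System>xxxx</System><Rating>M</Rating></ParentalControl>
  </TVEpisode>
  <TVEpisode>
    <Title>Drop me</Title>
    <ParentalControl><System>xxxx</System><Rating>NC</Rating></ParentalControl>
  </TVEpisode>
  <Movie>
    <Title>Drop me too</Title>
    <ParentalControl><System>xxxx</System><Rating>NC</Rating></ParentalControl>
  </Movie>
</Feed>"""


def remove_rated(root, rating="NC"):
    """Remove each direct child of root whose ParentalControl/Rating equals rating."""
    path = "{%s}ParentalControl/{%s}Rating" % (NS, NS)
    removed = []
    # Iterate over a copy: removing from root while iterating over it skips items.
    for item in list(root):
        # findtext() returns None when the item has no such Rating element.
        if item.findtext(path) == rating:
            root.remove(item)
            removed.append(item)
    return removed


root = ET.fromstring(SAMPLE)
removed = remove_rated(root)
remaining_titles = [el.findtext("{%s}Title" % NS) for el in root]
print(len(removed), remaining_titles)  # 2 ['Keep me']
```

Because the loop starts from the container and looks down at the rating, it never needs a parent reference. If items can sit deeper than one level below the root, this sketch will miss them, and a parent lookup of some kind becomes necessary.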
| From | Stefan Behnel <stefan_ml@behnel.de> |
|---|---|
| Date | 2012-10-17 09:01 +0200 |
| Message-ID | <mailman.2328.1350457332.27098.python-list@python.org> |
| In reply to | #31475 |
Alain Ketterlin, 17.10.2012 08:25:
> It looks like you can't get the parent of an Element with elementtree (I
> would love to be proven wrong on this).
No, that's by design. ElementTree allows you to reuse subtrees in a
document, for example, which wouldn't work if you enforced a single parent.
Also, keeping parent references out simplifies the tree structure
considerably, saves space and time and all that. ElementTree is really
great for what it does.
If you need to access the parent more often in a read-only tree, you can
quickly build up a back reference dict that maps each Element to its parent
by traversing the tree once.
Alternatively, use lxml.etree, in which Elements have a getparent() method
and in which single parents are enforced (also by design).
Stefan
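The back-reference dict mentioned above could be sketched roughly as follows. This is one possible reading, not code from the thread: it assumes Python 3 with the standard-library `xml.etree.ElementTree`, it uses a made-up, namespace-free sample (`SAMPLE`, with a hypothetical `Group` wrapper) to keep the paths short, and it assumes each `Rating` sits two levels below the item to delete, as in the question. The name `parent_of` is illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample without a namespace, with items at different depths.
SAMPLE = """<Feed>
  <Group>
    <TVEpisode><ParentalControl><Rating>NC</Rating></ParentalControl></TVEpisode>
    <TVEpisode><ParentalControl><Rating>M</Rating></ParentalControl></TVEpisode>
  </Group>
  <Movie><ParentalControl><Rating>NC</Rating></ParentalControl></Movie>
</Feed>"""

root = ET.fromstring(SAMPLE)

# One traversal of the tree: map every element to its parent.
parent_of = {child: parent for parent in root.iter() for child in parent}

# Collect the items first, so the tree is not modified while it is being walked.
doomed = []
for rating in root.iter("Rating"):
    if rating.text == "NC":
        control = parent_of[rating]        # the ParentalControl element
        doomed.append(parent_of[control])  # the TVEpisode, Movie, ... element

# Remove each collected item from its own parent.
for item in doomed:
    parent_of[item].remove(item)

remaining = [el.tag for el in root.iter()]
print(remaining)
```

The dict is only a snapshot: it is built before any removal and is not updated afterwards, which fits the "read-only tree" caveat. Here that is safe because every lookup uses relationships that existed when the map was built and the removed items are not nested inside one another.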