| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!goblin2!goblin.stu.neva.ru!newsfeed.xs4all.nl!newsfeed4.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Date | Thu, 20 Jun 2013 19:30:06 -0600 |
| Subject | Finding all instances of a string in an XML file |
| From | Jason Friedman <jsf80238@gmail.com> |
| To | python-list <python-list@python.org> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.3648.1371778210.3114.python-list@python.org> (permalink) |
I have XML which looks like:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
The "Property X" string appears twice and I want to output the "path"
that leads to all such appearances. In this case the output would be:
LEVEL_1 {}, LEVEL_2 {"ATTR": "hello"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "2"}
LEVEL_1 {}, LEVEL_2 {"ATTR": "goodbye"}, LEVEL_3 {"ATTR": "aloha"}, ATTRIBUTE {"NAME": "Property X", "VALUE": "3"}
My actual XML file is 2000 lines and contains up to 8 levels of nesting.
I have tried this so far (partial code, using the xml.etree.ElementTree
module):
def get_path(data_dictionary, val, path):
    for node in data_dictionary[CHILDREN]:
        if node[CHILDREN]:
            if not path or node[TAG] != path[-1]:
                path.append(node[TAG])
            print(CR + "recursing ...")
            get_path(node, val, path)
        else:
            for k,v in node[ATTRIB].items():
                if v == val:
                    print("path- ",path)
                    print("---- " + node[TAG] + " " + str(node[ATTRIB]))
I'm really not even close to getting the output I am looking for.
Python 3.2.2.
Thank you.
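For reference, here is a minimal sketch of one way the output described above could be produced. It does not use the dictionary conversion from the partial code; instead it walks the xml.etree.ElementTree elements directly and carries the chain of ancestors along, since ElementTree elements do not keep a reference to their parent. The names describe and find_paths are hypothetical helpers made up for this illustration, and the sample document is embedded as bytes only to keep the example self-contained.

```python
import json
import xml.etree.ElementTree as ET

# The sample document from the post, embedded so the sketch is self-contained.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE KMART SYSTEM "my.dtd">
<LEVEL_1>
  <LEVEL_2 ATTR="hello">
    <ATTRIBUTE NAME="Property X" VALUE ="2"/>
  </LEVEL_2>
  <LEVEL_2 ATTR="goodbye">
    <ATTRIBUTE NAME="Property Y" VALUE ="NULL"/>
    <LEVEL_3 ATTR="aloha">
      <ATTRIBUTE NAME="Property X" VALUE ="3"/>
    </LEVEL_3>
    <ATTRIBUTE NAME="Property Z" VALUE ="welcome"/>
  </LEVEL_2>
</LEVEL_1>
"""


def describe(elem):
    """Hypothetical helper: render one element as 'TAG {attributes}'."""
    # json.dumps gives double-quoted keys and values, as in the desired output;
    # sort_keys makes the attribute order deterministic.
    return elem.tag + " " + json.dumps(elem.attrib, sort_keys=True)


def find_paths(elem, val, ancestors=()):
    """Hypothetical helper: yield one path string for every element at or
    below `elem` that has an attribute whose value equals `val`."""
    # Build a new tuple per call so sibling branches never share state.
    path = ancestors + (elem,)
    if val in elem.attrib.values():
        yield ", ".join(describe(e) for e in path)
    # Iterating an Element yields its direct children.
    for child in elem:
        for found in find_paths(child, val, path):
            yield found


root = ET.fromstring(SAMPLE)
for line in find_paths(root, "Property X"):
    print(line)
```

Passing the ancestors down as an immutable tuple avoids the problem in the attempt above, where a single shared list is appended to but never trimmed when the recursion returns. For a real file, ET.parse(filename).getroot() would take the place of ET.fromstring(SAMPLE).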
Finding all instances of a string in an XML file Jason Friedman <jsf80238@gmail.com> - 2013-06-20 19:30 -0600