From: Jussi Piitulainen
Newsgroups: comp.lang.python
Subject: Re: delete from pattern to pattern if it contains match
Date: Mon, 25 Apr 2016 13:24:48 +0300

harirammanohar@gmail.com writes:

> On Monday, April 25, 2016 at 12:47:14 PM UTC+5:30, Jussi Piitulainen wrote:
>> harirammanohar@gmail.com writes:
>>
>> > Hi Jussi,
>> >
>> > i have seen you have written a definition to fulfill the requirement,
>> > can we do this same thing using xml parser, as i have failed to
>> > implement the thing using xml parser of python if the file is having
>> > the content as below...
>> >
>> > > > PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN"
>> > "http://java.sun.com/dtd/web-app_2_3.dtd">
>> >
>> >
>> >
>> > and entire thing works if it has as below:
>> > > >
>> >
>> > what i observe is xml tree parsing is not working if http tags are
>> > there in between web-app...
>>
>> Do you get an error message?
>>
>> My guess is that the parser needs the DTD but cannot access it. There
>> appears to be a DTD at that address, http://java.sun.com/...
>> (it redirects to Oracle, who bought Sun a while ago), but something might
>> prevent the parser from accessing it by default. If so, the details
>> depend on what parser you are trying to use. It may be possible to save
>> that DTD as a local file and point the parser to that.
>>
>> Your problem is morphing rather wildly. A previous version had namespace
>> declarations but no DTD or XSD if I remember right. The initial version
>> wasn't XML at all.
>>
>> If you post (1) an actual, minimal document, (2) the actual Python
>> commands that fail to parse it, and (3) the error message you get,
>> someone will be able to help you. The content of the document need not
>> be more than "hello, world" level. The DOCTYPE declaration and the
>> outermost tags with all their attributes and namespace declarations, if
>> any, are important.
>
> Hi Jussi,
>
> Here is an input file...sample.xml
>
>
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
> http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
> version="3.1">
>
> controller
> com.mycompany.mypackage.ControllerServlet
>
> listOrders
> com.mycompany.myactions.ListOrdersAction
>
>
> saveCustomer
> com.mycompany.myactions.SaveCustomerAction
>
> 5
>
>
>
>
> graph
> /graph
>
>
>
>
> 30
>
>
>
> --------------------------------
> Here is the code:
>
> import xml.etree.ElementTree as ET
> ET.register_namespace("", "http://xmlns.jcp.org/xml/ns/javaee")
> tree = ET.parse('sample.xml')
> root = tree.getroot()
>
> for servlet in root.findall('servlet'):
>     servletname = servlet.find('servlet-name').text
>     if servletname == "controller":
>         root.remove(servlet)
>
> tree.write('output.xml')
>
> This will work if doesnt have below...
>
> xmlns="http://xmlns.jcp.org/xml/ns/javaee"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
> http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"

It's a namespace issue, and registering a default namespace doesn't do
what you hoped. ET.register_namespace only affects the prefixes used
when the tree is written out; it has no effect on how find and findall
match element names. So the search for a plain 'servlet' finds nothing.
It's a frustrating failure mode: no error message, no nothing :)

Try passing a namespace prefix mapping in your method calls, and using
that prefix in element names:

    ns = { 'x' : "http://xmlns.jcp.org/xml/ns/javaee" }

    for servlet in root.findall('x:servlet', ns):
        servletname = servlet.find('x:servlet-name', ns).text

I got this from here:
https://docs.python.org/3/library/xml.etree.elementtree.html#parsing-xml-with-namespaces

Note that the namespace prefix - I chose to use 'x' - has no meaning of
its own. What does the job is the association between the prefix you
use and the URI that is the name of the namespace.
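
For completeness, here is a small self-contained sketch of the whole round trip with the prefix mapping in place. The inline document is a made-up stand-in for sample.xml (I am assuming the usual web-app / servlet / servlet-name layout, and the names NS_URI and SAMPLE are mine), so adjust it to your real file:

```python
import xml.etree.ElementTree as ET

NS_URI = "http://xmlns.jcp.org/xml/ns/javaee"

# Hypothetical stand-in for sample.xml; the element layout is an assumption.
SAMPLE = f"""<web-app xmlns="{NS_URI}" version="3.1">
  <servlet>
    <servlet-name>controller</servlet-name>
  </servlet>
  <servlet>
    <servlet-name>graph</servlet-name>
  </servlet>
</web-app>"""

# Affects serialization only: keeps the output free of ns0: prefixes.
ET.register_namespace("", NS_URI)
root = ET.fromstring(SAMPLE)

# The prefix 'x' is arbitrary; only the URI it maps to matters.
ns = {"x": NS_URI}

# An unprefixed name matches elements in no namespace, so this finds nothing.
unqualified = root.findall("servlet")

# findall returns a list, so removing children inside the loop is safe.
for servlet in root.findall("x:servlet", ns):
    if servlet.findtext("x:servlet-name", namespaces=ns) == "controller":
        root.remove(servlet)

remaining = [s.findtext("x:servlet-name", namespaces=ns)
             for s in root.findall("x:servlet", ns)]
output = ET.tostring(root, encoding="unicode")

print(unqualified)
print(remaining)
print(output)
```

With a real file you would use ET.parse('sample.xml') and tree.write('output.xml') as in your code. The register_namespace call is still worth keeping, since it is what makes the written file use a plain xmlns="..." declaration instead of generated prefixes.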