Groups > comp.lang.python > #107246 > unrolled thread

delete from pattern to pattern if it contains match

Started by harirammanohar@gmail.com
First post 2016-04-18 00:07 -0700
Last post 2016-04-25 10:19 +0000
Articles 9 on this page of 29 — 4 participants

Contents

  delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-18 00:07 -0700
    RE: delete from pattern to pattern if it contains match Joaquin Alzola <Joaquin.Alzola@lebara.com> - 2016-04-18 07:49 +0000
      Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-18 01:52 -0700
      Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-18 21:01 -0700
    Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-21 03:17 -0700
      Re: delete from pattern to pattern if it contains match Peter Otten <__peter__@web.de> - 2016-04-21 13:24 +0200
        Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-22 02:00 -0700
          Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-22 02:14 -0700
            Re: delete from pattern to pattern if it contains match Peter Otten <__peter__@web.de> - 2016-04-22 11:50 +0200
              Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-24 23:24 -0700
      Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-21 16:32 +0300
        Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-22 01:59 -0700
          Re: delete from pattern to pattern if it contains match Peter Otten <__peter__@web.de> - 2016-04-22 11:24 +0200
            Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-22 14:10 +0300
              Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-24 23:29 -0700
                Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-25 10:17 +0300
                  Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-25 02:49 -0700
                    Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-25 02:53 -0700
                      Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-25 13:37 +0300
                    Re: delete from pattern to pattern if it contains match Peter Otten <__peter__@web.de> - 2016-04-25 12:13 +0200
                      Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-25 13:39 +0300
                        Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-25 04:02 -0700
                          Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-25 14:28 +0300
                            Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-25 04:40 -0700
                              Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-25 15:00 +0300
                              Re: delete from pattern to pattern if it contains match Peter Otten <__peter__@web.de> - 2016-04-25 14:33 +0200
                                Re: delete from pattern to pattern if it contains match harirammanohar@gmail.com - 2016-04-26 03:31 -0700
                    Re: delete from pattern to pattern if it contains match Jussi Piitulainen <jussi.piitulainen@helsinki.fi> - 2016-04-25 13:24 +0300
                    RE: delete from pattern to pattern if it contains match Joaquin Alzola <Joaquin.Alzola@lebara.com> - 2016-04-25 10:19 +0000

#107596

From Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Date 2016-04-25 13:39 +0300
Message-ID <lf5h9eqrmzx.fsf@ling.helsinki.fi>
In reply to #107593

Peter Otten writes:

> harirammanohar@gmail.com wrote:
>
>> Here is the code:
>
> Finally ;)

:)



#107597

From harirammanohar@gmail.com
Date 2016-04-25 04:02 -0700
Message-ID <95f0d9a7-69ff-43bf-a856-8fa62fe8a985@googlegroups.com>
In reply to #107596

On Monday, April 25, 2016 at 4:09:26 PM UTC+5:30, Jussi Piitulainen wrote:
> Peter Otten writes:
> 
> > harirammanohar@gmail.com wrote:
> >
> >> Here is the code:
> >
> > Finally ;)
> 
> :)

The namespace issue can be resolved by registering the namespace; I have no issue with that. My only concern is that the XML parser has no effect when the http things are added...



#107598

From Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Date 2016-04-25 14:28 +0300
Message-ID <lf5d1perkqn.fsf@ling.helsinki.fi>
In reply to #107597

harirammanohar@gmail.com writes:

> On Monday, April 25, 2016 at 4:09:26 PM UTC+5:30, Jussi Piitulainen wrote:
>> Peter Otten writes:
>> 
>> > harirammanohar@gmail.com wrote:
>> >
>> >> Here is the code:
>> >
>> > Finally ;)
>> 
>> :)
>
> name space issue can be resolved registering name space i have no
> issue with that, only concern is xml parser has no effect when http
> things are added...

No, the parser works fine. Your attempt to register a default namespace
didn't work. Those "http things" *are* the namespace issue!

The following version of your code works. *Try it.* It finds the servlet
element in the document object, removes it, and writes out XML text
without the servlet element. (It seems to invent another namespace
prefix. That doesn't change the meaning of the document.)

import xml.etree.ElementTree as ET

ns = { 'x' : "http://xmlns.jcp.org/xml/ns/javaee" }

tree = ET.parse('sample.xml')
root = tree.getroot()

for servlet in root.findall('x:servlet', ns):
    servletname = servlet.find('x:servlet-name', ns).text
    if servletname == "controller":
        root.remove(servlet)

tree.write('output.xml')
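
To try the approach above without a sample.xml on disk, here is a minimal self-contained sketch. The inline document and the helper name remove_servlet are made up for illustration; only the namespace URI and the element names come from the thread.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for sample.xml, trimmed to the essentials.
SAMPLE = """
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" version="3.1">
  <servlet>
    <servlet-name>controller</servlet-name>
  </servlet>
  <servlet>
    <servlet-name>graph</servlet-name>
  </servlet>
</web-app>
"""

# The prefix 'x' is arbitrary; it only has to map to the right URI.
NS = {"x": "http://xmlns.jcp.org/xml/ns/javaee"}


def remove_servlet(root, name):
    """Remove each <servlet> child of root whose <servlet-name> equals name."""
    removed = 0
    # findall() returns a list, so removing children while looping is safe.
    for servlet in root.findall("x:servlet", NS):
        if servlet.findtext("x:servlet-name", namespaces=NS) == name:
            root.remove(servlet)
            removed += 1
    return removed


root = ET.fromstring(SAMPLE.strip())
removed = remove_servlet(root, "controller")
remaining = [s.findtext("x:servlet-name", namespaces=NS)
             for s in root.findall("x:servlet", NS)]
print(removed, remaining)
```

Using findtext() instead of find(...).text avoids an AttributeError if some servlet element happens to lack a servlet-name child.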



#107599

From harirammanohar@gmail.com
Date 2016-04-25 04:40 -0700
Message-ID <15f2e9ff-7624-4886-bcd9-c3e9d21db328@googlegroups.com>
In reply to #107598

On Monday, April 25, 2016 at 4:58:15 PM UTC+5:30, Jussi Piitulainen wrote:
> harirammanohar@gmail.com writes:
> 
> > On Monday, April 25, 2016 at 4:09:26 PM UTC+5:30, Jussi Piitulainen wrote:
> >> Peter Otten writes:
> >> 
> >> > harirammanohar@gmail.com wrote:
> >> >
> >> >> Here is the code:
> >> >
> >> > Finally ;)
> >> 
> >> :)
> >
> > name space issue can be resolved registering name space i have no
> > issue with that, only concern is xml parser has no effect when http
> > things are added...
> 
> No, the parser works fine. Your attempt to register a default namespace
> didn't work. Those "http things" *are* the namespace issue!
> 
> The following version of your code works. *Try it.* It finds the servlet
> element in the document object, removes it, and writes out XML text
> without the servlet element. (It seems to invent another namespace
> prefix. That doesn't change the meaning of the document.)
> 
> import xml.etree.ElementTree as ET
> 
> ns = { 'x' : "http://xmlns.jcp.org/xml/ns/javaee" }
> 
> tree = ET.parse('sample.xml')
> root = tree.getroot()
> 
> for servlet in root.findall('x:servlet', ns):
>     servletname = servlet.find('x:servlet-name', ns).text
>     if servletname == "controller":
>         root.remove(servlet)
> 
> tree.write('output.xml')

Yup, it works well if I include the register_namespace call; otherwise I get an ns0: prefix on every element in output.xml.

But it is removing the top line:
<?xml version="1.0" encoding="ISO-8859-1"?>



#107601

From Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Date 2016-04-25 15:00 +0300
Message-ID <lf58u01sxti.fsf@ling.helsinki.fi>
In reply to #107599

harirammanohar@gmail.com writes:

> On Monday, April 25, 2016 at 4:58:15 PM UTC+5:30, Jussi Piitulainen wrote:
>> harirammanohar@gmail.com writes:
>> 
>> > On Monday, April 25, 2016 at 4:09:26 PM UTC+5:30, Jussi Piitulainen wrote:
>> >> Peter Otten writes:
>> >> 
>> >> > harirammanohar@gmail.com wrote:
>> >> >
>> >> >> Here is the code:
>> >> >
>> >> > Finally ;)
>> >> 
>> >> :)
>> >
>> > name space issue can be resolved registering name space i have no
>> > issue with that, only concern is xml parser has no effect when http
>> > things are added...
>> 
>> No, the parser works fine. Your attempt to register a default namespace
>> didn't work. Those "http things" *are* the namespace issue!
>> 
>> The following version of your code works. *Try it.* It finds the servlet
>> element in the document object, removes it, and writes out XML text
>> without the servlet element. (It seems to invent another namespace
>> prefix. That doesn't change the meaning of the document.)
>> 
>> import xml.etree.ElementTree as ET
>> 
>> ns = { 'x' : "http://xmlns.jcp.org/xml/ns/javaee" }
>> 
>> tree = ET.parse('sample.xml')
>> root = tree.getroot()
>> 
>> for servlet in root.findall('x:servlet', ns):
>>     servletname = servlet.find('x:servlet-name', ns).text
>>     if servletname == "controller":
>>         root.remove(servlet)
>> 
>> tree.write('output.xml')
>
> yup its working well if i include register namespace, else i am
> getting ns:0 in every line of output.xml.

That's a namespace prefix for each element name that is in the default
namespace. If the ET.register_namespace has the effect of making that
the default namespace in the output, fine, you can use it.

The important thing is that you can read your output.xml back in, using
the XML parser, and it has the intended meaning.

> But its removing top line
> <?xml version="1.0" encoding="ISO-8859-1"?>

Not a problem. You can still read your output.xml back in, using the XML
parser, and it will have the same meaning as it would have had with this
declaration.
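
The point about meaning can be checked with a small sketch. The tiny document and the helper name tags are assumptions for illustration. It serializes the same tree once with an invented prefix and once after ET.register_namespace("", ...), then parses both back and compares the element names. Note that register_namespace changes a module-wide registry and only affects how output is written, not how find() and findall() match.

```python
import xml.etree.ElementTree as ET

URI = "http://xmlns.jcp.org/xml/ns/javaee"
doc = '<web-app xmlns="%s"><session-config/></web-app>' % URI

root = ET.fromstring(doc)

# Without a registered prefix the serializer invents one (ns0, ns1, ...).
prefixed = ET.tostring(root, encoding="unicode")

# Registering the empty prefix makes this URI the default namespace on output.
ET.register_namespace("", URI)
unprefixed = ET.tostring(root, encoding="unicode")


def tags(text):
    """Return the fully qualified tag names found when parsing text."""
    return [elem.tag for elem in ET.fromstring(text).iter()]


same = tags(prefixed) == tags(unprefixed) == tags(doc)
print(prefixed)
print(unprefixed)
print(same)
```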



#107602

From Peter Otten <__peter__@web.de>
Date 2016-04-25 14:33 +0200
Message-ID <mailman.72.1461587643.32212.python-list@python.org>
In reply to #107599

harirammanohar@gmail.com wrote:

>> tree.write('output.xml')
> 
> yup its working well if i include register namespace, else i am getting
> ns:0  in every line of output.xml.
> 
> But its removing top line
> <?xml version="1.0" encoding="ISO-8859-1"?>

The write() method allows you to specify an encoding and/or require an xml 
declaration:

https://docs.python.org/dev/library/xml.etree.elementtree.html#xml.etree.ElementTree.ElementTree.write
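
A minimal sketch of those two write() arguments, assuming a throwaway document and an in-memory buffer in place of output.xml (both are made up for illustration):

```python
import io
import xml.etree.ElementTree as ET

root = ET.fromstring('<web-app version="3.1"><session-config/></web-app>')
tree = ET.ElementTree(root)

buffer = io.BytesIO()  # stands in for the file 'output.xml'
tree.write(buffer, encoding="ISO-8859-1", xml_declaration=True)

# The declaration is the first line of the serialized bytes.
first_line = buffer.getvalue().splitlines()[0].decode("ISO-8859-1")
print(first_line)
```

The exact letter case of the encoding name in the declaration may differ between Python versions, so it is safer not to rely on it.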



#107647

From harirammanohar@gmail.com
Date 2016-04-26 03:31 -0700
Message-ID <32834ac1-cae6-4783-a3a9-f8a4ba5ad77d@googlegroups.com>
In reply to #107602

On Monday, April 25, 2016 at 6:04:24 PM UTC+5:30, Peter Otten wrote:
> harirammanohar@gmail.com wrote:
> 
> >> tree.write('output.xml')
> > 
> > yup its working well if i include register namespace, else i am getting
> > ns:0  in every line of output.xml.
> > 
> > But its removing top line
> > <?xml version="1.0" encoding="ISO-8859-1"?>
> 
> The write() method allows you to specify an encoding and/or require an xml 
> declaration:
> 
> https://docs.python.org/dev/library/xml.etree.elementtree.html#xml.etree.ElementTree.ElementTree.write

Hi Peter,

Thanks for the reminder about the basic write method syntax... it's working :)


<?xml version='1.0' encoding='iso-8859-1'?>
<web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="3.1" xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee                        http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd">
    <servlet-mapping>
      <servlet-name>graph</servlet-name>
      <url-pattern>/graph</url-pattern>
    </servlet-mapping>


    <session-config>
      <session-timeout>30</session-timeout>
    </session-config>


Here is the change:

tree.write('output.xml',encoding="ISO-8859-1",xml_declaration=True)

Thank you all, especially Jussi and Peter.



#107594

From Jussi Piitulainen <jussi.piitulainen@helsinki.fi>
Date 2016-04-25 13:24 +0300
Message-ID <lf5poternnz.fsf@ling.helsinki.fi>
In reply to #107589

harirammanohar@gmail.com writes:

> On Monday, April 25, 2016 at 12:47:14 PM UTC+5:30, Jussi Piitulainen wrote:
>> harirammanohar@gmail.com writes:
>> 
>> > Hi Jussi,
>> >
>> > i have seen you have written a definition to fulfill the requirement,
>> > can we do this same thing using xml parser, as i have failed to
>> > implement the thing using xml parser of python if the file is having
>> > the content as below...
>> >
>> > <!DOCTYPE web-app 
>> >     PUBLIC "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" 
>> >     "http://java.sun.com/dtd/web-app_2_3.dtd">
>> >
>> > <web-app>
>> >
>> > and entire thing works if it has as below:
>> > <!DOCTYPE web-app 
>> > <web-app>
>> >
>> > what i observe is xml tree parsing is not working if http tags are
>> > there in between web-app...
>> 
>> Do you get an error message?
>> 
>> My guess is that the parser needs the DTD but cannot access it. There
>> appears to be a DTD at that address, http://java.sun.com/... (it
>> redirects to Oracle, who bought Sun a while ago), but something might
>> prevent the parser from accessing it by default. If so, the details
>> depend on what parser you are trying to use. It may be possible to save
>> that DTD as a local file and point the parser to that.
>> 
>> Your problem is morphing rather wildly. A previous version had namespace
>> declarations but no DTD or XSD if I remember right. The initial version
>> wasn't XML at all.
>> 
>> If you post (1) an actual, minimal document, (2) the actual Python
>> commands that fail to parse it, and (3) the error message you get,
>> someone will be able to help you. The content of the document need not
>> be more than "hello, world" level. The DOCTYPE declaration and the
>> outermost tags with all their attributes and namespace declarations, if
>> any, are important.
>
> Hi Jussi,
>
> Here is an input file...sample.xml
>
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <web-app xmlns="http://xmlns.jcp.org/xml/ns/javaee"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>   xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
>                       http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"
>   version="3.1">
>     <servlet>
>       <servlet-name>controller</servlet-name>
>       <servlet-class>com.mycompany.mypackage.ControllerServlet</servlet-class>
>       <init-param>
>         <param-name>listOrders</param-name>
>         <param-value>com.mycompany.myactions.ListOrdersAction</param-value>
>       </init-param>
>       <init-param>
>         <param-name>saveCustomer</param-name>
>         <param-value>com.mycompany.myactions.SaveCustomerAction</param-value>
>       </init-param>
>       <load-on-startup>5</load-on-startup>
>     </servlet>
>
>
>     <servlet-mapping>
>       <servlet-name>graph</servlet-name>
>       <url-pattern>/graph</url-pattern>
>     </servlet-mapping>
>
>
>     <session-config>
>       <session-timeout>30</session-timeout>
>     </session-config>
> </web-app>
>
> --------------------------------
> Here is the code:
>
> import xml.etree.ElementTree as ET
> ET.register_namespace("", "http://xmlns.jcp.org/xml/ns/javaee")
> tree = ET.parse('sample.xml')
> root = tree.getroot()
>
> for servlet in root.findall('servlet'):
>         servletname = servlet.find('servlet-name').text
>         if servletname == "controller":
>                 root.remove(servlet)
>
> tree.write('output.xml')
>
> This will work if <web-app> </web-app> doesnt have below...
>
> xmlns="http://xmlns.jcp.org/xml/ns/javaee"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>   xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
>                       http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"

It's a namespace issue, and your method of registering a default
namespace isn't working. It's a frustrating failure mode: no error
message, no nothing :)

Try defining a namespace prefix in your method calls, and using that
prefix in element names:

ns = { 'x' : "http://xmlns.jcp.org/xml/ns/javaee" }

for servlet in root.findall('x:servlet', ns):
    servletname = servlet.find('x:servlet-name', ns).text

I got this from here:
https://docs.python.org/3/library/xml.etree.elementtree.html#parsing-xml-with-namespaces

Note that the namespace prefix - I chose to use 'x' - has no meaning.
It's the association of the prefix that you use to the URI that is the
name of the namespace that does the job.
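
A small sketch of that last point, using a made-up one-servlet document. The second prefix 'jee' is invented here purely to show that any prefix works as long as it maps to the same URI; the braces form is the fully qualified tag name that ElementTree stores internally.

```python
import xml.etree.ElementTree as ET

URI = "http://xmlns.jcp.org/xml/ns/javaee"
root = ET.fromstring(
    '<web-app xmlns="%s"><servlet><servlet-name>controller</servlet-name>'
    '</servlet></web-app>' % URI)

# A bare name is looked up in no namespace, so it silently matches nothing.
bare = root.findall("servlet")

# Two different prefixes mapped to the same URI find the same element.
with_x = root.findall("x:servlet", {"x": URI})
with_other = root.findall("jee:servlet", {"jee": URI})

# The same search spelled out with the URI in braces, no mapping needed.
qualified = root.findall("{%s}servlet" % URI)

print(len(bare), with_x == with_other == qualified, with_x[0].tag)
```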



#107600

From Joaquin Alzola <Joaquin.Alzola@lebara.com>
Date 2016-04-25 10:19 +0000
Message-ID <mailman.71.1461585209.32212.python-list@python.org>
In reply to #107589

Here is some code I wrote earlier for handling the xmlns:

import xml.etree.ElementTree as ET

# xml_decoded (the XML text) and send_sms() are defined elsewhere in my program.
NS = '{http://request.messagepush.interfaces.comviva.com/xsd}'

xml_root = ET.fromstring(xml_decoded)
for elem in xml_root.iter():  # getiterator() is deprecated in favor of iter()
    if elem.tag == NS + 'shortCode':
        shortCode = elem.text.rstrip()
    elif elem.tag == NS + 'text':
        send_text = elem.text.rstrip()
    elif elem.tag == NS + 'item':
        subscribers = elem.text.rstrip()
result_sms = send_sms(subscribers, shortCode, send_text)

Reuse it.

-----Original Message-----
From: Python-list [mailto:python-list-bounces+joaquin.alzola=lebara.com@python.org] On Behalf Of Peter Otten
Sent: 25 April 2016 11:14
To: python-list@python.org
Subject: Re: delete from pattern to pattern if it contains match

harirammanohar@gmail.com wrote:

> Here is the code:

Finally ;)

> import xml.etree.ElementTree as ET
> ET.register_namespace("", "http://xmlns.jcp.org/xml/ns/javaee")

I don't know what this does, but probably not what you expected.

> tree = ET.parse('sample.xml')
> root = tree.getroot()
>
> for servlet in root.findall('servlet'):
>         servletname = servlet.find('servlet-name').text

I think you have to specify the namespace:

for servlet in root.findall('{http://xmlns.jcp.org/xml/ns/javaee}servlet'):
    servletname = servlet.find(
        '{http://xmlns.jcp.org/xml/ns/javaee}servlet-name').text

>         if servletname == "controller":

You could have added a print statement to verify that the line below is executed.

>                 root.remove(servlet)
>
> tree.write('output.xml')
>
> This will work if <web-app> </web-app> doesnt have below...
>
> xmlns="http://xmlns.jcp.org/xml/ns/javaee"
>   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>   xsi:schemaLocation="http://xmlns.jcp.org/xml/ns/javaee
>                       http://xmlns.jcp.org/xml/ns/javaee/web-app_3_1.xsd"





