Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.python > #26152 > unrolled thread

Search and replace text in XML file?

Started bytodd.tabern@gmail.com
First post2012-07-27 18:23 -0700
Last post2012-07-28 15:32 -0700
Articles 7 — 7 participants

Back to article view | Back to comp.lang.python


Contents

  Search and replace text in XML file? todd.tabern@gmail.com - 2012-07-27 18:23 -0700
    Re: Search and replace text in XML file? Jason Friedman <jason@powerpull.net> - 2012-07-27 20:07 -0600
    Re: Search and replace text in XML file? Terry Reedy <tjreedy@udel.edu> - 2012-07-28 03:17 -0400
      Re: Search and replace text in XML file? Paul Rudin <paul.nospam@rudin.co.uk> - 2012-07-31 15:45 +0100
    Re: Search and replace text in XML file? Mark Lawrence <breamoreboy@yahoo.co.uk> - 2012-07-28 09:29 +0100
    Re: Search and replace text in XML file? MRAB <python@mrabarnett.plus.com> - 2012-07-28 17:01 +0100
    Re: Search and replace text in XML file? Tim Roberts <timr@probo.com> - 2012-07-28 15:32 -0700

#26152 — Search and replace text in XML file?

Fromtodd.tabern@gmail.com
Date2012-07-27 18:23 -0700
SubjectSearch and replace text in XML file?
Message-ID<5535f70f-4542-451f-8b08-a62b45d15c11@googlegroups.com>
I'm looking to search an entire XML file for specific text and replace that text, while maintaining the structure of the XML file. The text occurs within multiple nodes throughout the file.
I basically need to replace every occurrence C:\Program Files with C:\Program Files (x86), regardless of location. For example, that text appears within: 
<URL>C:\Program Files\\Map Data\Road_Centerlines.shp</URL>
and also within: 
<RoutingIndexPathName>C:\Program Files\Templates\RoadNetwork.rtx</RoutingIndexPathName>
...among others.
I've tried some non-python methods and they all ruined the XML structure. I've been Google searching all day and can only seem to find solutions that look for a specific node and replace the whole string between the tags. 
I've been looking at using minidom to achieve this but I just can't seem to figure out the right method.
My end goal, once I have working code, is to compile an exe that can work on machines without python, allowing a user can click in order to perform the XML modification.
Thanks in advance.

[toc] | [next] | [standalone]


#26153

FromJason Friedman <jason@powerpull.net>
Date2012-07-27 20:07 -0600
Message-ID<mailman.2665.1343441271.4697.python-list@python.org>
In reply to#26152
> I'm looking to search an entire XML file for specific text and replace that text, while maintaining the structure of the XML file. The text occurs within multiple nodes throughout the file.
> I basically need to replace every occurrence C:\Program Files with C:\Program Files (x86), regardless of location. For example, that text appears within:
> <URL>C:\Program Files\\Map Data\Road_Centerlines.shp</URL>
> and also within:
> <RoutingIndexPathName>C:\Program Files\Templates\RoadNetwork.rtx</RoutingIndexPathName>
> ...among others.
> I've tried some non-python methods and they all ruined the XML structure. I've been Google searching all day and can only seem to find solutions that look for a specific node and replace the whole string between the tags.
> I've been looking at using minidom to achieve this but I just can't seem to figure out the right method.
> My end goal, once I have working code, is to compile an exe that can work on machines without python, allowing a user can click in order to perform the XML modification.

Is it as simple as this?

$ cat my.xml
<URL>C:\Program Files\\Map Data\Road_Centerlines.shp</URL>
and also within:
<RoutingIndexPathName>C:\Program
Files\Templates\RoadNetwork.rtx</RoutingIndexPathName>

$ python3
Python 3.2.3 (default, May  3 2012, 15:51:42)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> for line in open("my.xml"):
...     print(line.replace("C:\Program Files", "C:\Program Files
(x86)"), end="")
...
<URL>C:\Program Files (x86)\\Map Data\Road_Centerlines.shp</URL>
and also within:
<RoutingIndexPathName>C:\Program Files
(x86)\Templates\RoadNetwork.rtx</RoutingIndexPathName>

This solution assumes that you have no tags containing the string
"C:\Program Files".

[toc] | [prev] | [next] | [standalone]


#26154

FromTerry Reedy <tjreedy@udel.edu>
Date2012-07-28 03:17 -0400
Message-ID<mailman.2666.1343459910.4697.python-list@python.org>
In reply to#26152
On 7/27/2012 9:23 PM, todd.tabern@gmail.com wrote:
> I'm looking to search an entire XML file for specific text and
> replace that text, while maintaining the structure of the XML file.

For a one-off project, or for experimentation, I would use a proper 
text-editor* and run through with search/replace. For automation, use 
Jason's suggestion of .replace, perhaps on a line-by-line basis.

* By definition, such only make changes you request. Notepad++ is one 
such on Windows. You do not want a word or document processor, or, for 
simple xml oblivious text substitutions, an xml processor. Such things 
often make changes of various sorts.

-- 
Terry Jan Reedy


[toc] | [prev] | [next] | [standalone]


#26317

FromPaul Rudin <paul.nospam@rudin.co.uk>
Date2012-07-31 15:45 +0100
Message-ID<87sjc8yoo2.fsf@no-fixed-abode.cable.virginmedia.net>
In reply to#26154
Terry Reedy <tjreedy@udel.edu> writes:

> ... a proper text-editor*

>  * ... Notepad++ is one such on Windows.

Surely emacs is the only such on any platform? :)

[toc] | [prev] | [next] | [standalone]


#26155

FromMark Lawrence <breamoreboy@yahoo.co.uk>
Date2012-07-28 09:29 +0100
Message-ID<mailman.2667.1343464174.4697.python-list@python.org>
In reply to#26152
On 28/07/2012 08:17, Terry Reedy wrote:
> On 7/27/2012 9:23 PM, todd.tabern@gmail.com wrote:
>> I'm looking to search an entire XML file for specific text and
>> replace that text, while maintaining the structure of the XML file.
>
> For a one-off project, or for experimentation, I would use a proper
> text-editor* and run through with search/replace. For automation, use
> Jason's suggestion of .replace, perhaps on a line-by-line basis.
>
> * By definition, such only make changes you request. Notepad++ is one
> such on Windows. You do not want a word or document processor, or, for
> simple xml oblivious text substitutions, an xml processor. Such things
> often make changes of various sorts.
>

I highly recommend the use of notepad++.  If anyone knows of a better 
text editor for Windows please let me know :)

-- 
Cheers.

Mark Lawrence.

[toc] | [prev] | [next] | [standalone]


#26157

FromMRAB <python@mrabarnett.plus.com>
Date2012-07-28 17:01 +0100
Message-ID<mailman.2669.1343491287.4697.python-list@python.org>
In reply to#26152
On 28/07/2012 09:29, Mark Lawrence wrote:
> On 28/07/2012 08:17, Terry Reedy wrote:
>> On 7/27/2012 9:23 PM, todd.tabern@gmail.com wrote:
>>> I'm looking to search an entire XML file for specific text and
>>> replace that text, while maintaining the structure of the XML file.
>>
>> For a one-off project, or for experimentation, I would use a proper
>> text-editor* and run through with search/replace. For automation, use
>> Jason's suggestion of .replace, perhaps on a line-by-line basis.
>>
>> * By definition, such only make changes you request. Notepad++ is one
>> such on Windows. You do not want a word or document processor, or, for
>> simple xml oblivious text substitutions, an xml processor. Such things
>> often make changes of various sorts.
>>
>
> I highly recommend the use of notepad++.  If anyone knows of a better
> text editor for Windows please let me know :)
>
My own preference is EditPad. I liked EditPad Lite so much that I
bought EditPad Pro.

[toc] | [prev] | [next] | [standalone]


#26167

FromTim Roberts <timr@probo.com>
Date2012-07-28 15:32 -0700
Message-ID<r0q8181ah72qk9kgu26mv0823lmsakokf1@4ax.com>
In reply to#26152
todd.tabern@gmail.com wrote:
>
>I basically need to replace every occurrence C:\Program Files with
>C:\Program Files (x86), regardless of location. For example, that 
>text appears within: 
><URL>C:\Program Files\\Map Data\Road_Centerlines.shp</URL>
>and also within: 
><RoutingIndexPathName>C:\Program Files\Templates\RoadNetwork.rtx</RoutingIndexPathName>
>...among others.
>I've tried some non-python methods and they all ruined the XML structure. 

I don't see how that's possible.  XML doesn't have any character counts,
and it doesn't care about extra white space.  A rock-stupid editor
substitution should have been able to do this in two seconds.
-- 
Tim Roberts, timr@probo.com
Providenza & Boekelheide, Inc.

[toc] | [prev] | [standalone]


Back to top | Article view | comp.lang.python


csiph-web