

Groups > comp.lang.python > #40953 > unrolled thread

read and write the same text file

Started by: iMath <redstone-cold@163.com>
First post: 2013-03-09 07:36 -0800
Last post: 2013-03-10 03:45 +0000
Articles: 3 (3 participants)




#40953 — read and write the same text file

From: iMath <redstone-cold@163.com>
Date: 2013-03-09 07:36 -0800
Subject: read and write the same text file
Message-ID: <c41028f7-e457-4a2b-99f4-c473eaadd128@googlegroups.com>
I want to open a text file, read its content, make some changes to it, and then write the result back to the same file, so that the file ends up containing only the modified content and none of the original. Can this be done just by setting the mode parameter of the open() function? If yes, what should the mode be? If not, can it be done with a single with statement?
I currently do it with two with statements, as follows:

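import re

# Matches anything between < and >, including across line breaks.
replace_pattern = re.compile(r"<.+?>", re.DOTALL)

def text_process(file):
    # Read the whole file; the with block closes it on exit.
    with open(file, 'r') as f:
        text = f.read()

    # Reopening in 'w' mode truncates the file before the new text is written.
    with open(file, 'w') as f:
        f.write(replace_pattern.sub('', text))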



#40954

From: Roy Smith <roy@panix.com>
Date: 2013-03-09 10:47 -0500
Message-ID: <roy-422051.10473409032013@70-1-84-166.pools.spcsdns.net>
In reply to: #40953
In article <c41028f7-e457-4a2b-99f4-c473eaadd128@googlegroups.com>,
 iMath <redstone-cold@163.com> wrote:

> read and write the same text file 
> Open a text file ,read the content ,then make some change on it ,then write 
> it back to the file ,now the modified text  should only has the modified 
> content but not the initial content ,so can we implement this by only set the 
> mode parameter with open() function ?if yes ,what the parameter should be ?if 
> no ,can we implement this by only one with statement ?
> I implement this with 2 with statement as the following 
> 
> replace_pattern = re.compile(r"<.+?>",re.DOTALL)
> 
> def text_process(file):
> 
>     with open(file,'r') as f:
>         text = f.read()
> 
>     with open(file,'w') as f:
>         f.write(replace_pattern.sub('',text))

At a minimum, you need to close the file after you read it and before 
you re-open it for writing.

There's a variety of ways you could achieve the same effect.  You might 
open the file once, in read-write mode, read the contents, rewind to the 
beginning with seek(), then write the new contents.  You might also 
write the modified data out to a new file, close it, and then rename it.  
But, open, read, close, open, write, close is the most straight-forward.
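
The single-open variant mentioned above could look roughly like the sketch below. It assumes mode 'r+' and uses a hypothetical function name, text_process_in_place. The truncate() call is not mentioned in the post; it is added here because the new text is shorter than the old, and without it the tail of the original content would remain in the file.

```python
import re

# Same pattern as in the original post: anything between < and >.
TAG_PATTERN = re.compile(r"<.+?>", re.DOTALL)

def text_process_in_place(path):
    # Hypothetical name; 'r+' opens an existing file for reading and writing
    # without truncating it.
    with open(path, "r+") as f:
        text = f.read()
        f.seek(0)                           # rewind to the beginning
        f.write(TAG_PATTERN.sub("", text))  # overwrite from the start
        f.truncate()                        # cut off leftover old content
```

This answers the "one with statement" part of the question, but the file is still modified in place, so an interruption between write() and truncate() can leave a mix of old and new content.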



#40995

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-03-10 03:45 +0000
Message-ID: <513c01cf$0$6512$c3e8da3$5496439d@news.astraweb.com>
In reply to: #40954
On Sat, 09 Mar 2013 10:47:34 -0500, Roy Smith wrote:

> In article <c41028f7-e457-4a2b-99f4-c473eaadd128@googlegroups.com>,
>  iMath <redstone-cold@163.com> wrote:

>> def text_process(file):
>>     with open(file,'r') as f:
>>         text = f.read()
>>     with open(file,'w') as f:
>>         f.write(replace_pattern.sub('',text))
> 
> At a minimum, you need to close the file after you read it and before
> you re-open it for writing.

The "with" statement automatically does that. When the indented block  -- 
in this case, a single line "text = f.read()" -- completes, f.close() is 
automatically called.

Technically, this is a property of file objects, not the with statement. 
The with statement merely calls the magic __exit__ method, which for file 
objects closes the file.
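
A small check of the behaviour described above, using a throwaway temporary file (the path and variable names are only for illustration):

```python
import os
import tempfile

# Create a scratch file to experiment on.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "w") as f:
    f.write("hello")
    inside = f.closed   # still open inside the block

after = f.closed        # the file object's __exit__ has closed it by now

os.remove(path)
print(inside, after)    # False True
```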


> There's a variety of ways you could achieve the same effect.  You might
> open the file once, in read-write mode, read the contents, rewind to the
> beginning with seek(), then write the new contents.  You might also
> write the modified data out to a new file, close it, and then rename it.
> But, open, read, close, open, write, close is the most straight-forward.

All of those techniques are acceptable for quick and dirty scripts. But 
for professional applications, you want something which is resistant to 
data loss. The problem happens when you do this:

1. Read file.
2. Modify data.
3. Open file for writing. # This deletes contents of the file!
4. Write data.
5. Flush data to the hard drive.

Notice that there is a brief time frame where the old data is destroyed, 
but the new data hasn't been saved to the hard drive yet. This is where 
you can get data loss if, say, the power goes off.
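
The window described above can be observed directly. This is only an illustrative sketch on a scratch file: opening in 'w' mode empties the file at open time, before anything is written.

```python
import os
import tempfile

# Scratch file with some "old" data in it.
fd, path = tempfile.mkstemp()
os.close(fd)
with open(path, "w") as f:
    f.write("original contents")

size_before = os.path.getsize(path)

f = open(path, "w")                      # step 3: the old data is gone here
size_while_open = os.path.getsize(path)  # 0, nothing has been written yet
f.close()

os.remove(path)
print(size_before, size_while_open)
```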

So a professional-quality application should do something like this:

1. Read file.
2. Modify data.
3. Open a new file for writing.
4. Write data to this new file.
5. Flush data to the hard drive.
6. Atomically replace the original file with the new file.

Since sensible operating systems promise to be able to replace a file as 
a single atomic step, guaranteed to either fully succeed or not occur at 
all, this reduces the risk of data corruption or data loss significantly.

Even if you're using an OS that doesn't guarantee atomic replacement 
*cough* Windows *cough* it still decreases the risk of data loss, just 
not as much as we would like.
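
The six steps above might be sketched as follows. This is one possible way to do it, not code from the thread: the function name text_process_atomic is hypothetical, the temporary file is created with tempfile.mkstemp() in the same directory as the target, and the final step uses os.replace(), which is available from Python 3.3 onwards.

```python
import os
import re
import tempfile

TAG_PATTERN = re.compile(r"<.+?>", re.DOTALL)

def text_process_atomic(path):
    # Steps 1 and 2: read the file and modify the data.
    with open(path, "r") as f:
        text = f.read()
    new_text = TAG_PATTERN.sub("", text)

    # Step 3: open a new file in the same directory, so the final
    # replacement does not have to cross filesystems.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_text)       # step 4: write to the new file
            tmp.flush()               # step 5: push Python's buffer to the OS
            os.fsync(tmp.fileno())    # ... and ask the OS to write it to disk
        os.replace(tmp_path, path)    # step 6: swap the new file into place
    except BaseException:
        # Something failed before the swap: the original file is untouched,
        # so just clean up the temporary file.
        try:
            os.unlink(tmp_path)
        except OSError:
            pass
        raise
```

One caveat with this sketch: mkstemp() creates the temporary file with restrictive permissions, so the replaced file does not keep the original's mode unless it is copied over explicitly.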

But having said all that, for "normal" use, a simple read-write cycle is 
still pretty safe, and it's *much* simpler to get right.



-- 
Steven


