Re: How to safely maintain a status file

Started by: Laszlo Nagy <gandalf@shopzeus.com>
First post: 2012-07-12 14:30 +0200
Last post: 2012-07-14 01:53 +0000
Articles: 20 on this page of 22; 12 participants


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: How to safely maintain a status file Laszlo Nagy <gandalf@shopzeus.com> - 2012-07-12 14:30 +0200
    Re: How to safely maintain a status file Hans Mulder <hansmu@xs4all.nl> - 2012-07-12 15:19 +0200
      Re: How to safely maintain a status file Laszlo Nagy <gandalf@shopzeus.com> - 2012-07-12 19:43 +0200
      Re: How to safely maintain a status file Christian Heimes <lists@cheimes.de> - 2012-07-12 20:39 +0200
        Re: How to safely maintain a status file Rick Johnson <rantingrickjohnson@gmail.com> - 2012-07-12 18:20 -0700
          Re: How to safely maintain a status file Chris Angelico <rosuav@gmail.com> - 2012-07-13 12:12 +1000
            Re: How to safely maintain a status file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-13 03:13 +0000
              Re: How to safely maintain a status file Gene Heskett <gheskett@wdtv.com> - 2012-07-12 23:49 -0400
                Re: How to safely maintain a status file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-13 04:21 +0000
              Re: How to safely maintain a status file rantingrickjohnson@gmail.com - 2012-07-12 21:26 -0700
                Re: How to safely maintain a status file Chris Angelico <rosuav@gmail.com> - 2012-07-13 16:02 +1000
                Re: How to safely maintain a status file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-13 07:14 +0000
                  Re: How to safely maintain a status file Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-07-13 13:29 -0400
                RE: How to safely maintain a status file "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-07-13 16:00 +0000
                Re: [Python] RE: How to safely maintain a status file Chris Gonnerman <chris@gonnerman.org> - 2012-07-13 12:27 -0500
                RE: [Python] RE: How to safely maintain a status file "Prasad, Ramit" <ramit.prasad@jpmorgan.com> - 2012-07-13 17:59 +0000
                  Re: [Python] RE: How to safely maintain a status file Hans Mulder <hansmu@xs4all.nl> - 2012-07-13 20:28 +0200
                    Re: [Python] RE: How to safely maintain a status file MRAB <python@mrabarnett.plus.com> - 2012-07-13 20:57 +0100
                    Re: [Python] RE: How to safely maintain a status file Christian Heimes <lists@cheimes.de> - 2012-07-13 22:21 +0200
                Re: [Python] RE: How to safely maintain a status file Chris Angelico <rosuav@gmail.com> - 2012-07-14 04:19 +1000
                RE: How to safely maintain a status file Chris Gonnerman <chris@gonnerman.org> - 2012-07-13 15:15 -0500
                  Re: How to safely maintain a status file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-14 01:53 +0000



#25210 — Re: How to safely maintain a status file

From: Laszlo Nagy <gandalf@shopzeus.com>
Date: 2012-07-12 14:30 +0200
Subject: Re: How to safely maintain a status file
Message-ID: <mailman.2034.1342096256.4697.python-list@python.org>
> You are contradicting yourself. Either the OS is providing a fully
> atomic rename or it doesn't. All POSIX compatible OS provide an atomic
> rename functionality that renames the file atomically or fails without
> losing the target side. On POSIX OS it doesn't matter if the target exists.
This is not a contradiction. Although the rename operation is atomic, 
the whole "change status" process is not. It is because there are two 
operations: #1 delete old status file and #2. rename the new status 
file. And because there are two operations, there is still a race 
condition. I see no contradiction here.
>
> You don't need locks or any other fancy stuff. You just need to make
> sure that you flush the data and metadata correctly to the disk and
> force a re-write of the directory inode, too. It's a standard pattern on
> POSIX platforms and well documented in e.g. the maildir RFC.
It is not entirely true. We are talking about two processes. One is 
reading a file, another one is writing it. They can run at the same 
time, so forcibly flushing the disk cache won't help.



#25214

From: Hans Mulder <hansmu@xs4all.nl>
Date: 2012-07-12 15:19 +0200
Message-ID: <4ffecef4$0$6877$e4fe514c@news2.news.xs4all.nl>
In reply to: #25210
On 12/07/12 14:30:41, Laszlo Nagy wrote:
>> You are contradicting yourself. Either the OS is providing a fully
>> atomic rename or it doesn't. All POSIX compatible OS provide an atomic
>> rename functionality that renames the file atomically or fails without
>> losing the target side. On POSIX OS it doesn't matter if the target
>> exists.

> This is not a contradiction. Although the rename operation is atomic,
> the whole "change status" process is not. It is because there are two
> operations: #1 delete old status file and #2. rename the new status
> file. And because there are two operations, there is still a race
> condition. I see no contradiction here.

On Posix systems, you can avoid the race condition.  The trick is to
skip step #1.  The rename will implicitly delete the old file, and
it will still be atomic.  The whole process now consists of a single
step, so the whole process is now atomic.

>> You don't need locks or any other fancy stuff. You just need to make
>> sure that you flush the data and metadata correctly to the disk and
>> force a re-write of the directory inode, too. It's a standard pattern on
>> POSIX platforms and well documented in e.g. the maildir RFC.

> It is not entirely true. We are talking about two processes. One is
> reading a file, another one is writing it. They can run at the same
> time, so flushing disk cache forcedly won't help.

On Posix systems, it will work, and be atomic, even if one process is
reading the old status file while another process is writing the new
one.  The old file will be atomically removed from the directory by
the rename operation; it will continue to exist on the hard drive, so
that the reading process can continue reading it.  The old file will
be deleted when the reader closes it.  Or, if the system crashes before
the old file is closed, it will be deleted when the system is restarted.
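
A minimal sketch of the single-step update described above, for illustration only. The function name write_status and the ".status-" prefix are made up for this example. It assumes Python 3.3 or later, where os.replace() is available (on POSIX, os.rename() behaves the same way for this purpose), and it leaves out the directory fsync mentioned earlier in the thread.

```python
import os
import tempfile


def write_status(path, text):
    """Replace the file at `path` with `text` using a single rename step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so the rename
    # never has to cross a file system boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".status-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # No separate "delete the old file" step: the rename replaces the target.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A reader that opens the status file at any moment sees either the complete old contents or the complete new contents, never a half-written file.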

On Windows, things are very different.

Hope this helps,

-- HansM



#25228

From: Laszlo Nagy <gandalf@shopzeus.com>
Date: 2012-07-12 19:43 +0200
Message-ID: <mailman.2052.1342115003.4697.python-list@python.org>
In reply to: #25214
>> This is not a contradiction. Although the rename operation is atomic,
>> the whole "change status" process is not. It is because there are two
>> operations: #1 delete old status file and #2. rename the new status
>> file. And because there are two operations, there is still a race
>> condition. I see no contradiction here.
> On Posix systems, you can avoid the race condition.  The trick is to
> skip step #1.  The rename will implicitly delete the old file, and
> it will still be atomic.  The whole process now consists of a single
> step, so the whole process is now atomic.
Well, I didn't know that this is going to work. At least it does not 
work on Windows 7 (which should be POSIX compatible?)

 >>> f = open("test.txt","wb+")
 >>> f.close()
 >>> f2 = open("test2.txt","wb+")
 >>> f2.close()
 >>> import os
 >>> os.rename("test2.txt","test.txt")
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
WindowsError: [Error 183] File already exists
 >>>

I have also tried this on FreeBSD and it worked.

Now, let's go back to the original question:

>>>This works well on Linux but Windows raises an error when status_file already exists.

It SEEMS that the op wanted a solution for Windows....



#25232

From: Christian Heimes <lists@cheimes.de>
Date: 2012-07-12 20:39 +0200
Message-ID: <mailman.2059.1342118365.4697.python-list@python.org>
In reply to: #25214
Am 12.07.2012 19:43, schrieb Laszlo Nagy:
> Well, I didn't know that this is going to work. At least it does not
> work on Windows 7 (which should be POSIX compatible?)

Nope, Windows's file system layer is not POSIX compatible. For example
you can't remove or replace a file while it is opened by a process.
Lots of small things work slightly differently on Windows or not at all.

Christian



#25233

From: Rick Johnson <rantingrickjohnson@gmail.com>
Date: 2012-07-12 18:20 -0700
Message-ID: <fd846619-7de7-4c3b-8ed6-a14e9d55956e@p6g2000yqg.googlegroups.com>
In reply to: #25232
On Jul 12, 2:39 pm, Christian Heimes <li...@cheimes.de> wrote:
> Windows's file system layer is not POSIX compatible. For example
> you can't remove or replace a file while it is opened by a process.

Sounds like a reasonable fail-safe to me. Not much unlike a car
ignition that will not allow starting the engine if the transmission
is in any *other* gear besides "park" or "neutral", OR a governor (be
it mechanical or electrical) that will not allow the engine RPMs to
exceed a maximum safe limit, OR even, ABS systems which "pulse" the
brakes to prevent overzealous operators from losing road-to-tire
traction when decelerating the vehicle.

You could say: "Hey, if someone is dumb enough to shoot themselves in
the foot then let them"... however, sometimes fail-safes not only save
the dummy from a life of limps, they also prevent catastrophic
"collateral damage" to the rest of us.



#25237

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-07-13 12:12 +1000
Message-ID: <mailman.2061.1342145524.4697.python-list@python.org>
In reply to: #25233
On Fri, Jul 13, 2012 at 11:20 AM, Rick Johnson
<rantingrickjohnson@gmail.com> wrote:
> On Jul 12, 2:39 pm, Christian Heimes <li...@cheimes.de> wrote:
>> Windows's file system layer is not POSIX compatible. For example
>> you can't remove or replace a file while it is opened by a process.
>
> Sounds like a reasonable fail-safe to me.

POSIX says that files and file names are independent. I can open a
file based on its name, delete the file based on its name, and still
have the open file there. When it's closed, it'll be wiped from the
disk.
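
A small sketch of that behaviour, assuming a POSIX system (on Windows the unlink call would be expected to fail while the file is open). The function name read_after_unlink is invented for this example.

```python
import os
import tempfile


def read_after_unlink():
    """Open a file, remove its name, then read through the open handle."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "w") as f:
        f.write("still here\n")

    reader = open(path)
    os.unlink(path)                     # the name is removed from the directory
    name_exists = os.path.exists(path)  # the name can no longer be looked up
    data = reader.read()                # the open file still has its contents
    reader.close()                      # last handle closed; space can be reclaimed
    return name_exists, data
```

The name disappears immediately, while the data stays readable through the handle that was opened before the unlink.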

ChrisA



#25241

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-07-13 03:13 +0000
Message-ID: <4fff926b$0$29965$c3e8da3$5496439d@news.astraweb.com>
In reply to: #25237
On Fri, 13 Jul 2012 12:12:01 +1000, Chris Angelico wrote:

> On Fri, Jul 13, 2012 at 11:20 AM, Rick Johnson
> <rantingrickjohnson@gmail.com> wrote:
>> On Jul 12, 2:39 pm, Christian Heimes <li...@cheimes.de> wrote:
>>> Windows's file system layer is not POSIX compatible. For example you
>>> can't remove or replace a file while it is opened by a process.
>>
>> Sounds like a reasonable fail-safe to me.

Rick has obviously never tried to open a file for reading when somebody 
else has it opened, also for reading, and discovered that despite Windows 
being allegedly a multi-user operating system, you can't actually have 
multiple users read the same files at the same time.

(At least not unless the application takes steps to allow it.)

Or tried to back-up files while some application has got them opened. Or 
open a file while an anti-virus scanner is oh-so-slooooowly scanning it.

Opening files for exclusive read *by default* is a pointless and silly 
limitation. It's also unsafe: if a process opens a file for exclusive 
read, and then dies, *no other process* can close that file.

At least on POSIX systems, not even root can override a mandatory 
exclusive lock (it would be pretty pointless if it could), so a rogue or 
buggy program could wreak havoc with mandatory exclusive file locks. 
That's why Linux, by default, treats exclusive file locks as advisory 
(cooperative), not mandatory.
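
As a rough illustration of what "advisory" means in practice, here is a sketch using fcntl.flock(), assuming Linux or a similar Unix. The function name advisory_lock_demo and the file contents are invented for the example. A reader that never asks for the lock is not stopped by it; only a process that cooperates by requesting the lock is refused.

```python
import fcntl
import os
import tempfile


def advisory_lock_demo():
    """Show that an exclusive flock() does not stop an uncooperative reader."""
    fd, path = tempfile.mkstemp()
    os.write(fd, b"status: ok\n")
    os.close(fd)

    holder = open(path, "r+b")
    fcntl.flock(holder, fcntl.LOCK_EX)  # take an exclusive advisory lock

    # A reader that ignores locking entirely is not blocked.
    with open(path, "rb") as rude_reader:
        data = rude_reader.read()

    # A cooperative opener that asks for the lock (without waiting) is refused.
    with open(path, "r+b") as polite:
        try:
            fcntl.flock(polite, fcntl.LOCK_EX | fcntl.LOCK_NB)
            refused = False
        except OSError:
            refused = True

    fcntl.flock(holder, fcntl.LOCK_UN)
    holder.close()
    os.unlink(path)
    return data, refused
```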

In general, file locking is harder than it sounds, with many traps for 
the unwary, and of course the semantics are dependent on both the 
operating system and the file system.

https://en.wikipedia.org/wiki/File_locking


> POSIX says that files and file names are independent. I can open a file
> based on its name, delete the file based on its name, and still have the
> open file there. When it's closed, it'll be wiped from the disk.

One neat trick is to open a file, then delete it from disk while it is 
still open. So long as your process is still running, you can write to 
this ghost file, as normal, but no other process can (easily) see it. And 
when your process ends, the file contents is automatically deleted.

This is remarkably similar to what Python does with namespaces and dicts:

# create a fake "file system"
ns = {'a': [], 'b': [], 'c': []}
# open a file
myfile = ns['a']
# write to it
myfile.append('some data')
# delete it from the "file system"
del ns['a']
# but I can still read and write to it
myfile.append('more data')
print(myfile[0])
# but anyone else will get an error if they try
another_file = ns['a']


-- 
Steven



#25242

From: Gene Heskett <gheskett@wdtv.com>
Date: 2012-07-12 23:49 -0400
Message-ID: <mailman.2062.1342151748.4697.python-list@python.org>
In reply to: #25241
On Thursday 12 July 2012 23:21:16 Steven D'Aprano did opine:

> On Fri, 13 Jul 2012 12:12:01 +1000, Chris Angelico wrote:
> > On Fri, Jul 13, 2012 at 11:20 AM, Rick Johnson
> > 
> > <rantingrickjohnson@gmail.com> wrote:
> >> On Jul 12, 2:39 pm, Christian Heimes <li...@cheimes.de> wrote:
> >>> Windows's file system layer is not POSIX compatible. For example you
> >>> can't remove or replace a file while it is opened by a process.
> >> 
> >> Sounds like a reasonable fail-safe to me.
> 
> Rick has obviously never tried to open a file for reading when somebody
> else has it opened, also for reading, and discovered that despite
> Windows being allegedly a multi-user operating system, you can't
> actually have multiple users read the same files at the same time.
> 
Chuckle.  That was one of the 'features' that os9 on the trs-80 color 
computer had back in the 80's, and it was clean and well done because of 
the locking model the random block file manager had in OS9 for 6809 cpu's, 
no relation to the Mac OS9 other than a similar name.  That color computer 
has a separate, text only video card I could plug in and display on an 80 
column amber screen monitor.

When I wanted to impress the visiting frogs, I often did something I have 
never been able to do on any other operating system since, start assembling 
a long assembly language file on one of the screens on the color monitor, 
hit the clear key to advance to the amber screen and start a listing on it 
of the assembler's output listing file.

Because the file locking was applied only to the sector (256 bytes on that 
machine) being written at the instant, the listing would fly by till it 
caught up with the assembler's output, running into the lock and then 
dutifully following along, one sector behind the assembler's output, until 
the assembly was finished.  That was in 1986 folks, and in the year of our 
Lord 2012, 26 years later, I still cannot do that in linux.  When I ask why 
not, the replies seem to think I'm from outer space.  It's apparently a 
concept that is not even attempted to be understood by the linux code 
carvers.

Something is drastically wrong with that picture IMO.

> (At least not unless the application takes steps to allow it.)
> 
> Or tried to back-up files while some application has got them opened.

That in fact, ran me out of the amiga business in 1999, a 30Gb drive failed 
on my full blown 040 + 64 megs of dram A2000.  When the warranty drive 
arrived is when I found that due to file locks on the startup files, all of 
them involved with the booting of that machine, my high priced Diavolo Pro 
backup tapes didn't contain a single one of those files.  The linux box 
with Red Hat 5.0 on it that I had built in late 1998 to see what linux was 
all about found space under that desk yet that evening and I never looked 
back.

> Or
> open a file while an anti-virus scanner is oh-so-slooooowly scanning it.
> 
> Opening files for exclusive read *by default* is a pointless and silly
> limitation. It's also unsafe: if a process opens a file for exclusive
> read, and then dies, *no other process* can close that file.
> 
> At least on POSIX systems, not even root can override a mandatory
> exclusive lock (it would be pretty pointless if it could), so a rogue or
> buggy program could wreck havoc with mandatory exclusive file locks.
> That's why Linux, by default, treats exclusive file locks as advisory
> (cooperative), not mandatory.
> 
> In general, file locking is harder than it sounds, with many traps for
> the unwary, and of course the semantics are dependent on both the
> operating system and the file system.
> 
> https://en.wikipedia.org/wiki/File_locking
> 
> > POSIX says that files and file names are independent. I can open a
> > file based on its name, delete the file based on its name, and still
> > have the open file there. When it's closed, it'll be wiped from the
> > disk.
> 
> One neat trick is to open a file, then delete it from disk while it is
> still open. So long as your process is still running, you can write to
> this ghost file, as normal, but no other process can (easily) see it.
> And when your process ends, the file contents is automatically deleted.
> 
> This is remarkably similar to what Python does with namespaces and
> dicts:
> 
> # create a fake "file system"
> ns = {'a': [], 'b': [], 'c': []}
> # open a file
> myfile = ns['a']
> # write to it
> myfile.append('some data')
> # delete it from the "file system"
> del ns['a']
> # but I can still read and write to it
> myfile.append('more data')
> print(myfile[0])
> # but anyone else will get an error if they try
> another_file = ns['a']

Cheers, Gene
-- 
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
My web page: <http://coyoteden.dyndns-free.com:85/gene> is up!
You just wait, I'll sin till I blow up!
		-- Dylan Thomas



#25244

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-07-13 04:21 +0000
Message-ID: <4fffa254$0$29965$c3e8da3$5496439d@news.astraweb.com>
In reply to: #25242
On Thu, 12 Jul 2012 23:49:02 -0400, Gene Heskett wrote:

> When I wanted to impress the visiting frogs, I often did something I
> have never been able to do on any other operating system since, start
> assembling a long assembly language file on one of the screens on the
> color monitor, hit the clear key to advance to the amber screen and
> start a listing on it of the assemblers output listing file.
> 
> Because the file locking was applied only to the sector (256 bytes on
> that machine) being written at the instant, the listing would fly by
> till it caught up with the assemblers output, running into the lock and
> then dutifully following along, one sector behind the assemblers output,
> until the assembly was finished.  That was in 1986 folks, and in the
> year of our Lord 2012, 26 years later, I still cannot do that in linux. 

Um, what you are describing sounds functionally equivalent to what 
tail -f does.


> When I ask why not, the replies seem to think I'm from outer space.  Its
> apparently a concept that is not even attempted to be understood by the
> linux code carvers.

You could certainly create a pair of cooperative programs, one which 
keeps a lock on only the last block of the file, and a tail-like reader 
which honours that lock. But why bother? Just have the assembler append 
to the file, and let people use any reader they like, such as tail.

Or have I misunderstood you?
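
For what it's worth, the append-and-follow arrangement can be sketched in a few lines of Python. This is only an approximation of what tail -f does; the function name follow and the poll_interval and idle_limit parameters are invented for the example, and the idle limit exists only so the sketch terminates.

```python
import time


def follow(path, poll_interval=0.1, idle_limit=3):
    """Yield lines appended to `path`, roughly like `tail -f`.

    Stops after `idle_limit` consecutive polls with no new data so that
    the sketch terminates; a real follower would keep going.
    """
    idle = 0
    with open(path) as f:
        while idle < idle_limit:
            line = f.readline()
            if line:
                # Note: a partially written last line may be yielded as-is.
                idle = 0
                yield line
            else:
                # Nothing new yet: wait, then try again from the same offset.
                idle += 1
                time.sleep(poll_interval)
```

The writer needs no knowledge of the reader; it only has to append and flush.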



-- 
Steven



#25245

From: rantingrickjohnson@gmail.com
Date: 2012-07-12 21:26 -0700
Message-ID: <62e90523-a160-4e28-8109-9ce32d8952e1@googlegroups.com>
In reply to: #25241
On Thursday, July 12, 2012 10:13:47 PM UTC-5, Steven D'Aprano wrote:
> Rick has obviously never tried to open a file for reading when somebody 
> else has it opened, also for reading, and discovered that despite Windows 
> being allegedly a multi-user operating system, you can't actually have 
> multiple users read the same files at the same time.

You misread my response. My comment was a direct result of Christian stating:

(paraphrase) "On some systems you are not permitted to delete a file whilst the file is open "

...which seems to be consistent to me. Why would *anybody* want to delete a file whilst the file is open? Bringing back the car analogy again: Would you consider jumping from a moving vehicle a consistent interaction with the interface of a vehicle? Of course not. The interface for a vehicle is simple and consistent:

 1. You enter the vehicle at location A
 2. The vehicle transports you to location B
 3. You exit the vehicle

At no time during the trip would anyone expect you to leap from the vehicle. But when you delete open files, you are essentially leaping from the moving vehicle! This behavior goes against all expectations of consistency in an API -- and against all sanity when riding in a vehicle!

> Opening files for exclusive read *by default* is a pointless and silly 
> limitation. It's also unsafe: if a process opens a file for exclusive 
> read, and then dies, *no other process* can close that file.

Oh come on. Are you actually going to use "errors" or "unintended consequences", or even "Acts of God" to defend your argument? Okay. Okay. I suppose "IF" the car spontaneously combusted "THEN" the passengers would be wise to jump out, leaving the vehicle to the whims of inertia.

> One neat trick is to open a file, then delete it from disk while it is 
> still open. So long as your process is still running, you can write to 
> this ghost file, as normal, but no other process can (easily) see it. And 
> when your process ends, the file contents is automatically deleted.

Well "neat tricks" aside, I am of the firm belief that deleting files should never be possible whilst they are open. 

 * Opening files requires that data exist on disk
 * Reading and writing files requires an open file obj
 * Closing files requires an open file object
 * And deleting files requires that the file NOT be open

Would you also entertain the idea of reading or writing files that do not exist? (not including pseudo file objs like StringIO of course!).

Summary: Neat tricks and Easter eggs are a real hoot, but consistency in APIs is the key.



#25247

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-07-13 16:02 +1000
Message-ID: <mailman.2064.1342159337.4697.python-list@python.org>
In reply to: #25245
On Fri, Jul 13, 2012 at 2:26 PM,  <rantingrickjohnson@gmail.com> wrote:
> On Thursday, July 12, 2012 10:13:47 PM UTC-5, Steven D'Aprano wrote:
>> Rick has obviously never tried to open a file for reading when somebody
>> else has it opened, also for reading, and discovered that despite Windows
>> being allegedly a multi-user operating system, you can't actually have
>> multiple users read the same files at the same time.
>
> You misread my response. My comment was direct result of Christian stating:
>
> (paraphrase) "On some systems you are not permitted to delete a file whilst the file is open "
>
> ...which seems to be consistent to me. Why would *anybody* want to delete a file whilst the file is open?

POSIX doesn't let you delete files. It lets you dispose of filenames.
Python does the same with its 'del'. The object (file) exists until
the system decides otherwise.
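
The Python half of that comparison can be sketched as follows. The class name Blob and the function name name_vs_object are invented for the example, and the immediate reclamation shown at the end is a CPython detail (reference counting); other implementations may only reclaim the object at a later garbage-collection pass.

```python
import weakref


class Blob:
    """Stand-in for some data; plain ints cannot be weakly referenced."""


def name_vs_object():
    x = Blob()
    y = x                   # two names, one object
    probe = weakref.ref(x)  # watches the object without keeping it alive

    del y                   # removes a name only
    alive_after_first_del = probe() is not None

    del x                   # the last name is gone
    # CPython's reference counting reclaims the object right away; other
    # implementations may wait for a garbage-collection pass.
    alive_after_last_del = probe() is not None
    return alive_after_first_del, alive_after_last_del
```

Under CPython this returns (True, False): deleting one name leaves the object alone, and the object goes away only once nothing refers to it any more.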

Here's a simpler example: Hardlinks. Suppose you have two names
pointing to the same file; are you allowed to unlink one of them while
you have the "other" open?
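
On a POSIX system the answer can be checked directly. This is a sketch with invented names (hardlink_demo, "first", "second"), and it assumes the temporary directory sits on a file system that supports hard links.

```python
import os
import tempfile


def hardlink_demo():
    """Unlink one of two names for a file while reading via the other."""
    directory = tempfile.mkdtemp()
    first = os.path.join(directory, "first")
    second = os.path.join(directory, "second")

    with open(first, "w") as f:
        f.write("shared contents\n")
    os.link(first, second)                  # a second name for the same file
    links_before = os.stat(first).st_nlink  # number of names right now

    reader = open(second)
    os.unlink(first)                        # drop one name while the other is open
    links_after = os.fstat(reader.fileno()).st_nlink
    data = reader.read()
    reader.close()

    os.unlink(second)
    os.rmdir(directory)
    return links_before, links_after, data
```

The link count drops from 2 to 1 and the contents stay readable, which is the same mechanism as unlinking the only name of an open file: the count simply reaches 0 instead of 1.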

ChrisA



#25250

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-07-13 07:14 +0000
Message-ID: <4fffcadc$0$29965$c3e8da3$5496439d@news.astraweb.com>
In reply to: #25245
On Thu, 12 Jul 2012 21:26:20 -0700, rantingrickjohnson wrote:

> On Thursday, July 12, 2012 10:13:47 PM UTC-5, Steven D'Aprano wrote:
>> Rick has obviously never tried to open a file for reading when somebody
>> else has it opened, also for reading, and discovered that despite
>> Windows being allegedly a multi-user operating system, you can't
>> actually have multiple users read the same files at the same time.
> 
> You misread my response. My comment was direct result of Christian
> stating:
> 
> (paraphrase) "On some systems you are not permitted to delete a file
> whilst the file is open "
> 
> ...which seems to be consistent to me. Why would *anybody* want to
> delete a file whilst the file is open? 

Because it is useful and a sensible thing to do.

Why should one misbehaved application, keeping a file open, be allowed to 
hold every other application, and the file system, hostage?

This is one of the many poor decisions which makes Windows so vulnerable 
to viruses and malware. If malware can arrange to keep itself open, you 
can't delete it. Thanks guys!


> Bringing back the car analogy
> again: Would you consider jumping from a moving vehicle a consistent
> interaction with the interface of a vehicle? Of course not. The
> interface for a vehicle is simple and consistent:
> 
>  1. You enter the vehicle at location A 
>  2. The vehicle transports you to location B 
>  3. You exit the vehicle

Amusingly, you neglected to specify "the vehicle stops" -- and rightly 
so, because of course having to stop the vehicle is not a *necessary* 
condition for exiting it, as tens of thousands of stunt men and women can 
attest.

Not to mention people parachuting out of an airplane, pirates or 
commandos boarding a moving ship, pedestrians transferring from a slow 
moving walkway to a faster moving walkway, farmers jumping off a trailer 
while it is still being towed behind a tractor (and jumping back on 
again), and Bruce Willis in "Red" in very possibly the best slow-motion 
action sequence in the history of Hollywood.

http://www.youtube.com/watch?v=xonMpj2YyDU


> At no time during the trip would anyone expect you to leap from the
> vehicle. 

Expected or not, you can do so.


> But when you delete open files, you are essentially leaping
> from the moving vehicle! This behavior goes against all expectations of
> consistency in an API -- and against all sanity when riding in a
> vehicle!

Fortunately, files on a file system are not cars, and deleting open files 
is a perfectly reasonable thing to do, no more frightening than in Python 
deleting a reference to an object using the del statement. Imagine how 
stupid it would be if this happened:


py> x = 42
py> y = x
py> del y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
DeleteError: cannot delete reference to object '42' until no other 
references to it exist


Fortunately, Python doesn't do that -- it tracks when the object is no 
longer being accessed, and only then physically reclaims the memory used. 
And so it is on POSIX file systems: the file system keeps track of when 
the file on disk is no longer being accessed, and only then physically 
reclaims the blocks being used. Until then, deleting the file merely 
unlinks the file name from the blocks on disk, in the same way that 
"del y" merely unlinks the name y from the object 42.


>> Opening files for exclusive read *by default* is a pointless and silly
>> limitation. It's also unsafe: if a process opens a file for
>> exclusive read, and then dies, *no other process* can close that file.
> 
> Oh come on. Are you actually going to use "errors" or "unintended
> consequences", or even "Acts of God" to defend your argument? 

Features have to be judged by their actual consequences, not some 
unrealistic sense of theoretical purity. The actual consequence of 
mandatory exclusive file locking is, *it sucks*.

Windows users are used to having to reboot their server every few days 
because something is broken, so they might not mind rebooting it because 
some file is locked in a mandatory open state and not even the operating 
system can unlock it. But for those with proper operating systems who 
expect months of uninterrupted service, mandatory locking is a problem to 
be avoided, not a feature.


> Okay.
> Okay. I suppose "IF" the car spontaneously combusted "THEN" the
> passengers would be wise to jump out, leaving the vehicle to the whims
> of inertia.

In this analogy, is the car the file name, the inode, or the directory? 
Are the passengers the file name(s), or the file contents, or the inode? 
Is the driver meant to be the file system? If I have a hard link to the 
file, does that mean the passengers are in two cars at once, or two lots 
of passengers in the same car?


>> One neat trick is to open a file, then delete it from disk while it is
>> still open. So long as your process is still running, you can write to
>> this ghost file, as normal, but no other process can (easily) see it.
>> And when your process ends, the file contents is automatically deleted.
> 
> Well "neat tricks" aside, I am of the firm belief that deleting files
> should never be possible whilst they are open.

[condescension = ON]

Good for you Rick. Having strongly held opinions on things you have only 
a limited understanding about is your right as an American.

[condescension = OFF]


>  * Opening files requires that data exist on disk 
>  * Reading and writing files requires an open file obj 

You have missed a step in jumping from files on disk to open file objects.

Open file objects do not necessarily correspond to files on disk. For 
example, in standard Pascal, file objects are purely in-memory constructs 
emulating files on a tape drive, with no relationship to on-disk files.

(Any half-decent Pascal compiler or interpreter will *also* give you ways 
to access real files on disk, but that isn't covered by the standard.)

Even when the file object does come from an actual disk file, we can 
conclude that before you can open a file for reading, it must exist; but 
having opened it, there is no *necessary* requirement that it *remains* 
on disk. If you try to read from an open file object whose underlying 
file has been deleted, there are three perfectly reasonable behaviours:

- you get an error;

- it is the same result as if the file has been truncated to zero bytes;

- deleting the file only deletes the *name*, not contents, until the 
  last open file handle is shut, and then the contents are deleted.


>  * Closing files requires an open file object 

Naturally; but open file objects don't require that the on-disk file 
still exists.


>  * And deleting files requires that the file NOT be open

Not at all.


> Would you also entertain the idea of reading or writing files that do
> not exist? (not including pseudo file objs like StringIO of course!).

Define "file" and "exist". Because you are conflating at least three 
different things:

a file name
an inode (blocks on a disk)
a file object

Of course it is useful to break the abstraction that file objects must be 
files on disk. StringIO is one such example. Standard Pascal file objects 
are another. Likewise, it is useful to be able to read from an inode that 
is no longer connected to a file name.

So, absolutely, yes, it is useful to be able to read and write from files 
that don't exist, under some circumstances.
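
A short sketch of the StringIO case, with an invented helper name (write_report) and invented data: code written against the file-object interface works unchanged when no file on disk is involved.

```python
import io


def write_report(stream, items):
    """Write one line per item to any object that has a write() method."""
    for item in items:
        stream.write("%s\n" % item)


buffer = io.StringIO()  # a file object with no file on disk behind it
write_report(buffer, ["idle", "running"])
contents = buffer.getvalue()
```

The same write_report call would work with a handle returned by open(), which is the point of keeping "file object" and "file on disk" as separate ideas.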


> Summary: Neat tricks and Easter eggs are real hoot, but consistency 
> in APIs is the key.

A foolish consistency is the hobgoblin of little minds.


-- 
Steven



#25276

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-07-13 13:29 -0400
Message-ID: <mailman.2091.1342200621.4697.python-list@python.org>
In reply to: #25250

On 13 Jul 2012 07:14:36 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> declaimed the following in
gmane.comp.python.general:

> 
> Open file objects do not necessarily correspond to files on disk. For 
> example, in standard Pascal, file objects are purely in-memory constructs 
> emulating files on a tape drive, with no relationship to on-disk files.
> 
> (Any half-decent Pascal compiler or interpreter will *also* give you ways 
> to access real files on disk, but that isn't covered by the standard.)
>
	Heck, in original Pascal, just reading from the console required
tricks (since Pascal I/O is defined with a 1 element look-ahead, and I/O
was opened at the start of execution). Mostly some internal flag to
delay console reads until the program actually accessed the buffer.

> a file name
> an inode (blocks on a disk)
> a file object
> 
> Of course it is useful to break the abstraction that file objects must be 
> files on disk. StringIO is one such example. Standard Pascal file objects 
> is another. Likewise, it is useful to be able to read from an inode that 
> is no longer connected to a file name.
>

	This does require a filesystem that uses "inode"s... Most computer
systems I've worked on didn't have that concept (they predated the POSIX
standard, and were not derived from UNIX).

	In the Amiga, the file name was stored in the file header block, not
in a directory block -- finding a file was done by hashing the file name
into an index in a directory block, and then following a linked list of
file header blocks until the desired name was found (and the file header
block contained pointers to the data blocks). {"Defragmenting" an Amiga
drive could get fun, as one not only moved data blocks around, but could
specify optimizations such as: position file header blocks immediately
before the file's data blocks [reducing head seeks when reading the
file] vs positioning file header blocks near their associated directory
block [reducing head seeks when reading file names]; reordering the
linked lists to put directory blocks ahead of file header blocks [making
directory navigation faster], etc.}

-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/



#25268

From: "Prasad, Ramit" <ramit.prasad@jpmorgan.com>
Date: 2012-07-13 16:00 +0000
Message-ID: <mailman.2082.1342195254.4697.python-list@python.org>
In reply to: #25245

> Well "neat tricks" aside, I am of the firm belief that deleting files should
> never be possible whilst they are open.

This is one of the few instances where I think Windows does something
better than OS X. Windows will check before you attempt to delete (i.e.
move to the Recycle Bin), while OS X will quite happily move a file to
the Trash and only tell me it cannot remove the file when I try to empty
the Trash.

Ramit


Ramit Prasad | JPMorgan Chase Investment Bank | Currencies Technology
712 Main Street | Houston, TX 77002
work phone: 713 - 216 - 5423

--


This email is confidential and subject to important disclaimers and
conditions including on offers for the purchase or sale of
securities, accuracy and completeness of information, viruses,
confidentiality, legal privilege, and legal entity disclaimers,
available at http://www.jpmorgan.com/pages/disclosures/email.  



#25278 — Re: [Python] RE: How to safely maintain a status file

From: Chris Gonnerman <chris@gonnerman.org>
Date: 2012-07-13 12:27 -0500
Subject: Re: [Python] RE: How to safely maintain a status file
Message-ID: <mailman.2093.1342201875.4697.python-list@python.org>
In reply to: #25245

On 07/13/2012 11:00 AM, Prasad, Ramit wrote:
>> Well "neat tricks" aside, I am of the firm belief that deleting files should
>> never be possible whilst they are open.
> This is one of the few instances I think Windows does something better
> than OS X. Windows will check before you attempt to delete (i.e. move
> to Recycling Bin) while OS X will move a file to Trash quite happily
> only tell me it cannot remove the file when I try to empty the Trash.

I, on the other hand, was trained in the Unix way and believe it is
entirely appropriate to delete an open file, even if my program is the
opener. It's just too handy to have temp files that disappear on their own.

As opposed to periodically going to %TEMP% and deleting them manually.  Gah.

-- Chris.



#25280 — RE: [Python] RE: How to safely maintain a status file

From: "Prasad, Ramit" <ramit.prasad@jpmorgan.com>
Date: 2012-07-13 17:59 +0000
Subject: RE: [Python] RE: How to safely maintain a status file
Message-ID: <mailman.2094.1342202431.4697.python-list@python.org>
In reply to: #25245

> >> Well "neat tricks" aside, I am of the firm belief that deleting files
> should
> >> never be possible whilst they are open.
> > This is one of the few instances I think Windows does something better
> > than OS X. Windows will check before you attempt to delete (i.e. move
> > to Recycling Bin) while OS X will move a file to Trash quite happily
> > only tell me it cannot remove the file when I try to empty the Trash.
> While I was trained in the Unix way, and believe it is entirely
> appropriate to delete an open file.  Even if I my program is the opener.
>   It's just too handy to have temp files that disappear on their own.
> 
> As opposed to periodically going to %TEMP% and deleting them manually.  Gah.

In my experience things that are "too handy" are usually breaking
what I consider "right". That being said, I am not entirely sure
what I think is "right" in this circumstance. I suppose it depends
on whether I am the person deleting or the person who is looking at
a file that is being deleted. Or the user who just wants the stupid
computer to just Work.

I lean slightly towards the POSIX handling with the addition that 
any additional write should throw an error. You are now saving to 
a file that will not exist the moment you close it and that is probably 
not expected.





Ramit



#25283 — Re: [Python] RE: How to safely maintain a status file

From: Hans Mulder <hansmu@xs4all.nl>
Date: 2012-07-13 20:28 +0200
Subject: Re: [Python] RE: How to safely maintain a status file
Message-ID: <500068bd$0$6893$e4fe514c@news2.news.xs4all.nl>
In reply to: #25280

On 13/07/12 19:59:59, Prasad, Ramit wrote:

> I lean slightly towards the POSIX handling with the addition that 
> any additional write should throw an error. You are now saving to 
> a file that will not exist the moment you close it and that is
> probably not expected.

I'd say: it depends.

If the amount of data your script needs to process does not fit
in RAM, then you may want to write some of it to a temporary file.
On a Posix system, it's entirely normal to unlink() a temp file
first thing after you've created it.  The expectation is that the
file will continue to exist, and be writeable, until you close it.

In fact, there's a function in the standard library named
tempfile.TemporaryFile that does exactly that: create a file
and unlink it immediately.  This function would be useless
if you couldn't write to your temporary file.
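
A minimal sketch of that use case, spilling data to an anonymous
temporary file and reading it back. The helper name spill_to_disk is
made up for illustration; tempfile.TemporaryFile itself is the standard
library function mentioned above:

```python
import tempfile


def spill_to_disk(chunks):
    """Write chunks to a nameless temp file, then read them all back."""
    # On Unix the directory entry is either never created or is removed
    # immediately; the file stays usable until it is closed.
    with tempfile.TemporaryFile(mode="w+") as tmp:
        for chunk in chunks:
            tmp.write(chunk)
        tmp.seek(0)  # rewind before reading
        return tmp.read()


if __name__ == "__main__":
    print(spill_to_disk(["alpha ", "beta ", "gamma"]))
```

The storage is released when the with block closes the file, so no
cleanup code is needed.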

Hope this helps,

-- HansM



#25286 — Re: [Python] RE: How to safely maintain a status file

From: MRAB <python@mrabarnett.plus.com>
Date: 2012-07-13 20:57 +0100
Subject: Re: [Python] RE: How to safely maintain a status file
Message-ID: <mailman.2099.1342209459.4697.python-list@python.org>
In reply to: #25283

On 13/07/2012 19:28, Hans Mulder wrote:
> On 13/07/12 19:59:59, Prasad, Ramit wrote:
>
>> I lean slightly towards the POSIX handling with the addition that
>> any additional write should throw an error. You are now saving to
>> a file that will not exist the moment you close it and that is
>> probably not expected.
>
Strictly speaking, the file does exist, it's just that there are no
names referring to it. Once all handles to it are also closed, the file
_can_ truly be deleted.

As has been said before, in the *nix world, "unlink" _doesn't_ delete
a file, it deletes a name.

> I'd say: it depends.
>
> If the amount of data your script needs to process does not fit
> in RAM, then you may want to write some of it to a temporary file.
> On a Posix system, it's entirely normal to unlink() a temp file
> first thing after you've created it.  The expectation is that the
> file will continue to exists, and be writeable, until you close it.
>
> In fact, there's a function in the standard library named
> tempfile.TemporaryFile that does exactly that: create a file
> and unlink it immediately.  This function would be useless
> if you couldn't write to your temporary file.
>
It's possible to create a temporary file even in Windows.



#25289 — Re: [Python] RE: How to safely maintain a status file

From: Christian Heimes <lists@cheimes.de>
Date: 2012-07-13 22:21 +0200
Subject: Re: [Python] RE: How to safely maintain a status file
Message-ID: <mailman.2101.1342210912.4697.python-list@python.org>
In reply to: #25283

On 13.07.2012 21:57, MRAB wrote:
> It's possible to create a temporary file even in Windows.

Windows has an open() flag named O_TEMPORARY for temporary files. With
O_TEMPORARY the file is removed from disk as soon as the file handle is
closed. On POSIX OS it's common practice to unlink temporary files
immediately after the open() call.
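
As a rough illustration of the two approaches, here is a sketch of a
helper that uses os.O_TEMPORARY where the os module provides it (Windows
only) and otherwise unlinks the file straight after open(). The names
open_self_deleting, demo and scratch.tmp are hypothetical, not library
names:

```python
import os
import tempfile


def open_self_deleting(path):
    """Create path so that it disappears once the descriptor is closed."""
    flags = os.O_RDWR | os.O_CREAT | os.O_EXCL
    if hasattr(os, "O_TEMPORARY"):
        # Windows: the OS removes the file when the handle is closed.
        return os.open(path, flags | os.O_TEMPORARY, 0o600)
    fd = os.open(path, flags, 0o600)
    os.unlink(path)  # POSIX: drop the name now, keep the descriptor
    return fd


def demo():
    """Write and read through the descriptor, then check the name is gone."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "scratch.tmp")
        fd = open_self_deleting(path)
        try:
            os.write(fd, b"scratch data")
            os.lseek(fd, 0, os.SEEK_SET)
            data = os.read(fd, 100)
        finally:
            os.close(fd)
        return data, not os.path.exists(path)


if __name__ == "__main__":
    print(demo())
```

In practice tempfile.TemporaryFile already hides this platform
difference; the sketch only spells out what the two mechanisms look like.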



#25282 — Re: [Python] RE: How to safely maintain a status file

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-07-14 04:19 +1000
Subject: Re: [Python] RE: How to safely maintain a status file
Message-ID: <mailman.2096.1342203567.4697.python-list@python.org>
In reply to: #25245

On Sat, Jul 14, 2012 at 3:59 AM, Prasad, Ramit
<ramit.prasad@jpmorgan.com> wrote:
> I lean slightly towards the POSIX handling with the addition that
> any additional write should throw an error. You are now saving to
> a file that will not exist the moment you close it and that is probably
> not expected.

There are several different possible "right behaviors" here, but they
depend more on the application than anything else. With a log file,
for instance, the act of deleting it is more a matter of truncating it
(dispose of the old history), so the right thing to do is to start a
fresh file. Solution: Close the file and re-open it periodically. But
I don't know of an efficient way to do that with Windows semantics.
Renaming/moving an open file in order to perform log rotation isn't
all that easy.
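
The "close the file and re-open it periodically" idea could be sketched
as follows. This assumes POSIX semantics, where the log file may be
deleted or renamed while the writer still has it open; the class name
ReopeningLog, its reopen_every parameter and the file name app.log are
all made up for illustration:

```python
import os
import tempfile


class ReopeningLog:
    """Append-only log that closes and re-opens its file every few writes."""

    def __init__(self, path, reopen_every=100):
        self.path = path
        self.reopen_every = reopen_every
        self._writes = 0
        self._file = open(path, "a")

    def write(self, line):
        if self._writes and self._writes % self.reopen_every == 0:
            # Re-open by name: if the old file was deleted or renamed
            # away, this starts a fresh file at the same path.
            self._file.close()
            self._file = open(self.path, "a")
        self._file.write(line + "\n")
        self._file.flush()
        self._writes += 1

    def close(self):
        self._file.close()


def demo():
    """Delete the log while it is open and show that a fresh one appears."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        log = ReopeningLog(path, reopen_every=2)
        log.write("one")
        log.write("two")
        os.remove(path)     # "rotate" by deleting the old history
        log.write("three")  # third write triggers the re-open
        log.close()
        with open(path) as f:
            return f.read()


if __name__ == "__main__":
    print(demo())
```

Lines written between the deletion and the next re-open go to the old,
nameless file and are lost when it is closed, which is the cost of
re-opening on a schedule. The standard library's
logging.handlers.WatchedFileHandler takes a different approach and
re-opens when it notices that the file at the path has changed.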

ChrisA


