


Re: How to safely maintain a status file

Started by: Christian Heimes <lists@cheimes.de>
First post: 2012-07-12 15:05 +0200
Last post: 2012-07-14 14:38 +0200
Articles: 4 (3 participants)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: How to safely maintain a status file Christian Heimes <lists@cheimes.de> - 2012-07-12 15:05 +0200
    Re: How to safely maintain a status file Ross Ridge <rridge@csclub.uwaterloo.ca> - 2012-07-12 11:48 -0400
    Re: How to safely maintain a status file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-07-13 01:52 +0000
      Re: How to safely maintain a status file Christian Heimes <lists@cheimes.de> - 2012-07-14 14:38 +0200

#25213 — Re: How to safely maintain a status file

From: Christian Heimes <lists@cheimes.de>
Date: 2012-07-12 15:05 +0200
Subject: Re: How to safely maintain a status file
Message-ID: <mailman.2038.1342098831.4697.python-list@python.org>
On 12.07.2012 14:30, Laszlo Nagy wrote:
> This is not a contradiction. Although the rename operation is atomic,
> the whole "change status" process is not. It is because there are two
> operations: #1 delete old status file and #2. rename the new status
> file. And because there are two operations, there is still a race
> condition. I see no contradiction here.

Sorry, but you are wrong. It's just one operation that boils down to
"point name to a different inode". After the rename op the file name
either points to a different inode or, in case of an error, still points
to the old inode. The OS guarantees that all processes see either the
first or the second state (in other words: atomic).

POSIX has no operation that actually deletes a file. It just has an
unlink() syscall that removes an associated name from an inode. As soon
as an inode has no names and is not referenced by a file descriptor, the
file content and inode are removed by the operating system. rename() is
more like a link() followed by an unlink() wrapped in a system-wide
global lock.
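
A small sketch of the "point name to a different inode" idea, assuming a
POSIX filesystem. The function name and the file names "status" and
"status.new" are made up for illustration. It records the inode numbers
before the rename and the inode the name refers to afterwards.

```python
import os
import tempfile


def inodes_around_rename():
    """Return (old_inode, new_inode, inode_after, pending_still_exists)."""
    with tempfile.TemporaryDirectory() as d:
        status = os.path.join(d, "status")        # hypothetical file name
        pending = os.path.join(d, "status.new")   # hypothetical file name
        with open(status, "w") as f:
            f.write("old\n")
        with open(pending, "w") as f:
            f.write("new\n")

        old_inode = os.stat(status).st_ino
        new_inode = os.stat(pending).st_ino

        # One step: the name "status" now refers to the inode of "status.new".
        os.replace(pending, status)

        return (old_inode, new_inode,
                os.stat(status).st_ino, os.path.exists(pending))
```

After the call, the name "status" carries the inode that used to belong to
"status.new", and the name "status.new" is gone. There is no separate
delete step in between.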

> It is not entirely true. We are talking about two processes. One is
> reading a file, another one is writing it. They can run at the same
> time, so forcibly flushing the disk cache won't help.

You need to flush the data to disk as well as the metadata of the file
and its directory in order to survive a system crash. The close()
syscall already makes sure that all data is flushed into the IO layer of
the operating system.
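
One way to put the steps together in Python, as a sketch rather than a
definitive recipe. The function name write_status_atomically is made up.
os.replace() needs Python 3.3 or later; on POSIX, os.rename() overwrites
the target in the same way. The directory fsync is attempted only where
os.O_DIRECTORY exists, which is an assumption about the platform.

```python
import os
import tempfile


def write_status_atomically(path, data):
    """Write bytes to path via a temp file, fsync and an atomic rename."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must be in the same directory (same filesystem),
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".status-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # Python buffer -> OS
            os.fsync(tmp.fileno())   # OS -> storage device (file data)
        os.replace(tmp_path, path)   # atomically repoint the name
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Flush the directory entry as well, where the platform allows it.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

Usage would look like write_status_atomically("status", b"running\n").
Whether the data really reaches the physical media still depends on the
storage device honouring the sync request.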

With POSIX semantics the reading process will either see the full
content before the rename op or the full content after the rename op.
The writing process can replace the name (rename op) while the reading
process reads the status file because its file descriptor still points
to the old status file.
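
The reader's side can be sketched like this, again assuming POSIX
semantics. The function and file names are made up for illustration. On
Windows the os.replace() call would typically fail while the file is
still open, so this is not portable.

```python
import os
import tempfile


def reader_keeps_old_content():
    """Return (content seen by an already-open reader, content seen by a new reader)."""
    with tempfile.TemporaryDirectory() as d:
        status = os.path.join(d, "status")        # hypothetical file name
        pending = os.path.join(d, "status.new")   # hypothetical file name
        with open(status, "w") as f:
            f.write("old")
        with open(pending, "w") as f:
            f.write("new")

        reader = open(status)         # reader opens before the rename
        os.replace(pending, status)   # writer swaps the name meanwhile
        seen_by_open_reader = reader.read()
        reader.close()

        with open(status) as f:       # a reader that opens after the rename
            seen_by_new_reader = f.read()

        return seen_by_open_reader, seen_by_new_reader
```

The already-open descriptor still refers to the old inode, so that reader
gets the complete old content, while anyone opening the name afterwards
gets the complete new content.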

Christian



#25224

From: Ross Ridge <rridge@csclub.uwaterloo.ca>
Date: 2012-07-12 11:48 -0400
Message-ID: <jtmrkr$5oc$1@rumours.uwaterloo.ca>
In reply to: #25213
Laszlo Nagy:
> This is not a contradiction. Although the rename operation is atomic,
> the whole "change status" process is not. It is because there are two
> operations: #1 delete old status file and #2. rename the new status
> file. And because there are two operations, there is still a race
> condition. I see no contradiction here.

Christian Heimes  <lists@cheimes.de> wrote:
>Sorry, but you are wrong. It's just one operation that boils down to
>"point name to a different inode".

For some reason you're assuming POSIX semantics, an assumption that
Laszlo Nagy did not make.

					Ross Ridge

-- 
 l/  //	  Ross Ridge -- The Great HTMU
[oo][oo]  rridge@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //	  



#25234

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-07-13 01:52 +0000
Message-ID: <4fff7f4c$0$29965$c3e8da3$5496439d@news.astraweb.com>
In reply to: #25213
On Thu, 12 Jul 2012 15:05:26 +0200, Christian Heimes wrote:

> You need to flush the data to disk as well as the metadata of the file
> and its directory in order to survive a system crash. The close()
> syscall already makes sure that all data is flushed into the IO layer of
> the operating system.

And some storage devices (e.g. hard drives, USB sticks) don't actually 
write data permanently even when you sync the device. They just write to 
a temporary cache, then report that they are done (liar liar pants on 
fire). Only when the cache is full, or at some random time at the 
device's choosing, do they actually write data to the physical media. 

The result of this is that even when the device tells you that the data 
is synched, it may not be.


-- 
Steven



#25311

From: Christian Heimes <lists@cheimes.de>
Date: 2012-07-14 14:38 +0200
Message-ID: <mailman.2113.1342269522.4697.python-list@python.org>
In reply to: #25234
On 13.07.2012 03:52, Steven D'Aprano wrote:
> And some storage devices (e.g. hard drives, USB sticks) don't actually 
> write data permanently even when you sync the device. They just write to 
> a temporary cache, then report that they are done (liar liar pants on 
> fire). Only when the cache is full, or at some random time at the 
> device's choosing, do they actually write data to the physical media. 
> 
> The result of this is that even when the device tells you that the data 
> is synched, it may not be.

Yes, that's another issue. Either you have to buy expensive enterprise
hardware with UPS batteries or you need to compensate for failures at
the software level (e.g. a Hadoop cluster).

We have big storage devices with double redundant controllers, on-board
buffer batteries, triple redundant power supplies, special RAID disks,
multipath IO Fibre Channel links and an external backup solution to keep
our data reasonably safe.

Christian
