

Groups > comp.lang.python > #43394 > unrolled thread

shutil.copyfile is incomplete (truncated)

Started by: Rob Schneider <rmschne@gmail.com>
First post: 2013-04-11 11:12 -0700
Last post: 2013-04-12 10:48 -0400
Articles: 12 on this page of 32; 11 participants



Contents

  shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 11:12 -0700
    Re: shutil.copyfile is incomplete (truncated) Neil Cerutti <neilc@norwich.edu> - 2013-04-11 18:53 +0000
      Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 12:07 -0700
        Re: shutil.copyfile is incomplete (truncated) Neil Cerutti <neilc@norwich.edu> - 2013-04-11 19:55 +0000
          Re: shutil.copyfile is incomplete (truncated) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-12 00:06 +0000
            Re: shutil.copyfile is incomplete (truncated) Cameron Simpson <cs@zip.com.au> - 2013-04-12 11:15 +1000
            Re: shutil.copyfile is incomplete (truncated) Ned Deily <nad@acm.org> - 2013-04-11 18:33 -0700
              Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:32 -0700
                Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:53 -0700
                  Re: shutil.copyfile is incomplete (truncated) Ned Deily <nad@acm.org> - 2013-04-12 00:53 -0700
                Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:53 -0700
                Re: shutil.copyfile is incomplete (truncated) Cameron Simpson <cs@zip.com.au> - 2013-04-12 18:26 +1000
                  Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 02:18 -0700
                    Re: shutil.copyfile is incomplete (truncated) Chris Angelico <rosuav@gmail.com> - 2013-04-12 19:22 +1000
                      Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 05:07 -0700
                        Re: shutil.copyfile is incomplete (truncated) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-12 13:18 +0100
                      Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 05:07 -0700
                  Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 02:18 -0700
                    Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:57 -0400
                  Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:54 -0400
              Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:32 -0700
                Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:47 -0400
            Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:27 -0700
            Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:51 -0400
            Re: shutil.copyfile is incomplete (truncated) 88888 Dihedral <dihedral88888@googlemail.com> - 2013-04-12 08:49 -0700
            Re: shutil.copyfile is incomplete (truncated) Nobody <nobody@nowhere.com> - 2013-04-13 03:33 +0100
              Re: shutil.copyfile is incomplete (truncated) Chris Angelico <rosuav@gmail.com> - 2013-04-13 13:05 +1000
              [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-13 03:17 +0000
                Re: [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)] Chris Angelico <rosuav@gmail.com> - 2013-04-13 13:43 +1000
          Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:25 -0700
            Re: shutil.copyfile is incomplete (truncated) Chris Angelico <rosuav@gmail.com> - 2013-04-12 17:32 +1000
            Re: shutil.copyfile is incomplete (truncated) Terry Jan Reedy <tjreedy@udel.edu> - 2013-04-12 10:48 -0400



#43428

From: Rob Schneider <rmschne@gmail.com>
Date: 2013-04-11 23:32 -0700
Message-ID: <mailman.506.1365751267.3114.python-list@python.org>
In reply to: #43412

> > Or that the filesystem may be full? Of course, that's usually obvious
> > more widely when it happens...
> >
> > Question: is the size of the incomplete file a round number? (Like
> > a multiple of a decent sized power of 2?)
>
> Also on what OS X file system type does the file being created reside,
> in particular, is it a network file system?

File system not full (2/3 of disk is free)

Source (correct one) is 47,970 bytes. Target after copy is 45,056 bytes. I've tried changing what gets written to change the file size. It is usually this sort of difference.

The file system is Mac OS Extended Journaled (default as out of the box).  
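
As a rough check on the "round number" question quoted above, the sizes reported here can be examined directly. The sketch below is only an illustration (the helper name is made up for this note): it finds the largest power of two that divides each size. 45,056 turns out to be exactly 11 x 4096, which would be consistent with whole 4 KiB blocks reaching the disk while the tail stayed in a buffer, though the numbers alone do not prove that.

```python
def largest_power_of_two_divisor(n):
    """Return the largest power of two that divides the positive integer n."""
    return n & -n   # isolates the lowest set bit of n


source_size = 47970   # size of the correct file, as reported in the post
target_size = 45056   # size of the truncated copy, as reported in the post

print(largest_power_of_two_divisor(target_size))   # 4096
print(target_size // 4096)                         # 11
print(source_size - target_size)                   # 2914 bytes missing
```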



#43456

From: Roy Smith <roy@panix.com>
Date: 2013-04-12 10:47 -0400
Message-ID: <roy-0DAB01.10473112042013@news.panix.com>
In reply to: #43428

In article <mailman.506.1365751267.3114.python-list@python.org>,
 Rob Schneider <rmschne@gmail.com> wrote:

> Source (correct one) is 47,970 bytes. Target after copy is 45,056 bytes.
> I've tried changing what gets written to change the file size. It is usually
> this sort of difference.
> 
> The file system is Mac OS Extended Journaled (default as out of the box).  

Is it always the tail end of the file that gets truncated, or is it
missing (or mutating) data in the middle of the file?  I'm just grasping
at straws here, but maybe it's somehow messing up line endings (turning
CRLF pairs into just LF), or using some other kind of encoding for
Unicode characters?

If you compare the files with cmp, does it say:

$ cmp original truncated 
cmp: EOF on truncated

that's what I would expect if it's a strict truncation.  If it says 
anything else, you've got a data munging problem.
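
If cmp isn't handy, the same distinction can be drawn from Python. This is just a sketch: the function name and the three result strings are invented for illustration, and it reads both files fully into memory, which is fine for files of a few tens of kilobytes.

```python
def compare_copy(original_path, copy_path):
    """Classify a copy as 'identical', 'truncated', or 'munged'."""
    with open(original_path, "rb") as f:
        original = f.read()
    with open(copy_path, "rb") as f:
        copy = f.read()

    if copy == original:
        return "identical"
    if original.startswith(copy):
        # The copy is a strict prefix of the original: only the tail is missing.
        return "truncated"
    # Bytes differ somewhere before the end of the copy.
    return "munged"
```

A "truncated" result corresponds to the "EOF on truncated" message from cmp; "munged" corresponds to anything else.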

What I would normally do around this time is run a system call trace on
the process to watch all the descriptor-related (i.e. open, create,
write) system calls.  On OS X, that means dtruss.  Unfortunately, I'm
not that familiar with the OS X variant so I can't give you specific
advice about which options to use.

When you can see the system calls, you know exactly what your process is 
doing.  You should be able to see the output file being opened and a 
descriptor returned, then find all the write() calls to that descriptor.  
You'll also be able to find any other system calls on that pathname 
after the descriptor is closed.

Please report back what you find!

Oh, another trick you might want to try is making the output file path 
/dev/stdout and redirecting the output into a file with the shell.  See 
if that makes any difference.  Or, try something like (assuming the -o 
option to your script sets the output filename):

python my_prog.py -o /dev/stdout | dd bs=1 of=xxx

That will do a couple of things.  First, dd will report how many bytes 
it read and wrote, so you can see if that's the correct number.  Also, 
since your process will no longer be writing to a real file, if anything 
is doing something weird like a seek() after you're done writing, that 
will fail since you can't seek() on a pipe.



#43423

From: Rob Schneider <rmschne@gmail.com>
Date: 2013-04-11 23:27 -0700
Message-ID: <e088d429-586d-4f90-88b9-4f73a0419280@googlegroups.com>
In reply to: #43405

> I would consider the chance that the disk may be faulty, or the file
> system is corrupt. Does the problem go away if you write to a different
> file system or a different disk?

It's a relatively new MacBook Pro with a solid state disk.  I've not noticed any other disk problems. I did a "repair permissions" (for what it's worth). Maybe I'll have it tested at the Genius Bar.  I don't have the full system on another computer to try that; but will work on that today.



#43458

From: Roy Smith <roy@panix.com>
Date: 2013-04-12 10:51 -0400
Message-ID: <roy-81341B.10510012042013@news.panix.com>
In reply to: #43405

In article <51674ffc$0$29977$c3e8da3$5496439d@news.astraweb.com>,
 Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> On Thu, 11 Apr 2013 19:55:53 +0000, Neil Cerutti wrote:
> 
> > On 2013-04-11, Rob Schneider <rmschne@gmail.com> wrote:
> >> Thanks. Yes, there is a close function call  before the copy is
> >> launched. No other writes. Does Python wait for file close command to
> >> complete before proceeding?
> > 
> > The close method is defined as flushing and closing a file, so it
> > should not return until that's done.
> 
> But note that "done" in this case means "the file system thinks it is 
> done", not *actually* done. Hard drives, especially the cheaper ones, 
> lie. They can say the file is written when in fact the data is still in 
> the hard drive's internal cache and not written to the disk platter. 
> Also, in my experience, hardware RAID controllers will eat your data, and 
> then your brains when you try to diagnose the problem.
> 
> I would consider the chance that the disk may be faulty, or the file 
> system is corrupt. Does the problem go away if you write to a different 
> file system or a different disk?

It is *possible* that this is the problem, but it's really way far out 
on the long tail of possibilities.  If the file system were corrupted or 
the disk faulty, the odds are you would be seeing all sorts of other 
problems.  And this would not be anywhere near as repeatable as the OP 
is describing.

Think horses, not zebras.



#43466

From: 88888 Dihedral <dihedral88888@googlemail.com>
Date: 2013-04-12 08:49 -0700
Message-ID: <423f86d9-445b-4578-a845-208d003a7587@googlegroups.com>
In reply to: #43405

On Friday, April 12, 2013 at 8:06:21 AM UTC+8, Steven D'Aprano wrote:
> On Thu, 11 Apr 2013 19:55:53 +0000, Neil Cerutti wrote:
>
> > On 2013-04-11, Rob Schneider <rmschne@gmail.com> wrote:
> >> Thanks. Yes, there is a close function call  before the copy is
> >> launched. No other writes. Does Python wait for file close command to
> >> complete before proceeding?
> >
> > The close method is defined as flushing and closing a file, so it
> > should not return until that's done.
>
> But note that "done" in this case means "the file system thinks it is
> done", not *actually* done. Hard drives, especially the cheaper ones,
> lie. They can say the file is written when in fact the data is still in
> the hard drive's internal cache and not written to the disk platter.
> Also, in my experience, hardware RAID controllers will eat your data, and
> then your brains when you try to diagnose the problem.

Don't you model this as a non-blocking operation in
your program?

> I would consider the chance that the disk may be faulty, or the file
> system is corrupt. Does the problem go away if you write to a different
> file system or a different disk?
>
> --
> Steven


Back-ups and read-back verifications are important for 
those who care.
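
A read-back verification after a copy might look roughly like the sketch below. The function name is made up for illustration; shutil.copyfile and filecmp.cmp are standard library calls.

```python
import filecmp
import shutil


def copy_and_verify(src, dst):
    """Copy src to dst, then read both back and compare byte for byte."""
    shutil.copyfile(src, dst)
    # shallow=False forces a content comparison instead of trusting os.stat()
    if not filecmp.cmp(src, dst, shallow=False):
        raise IOError("copy of %r does not match the original" % src)
    return dst
```

Note that a check like this only compares what is visible in the two files at that moment. It would not notice data that the program itself still holds in an unflushed buffer for the source file.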



#43496

From: Nobody <nobody@nowhere.com>
Date: 2013-04-13 03:33 +0100
Message-ID: <pan.2013.04.13.02.33.28.725000@nowhere.com>
In reply to: #43405

On Fri, 12 Apr 2013 00:06:21 +0000, Steven D'Aprano wrote:

>> The close method is defined as flushing and closing a file, so it
>> should not return until that's done.
> 
> But note that "done" in this case means "the file system thinks it is 
> done", not *actually* done. Hard drives, especially the cheaper ones, 
> lie. They can say the file is written when in fact the data is still in 
> the hard drive's internal cache and not written to the disk platter. 
> Also, in my experience, hardware RAID controllers will eat your data, and 
> then your brains when you try to diagnose the problem.

None of which is likely to be relevant here, as any subsequent access to
the file will reference the in-memory copy; the disk will only get
involved if the data has already been flushed from the OS' cache and has
to be read back in from disk.

write(), close(), etc. return once the data has been written to the
OS' disk cache. At that point, the OS usually won't have even started
sending the data to the drive, let alone waited for the drive to report
(or claim) that the data has been written to the physical disk.

If you want to wait for the data to be written to the physical
disk (in order to obtain specific behaviour with respect to an unclean
shutdown), use f.flush() followed by os.fsync(f.fileno()).

But most of the time, there's no point. If you actually care about what
happens in the event of an unclean shutdown, you typically also need to
sync the directory, otherwise the file's contents will get sync'd but the
file's very existence might not be.
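
For illustration, the flush/fsync sequence plus a directory sync could be sketched as follows. The function name is hypothetical, and opening a directory with os.open() to fsync it is assumed to work as on typical POSIX systems; it does not work on Windows.

```python
import os


def write_durably(path, data):
    """Write bytes to path, then ask the OS to push file and directory to disk."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # Python's buffer -> OS cache
        os.fsync(f.fileno())      # OS cache -> drive (as far as the OS can tell)

    # Sync the containing directory so the new directory entry is durable too.
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

This only asks the operating system to write the data out; whether the drive itself honours the request is a separate question.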



#43497

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-04-13 13:05 +1000
Message-ID: <mailman.543.1365822365.3114.python-list@python.org>
In reply to: #43496

On Sat, Apr 13, 2013 at 12:33 PM, Nobody <nobody@nowhere.com> wrote:
> But most of the time, there's no point. If you actually care about what
> happens in the event of an unclean shutdown, you typically also need to
> sync the directory, otherwise the file's contents will get sync'd but the
> file's very existence might not be.

Or just store your content in a PostgreSQL database, and let it worry
about all the platform-specific details of how to fsync reliably.

ChrisA



#43498 — [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)]

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2013-04-13 03:17 +0000
Subject: [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)]
Message-ID: <5168ce64$0$29977$c3e8da3$5496439d@news.astraweb.com>
In reply to: #43496

On Sat, 13 Apr 2013 03:33:29 +0100, Nobody wrote:

> On Fri, 12 Apr 2013 00:06:21 +0000, Steven D'Aprano wrote:
> 
>>> The close method is defined as flushing and closing a file, so it
>>> should not return until that's done.
>> 
>> But note that "done" in this case means "the file system thinks it is
>> done", not *actually* done. Hard drives, especially the cheaper ones,
>> lie. They can say the file is written when in fact the data is still in
>> the hard drive's internal cache and not written to the disk platter.
>> Also, in my experience, hardware RAID controllers will eat your data,
>> and then your brains when you try to diagnose the problem.
> 
> None of which is likely to be relevant here, 

Since we've actually identified the bug (the OP was using file.close 
without actually calling it), that's certainly the case :-)


[...]
> If you want to wait for the data to be written to the physical
> disk (in order to obtain specific behaviour with respect to an unclean
> shutdown), use f.flush() followed by os.fsync(f.fileno()).

If only it were that simple. It has been documented that some disks will 
lie, even when told to sync. When I say "some", I mean *most*. There's 
probably nothing you can do about it, apart from not using that model or 
brand of disk, so you have to just live with the risk.

http://queue.acm.org/detail.cfm?id=2367378

USB sticks are especially nasty. I've got quite a few USB thumb drives 
where the "write" light keeps flickering for anything up to five minutes 
after the OS reports that the drive has been unmounted and is safe to 
unplug. I corrupted the data on these quite a few times until I noticed 
the light. And let's not even mention the drives that have no light at 
all...

But my favourite example of lying hard drives of all time is this:

http://blog.jitbit.com/2011/04/chinese-magic-drive.html

I want one of those!



-- 
Steven



#43499 — Re: [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)]

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-04-13 13:43 +1000
Subject: Re: [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)]
Message-ID: <mailman.544.1365824646.3114.python-list@python.org>
In reply to: #43498

On Sat, Apr 13, 2013 at 1:17 PM, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> wrote:
> On Sat, 13 Apr 2013 03:33:29 +0100, Nobody wrote:
>> If you want to wait for the data to be written to the physical
>> disk (in order to obtain specific behaviour with respect to an unclean
>> shutdown), use f.flush() followed by os.fsync(f.fileno()).
>
> If only it were that simple. It has been documented that some disks will
> lie, even when told to sync. When I say "some", I mean *most*. There's
> probably nothing you can do about it, apart from not using that model or
> brand of disk, so you have to just live with the risk.

It's often close to that simple. With most hard disks, you can make
them 100% reliable, but you may have to check some disk parameters (on
Linux, that's just a matter of writing to something in /proc
somewhere, don't remember the details but it's easy to check). The
worst offenders I've met are SSDs...

> USB sticks are especially nasty. I've got quite a few USB thumb drives
> where the "write" light keeps flickering for anything up to five minutes
> after the OS reports that the drive has been unmounted and is safe to
> unplug. I corrupted the data on these quite a few times until I noticed
> the light. And let's not even mention the drives that have no light at
> all...

... but you've met worse.

> But my favourite example of lying hard drives of all time is this:
>
> http://blog.jitbit.com/2011/04/chinese-magic-drive.html
>
> I want one of those!

Awesome! It's the new version of DoubleSpace / DriveSpace!

http://en.wikipedia.org/wiki/DriveSpace

(And its problems, according to that Wikipedia article, actually had
the same root cause - write caching that the user wasn't aware of.
Great.)

ChrisA



#43422

From: Rob Schneider <rmschne@gmail.com>
Date: 2013-04-11 23:25 -0700
Message-ID: <d457256e-a195-4c14-ac7f-0afb6b45dc15@googlegroups.com>
In reply to: #43402

> The close method is defined as flushing and closing a file, so
> it should not return until that's done.
>
> What command are you using to create the temp file?

re command to write the file: 
f=open(fn,'w')
... then create HTML text in a string
f.write(html)
f.close 



#43430

From: Chris Angelico <rosuav@gmail.com>
Date: 2013-04-12 17:32 +1000
Message-ID: <mailman.508.1365751943.3114.python-list@python.org>
In reply to: #43422

On Fri, Apr 12, 2013 at 4:25 PM, Rob Schneider <rmschne@gmail.com> wrote:
>
>> The close method is defined as flushing and closing a file, so
>> it should not return until that's done.
>>
>> What command are you using to create the temp file?
>
> re command to write the file:
> f=open(fn,'w')
> ... then create HTML text in a string
> f.write(html)
> f.close

Hold it one moment... You're not actually calling close. The file's
still open. Is that a copy/paste problem, or is that your actual code?

In Python, a function call ALWAYS has parentheses after it. Evaluating
a function's name like that returns the function (or method) object,
which you then do nothing with. (You could assign it someplace, for
instance, and call it later.) Try adding empty parens:

f.close()

and see if that solves the problem. Alternatively, look into the
'with' statement and the block syntax that it can give to I/O
operations.

ChrisA
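
To see why the missing parentheses matter, here is a small self-contained sketch. The function name, the temporary file, and the loop of small writes are invented for illustration (the original code built one HTML string). How many bytes are on disk before the real close depends on the Python version and its buffer sizes, so no exact number is claimed.

```python
import os
import tempfile


def demo_missing_parens():
    """Show that `f.close` without parentheses leaves buffered data unwritten."""
    fd, path = tempfile.mkstemp(suffix=".html")
    os.close(fd)
    try:
        f = open(path, "w")
        for i in range(1000):
            f.write("<p>row %04d of a generated HTML report</p>\n" % i)

        f.close                    # only looks up the method; nothing is called
        still_open = not f.closed  # the file object is still open here
        size_while_open = os.path.getsize(path)   # what a copy made now would see

        f.close()                  # the actual call: flushes buffers and closes
        size_after_close = os.path.getsize(path)
        return still_open, size_while_open, size_after_close
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo_missing_parens())
```

The second number is typically smaller than the third: the tail of the data is still sitting in Python's buffers when the size is first checked, which is the same situation shutil.copyfile would find.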



#43457

From: Terry Jan Reedy <tjreedy@udel.edu>
Date: 2013-04-12 10:48 -0400
Message-ID: <mailman.523.1365778206.3114.python-list@python.org>
In reply to: #43422

On 4/12/2013 3:32 AM, Chris Angelico wrote:
> On Fri, Apr 12, 2013 at 4:25 PM, Rob Schneider <rmschne@gmail.com> wrote:
>>
>>> The close method is defined as flushing and closing a file, so
>>> it should not return until that's done.
>>>
>>> What command are you using to create the temp file?
>>
>> re command to write the file:
>> f=open(fn,'w')
>> ... then create HTML text in a string
>> f.write(html)
>> f.close
>
> Hold it one moment... You're not actually calling close. The file's
> still open. Is that a copy/paste problem, or is that your actual code?
>
> In Python, a function call ALWAYS has parentheses after it. Evaluating
> a function's name like that returns the function (or method) object,
> which you then do nothing with. (You could assign it someplace, for
> instance, and call it later.) Try adding empty parens:
>
> f.close()
>
> and see if that solves the problem. Alternatively, look into the
> 'with' statement and the block syntax that it can give to I/O
> operations.

I say *definitely* use a 'with' statement. Part of its purpose is to 
avoid close bugs.
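
A minimal sketch of the 'with' version of the write-then-copy sequence, assuming the HTML is already in a string. The function and parameter names are hypothetical; only open() and shutil.copyfile come from the thread.

```python
import shutil


def write_then_copy(html, src_path, dst_path):
    """Write html to src_path, then copy the finished file to dst_path."""
    with open(src_path, "w") as f:
        f.write(html)
    # Leaving the 'with' block has flushed and closed the file,
    # so the copy below sees the complete contents.
    shutil.copyfile(src_path, dst_path)
```

With this form there is no close call to forget or mistype; the file is closed when the block exits, even if f.write() raises.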
