| Started by | Rob Schneider <rmschne@gmail.com> |
|---|---|
| First post | 2013-04-11 11:12 -0700 |
| Last post | 2013-04-12 10:48 -0400 |
| Articles | 12 on this page of 32 — 11 participants |
shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 11:12 -0700
Re: shutil.copyfile is incomplete (truncated) Neil Cerutti <neilc@norwich.edu> - 2013-04-11 18:53 +0000
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 12:07 -0700
Re: shutil.copyfile is incomplete (truncated) Neil Cerutti <neilc@norwich.edu> - 2013-04-11 19:55 +0000
Re: shutil.copyfile is incomplete (truncated) Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-12 00:06 +0000
Re: shutil.copyfile is incomplete (truncated) Cameron Simpson <cs@zip.com.au> - 2013-04-12 11:15 +1000
Re: shutil.copyfile is incomplete (truncated) Ned Deily <nad@acm.org> - 2013-04-11 18:33 -0700
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:32 -0700
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:53 -0700
Re: shutil.copyfile is incomplete (truncated) Ned Deily <nad@acm.org> - 2013-04-12 00:53 -0700
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:53 -0700
Re: shutil.copyfile is incomplete (truncated) Cameron Simpson <cs@zip.com.au> - 2013-04-12 18:26 +1000
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 02:18 -0700
Re: shutil.copyfile is incomplete (truncated) Chris Angelico <rosuav@gmail.com> - 2013-04-12 19:22 +1000
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 05:07 -0700
Re: shutil.copyfile is incomplete (truncated) Mark Lawrence <breamoreboy@yahoo.co.uk> - 2013-04-12 13:18 +0100
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 05:07 -0700
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-12 02:18 -0700
Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:57 -0400
Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:54 -0400
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:32 -0700
Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:47 -0400
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:27 -0700
Re: shutil.copyfile is incomplete (truncated) Roy Smith <roy@panix.com> - 2013-04-12 10:51 -0400
Re: shutil.copyfile is incomplete (truncated) 88888 Dihedral <dihedral88888@googlemail.com> - 2013-04-12 08:49 -0700
Re: shutil.copyfile is incomplete (truncated) Nobody <nobody@nowhere.com> - 2013-04-13 03:33 +0100
Re: shutil.copyfile is incomplete (truncated) Chris Angelico <rosuav@gmail.com> - 2013-04-13 13:05 +1000
[OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)] Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2013-04-13 03:17 +0000
Re: [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)] Chris Angelico <rosuav@gmail.com> - 2013-04-13 13:43 +1000
Re: shutil.copyfile is incomplete (truncated) Rob Schneider <rmschne@gmail.com> - 2013-04-11 23:25 -0700
Re: shutil.copyfile is incomplete (truncated) Chris Angelico <rosuav@gmail.com> - 2013-04-12 17:32 +1000
Re: shutil.copyfile is incomplete (truncated) Terry Jan Reedy <tjreedy@udel.edu> - 2013-04-12 10:48 -0400
| From | Rob Schneider <rmschne@gmail.com> |
|---|---|
| Date | 2013-04-11 23:32 -0700 |
| Message-ID | <mailman.506.1365751267.3114.python-list@python.org> |
| In reply to | #43412 |
> > > Or that the filesystem may be full? Of course, that's usually obvious
> > > more widely when it happens...
> > >
> > > Question: is the size of the incomplete file a round number? (Like
> > > a multiple of a decent sized power of 2?)
> >
> > Also on what OS X file system type does the file being created reside,
> > in particular, is it a network file system?

File system not full (2/3 of disk is free).

Source (correct one) is 47,970 bytes. Target after copy is 45,056 bytes. I've tried changing what gets written to change the file size. It is usually this sort of difference.

The file system is Mac OS Extended Journaled (default as out of the box).
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-04-12 10:47 -0400 |
| Message-ID | <roy-0DAB01.10473112042013@news.panix.com> |
| In reply to | #43428 |
In article <mailman.506.1365751267.3114.python-list@python.org>, Rob Schneider <rmschne@gmail.com> wrote:

> Source (correct one) is 47,970 bytes. Target after copy is 45,056 bytes.
> I've tried changing what gets written to change the file size. It is usually
> this sort of difference.
>
> The file system is Mac OS Extended Journaled (default as out of the box).

Is it always the tail end of the file that gets truncated, or is it missing (or mutating) data in the middle of the file? I'm just grasping at straws here, but maybe it's somehow messing up line endings (turning CRLF pairs into just LF), or using some other kind of encoding for unicode characters?

If you compare the files with cmp, does it say:

    $ cmp original truncated
    cmp: EOF on truncated

That's what I would expect if it's a strict truncation. If it says anything else, you've got a data munging problem.

What I would normally do around this time is run a system call trace on the process to watch all the descriptor related (i.e. open, create, write) system calls. On OSX, that means dtruss. Unfortunately, I'm not that familiar with the OSX variant so I can't give you specific advice about which options to use.

When you can see the system calls, you know exactly what your process is doing. You should be able to see the output file being opened and a descriptor returned, then find all the write() calls to that descriptor. You'll also be able to find any other system calls on that pathname after the descriptor is closed.

Please report back what you find!

Oh, another trick you might want to try is making the output file path /dev/stdout and redirecting the output into a file with the shell. See if that makes any difference. Or, try something like (assuming the -o option to your script sets the output filename):

    python my_prog.py -o /dev/stdout | dd bs=1 of=xxx

That will do a couple of things. First, dd will report how many bytes it read and wrote, so you can see if that's the correct number. Also, since your process will no longer be writing to a real file, if anything is doing something weird like a seek() after you're done writing, that will fail since you can't seek() on a pipe.
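The cmp check Roy describes can also be approximated from Python when cmp is not to hand. What follows is a minimal sketch and not something posted in the thread: `compare_files`, its `chunk_size` parameter and the three result labels are hypothetical names chosen for illustration. It reports whether one file is a strict prefix of the other (a pure truncation) or whether the contents diverge somewhere in the middle (data munging).

```python
def compare_files(original, copy, chunk_size=65536):
    """Compare two files byte by byte.

    Returns a (kind, offset) tuple, where kind is one of:
      "identical" - same bytes, offset is the common length
      "truncated" - one file is a strict prefix of the other,
                    offset is where the shorter file ends
      "differs"   - contents diverge, offset is the first differing byte
    (Hypothetical helper; the labels are illustrative, not cmp's output.)
    """
    offset = 0
    with open(original, "rb") as a, open(copy, "rb") as b:
        while True:
            chunk_a = a.read(chunk_size)
            chunk_b = b.read(chunk_size)
            if chunk_a == chunk_b:
                if not chunk_a:          # both files hit EOF together
                    return ("identical", offset)
                offset += len(chunk_a)
                continue
            # The chunks differ: look for a mismatch inside the shared part.
            common = min(len(chunk_a), len(chunk_b))
            for i in range(common):
                if chunk_a[i] != chunk_b[i]:
                    return ("differs", offset + i)
            # No mismatch in the shared part, so one file simply ended early.
            return ("truncated", offset + common)
```

Calling `compare_files("original", "truncated")` on Rob's two files would be expected to return a `"truncated"` result if the copy is a clean prefix of the source; a `"differs"` result would point at the line-ending or encoding theories instead.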
| From | Rob Schneider <rmschne@gmail.com> |
|---|---|
| Date | 2013-04-11 23:27 -0700 |
| Message-ID | <e088d429-586d-4f90-88b9-4f73a0419280@googlegroups.com> |
| In reply to | #43405 |
> I would consider the chance that the disk may be faulty, or the file
> system is corrupt. Does the problem go away if you write to a different
> file system or a different disk?

It's a relatively new MacBook Pro with a solid state disk. I've not noticed any other disk problems. I did a "repair permissions" (for what it's worth). Maybe I'll have it tested at the Genius Bar.

I don't have the full system on another computer to try that; but will work on that today.
| From | Roy Smith <roy@panix.com> |
|---|---|
| Date | 2013-04-12 10:51 -0400 |
| Message-ID | <roy-81341B.10510012042013@news.panix.com> |
| In reply to | #43405 |
In article <51674ffc$0$29977$c3e8da3$5496439d@news.astraweb.com>, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> On Thu, 11 Apr 2013 19:55:53 +0000, Neil Cerutti wrote:
>
> > On 2013-04-11, Rob Schneider <rmschne@gmail.com> wrote:
> >> Thanks. Yes, there is a close function call before the copy is
> >> launched. No other writes. Does Python wait for file close command to
> >> complete before proceeding?
> >
> > The close method is defined as flushing and closing a file, so it
> > should not return until that's done.
>
> But note that "done" in this case means "the file system thinks it is
> done", not *actually* done. Hard drives, especially the cheaper ones,
> lie. They can say the file is written when in fact the data is still in
> the hard drive's internal cache and not written to the disk platter.
> Also, in my experience, hardware RAID controllers will eat your data, and
> then your brains when you try to diagnose the problem.
>
> I would consider the chance that the disk may be faulty, or the file
> system is corrupt. Does the problem go away if you write to a different
> file system or a different disk?

It is *possible* that this is the problem, but it's really way far out on the long tail of possibilities. If the file system were corrupted or the disk faulty, the odds are you would be seeing all sorts of other problems. And this would not be anywhere near as repeatable as the OP is describing.

Think horses, not zebras.
| From | 88888 Dihedral <dihedral88888@googlemail.com> |
|---|---|
| Date | 2013-04-12 08:49 -0700 |
| Message-ID | <423f86d9-445b-4578-a845-208d003a7587@googlegroups.com> |
| In reply to | #43405 |
On Friday, April 12, 2013 at 8:06:21 AM UTC+8, Steven D'Aprano wrote:

> On Thu, 11 Apr 2013 19:55:53 +0000, Neil Cerutti wrote:
>
> > On 2013-04-11, Rob Schneider <rmschne@gmail.com> wrote:
> > >> Thanks. Yes, there is a close function call before the copy is
> > >> launched. No other writes. Does Python wait for file close command to
> > >> complete before proceeding?
> >
> > The close method is defined as flushing and closing a file, so it
> > should not return until that's done.
>
> But note that "done" in this case means "the file system thinks it is
> done", not *actually* done. Hard drives, especially the cheaper ones,
> lie. They can say the file is written when in fact the data is still in
> the hard drive's internal cache and not written to the disk platter.
> Also, in my experience, hardware RAID controllers will eat your data, and
> then your brains when you try to diagnose the problem.

Don't you model this as a non-blocking operation in your program?

> I would consider the chance that the disk may be faulty, or the file
> system is corrupt. Does the problem go away if you write to a different
> file system or a different disk?
>
> --
> Steven

Back-ups and read-back verifications are important for those who care.
| From | Nobody <nobody@nowhere.com> |
|---|---|
| Date | 2013-04-13 03:33 +0100 |
| Message-ID | <pan.2013.04.13.02.33.28.725000@nowhere.com> |
| In reply to | #43405 |
On Fri, 12 Apr 2013 00:06:21 +0000, Steven D'Aprano wrote:

>> The close method is defined as flushing and closing a file, so it
>> should not return until that's done.
>
> But note that "done" in this case means "the file system thinks it is
> done", not *actually* done. Hard drives, especially the cheaper ones,
> lie. They can say the file is written when in fact the data is still in
> the hard drive's internal cache and not written to the disk platter.
> Also, in my experience, hardware RAID controllers will eat your data, and
> then your brains when you try to diagnose the problem.

None of which is likely to be relevant here, as any subsequent access to the file will reference the in-memory copy; the disk will only get involved if the data has already been flushed from the OS' cache and has to be read back in from disk.

write(), close(), etc return once the data has been written to the OS' disk cache. At that point, the OS usually won't have even started sending the data to the drive, let alone waited for the drive to report (or claim) that the data has been written to the physical disk.

If you want to wait for the data written to be written to the physical disk (in order to obtain specific behaviour with respect to an unclean shutdown), use f.flush() followed by os.fsync(f.fileno()).

But most of the time, there's no point. If you actually care about what happens in the event of an unclean shutdown, you typically also need to sync the directory, otherwise the file's contents will get sync'd but the file's very existence might not be.
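The sequence described above (flush Python's buffer, fsync the file, then sync the containing directory) might look roughly like the sketch below. It is an illustration under stated assumptions rather than code from the thread: `durable_write` is a hypothetical helper name, and opening a directory with `os.open` in order to fsync it is assumed to be available, which holds on typical POSIX systems but not on Windows.

```python
import os


def durable_write(path, data):
    """Write bytes to path and ask the OS to push them to the drive.

    Hypothetical helper for illustration; POSIX-only because it
    opens the parent directory to fsync it.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # Python's userspace buffer -> OS cache
        os.fsync(f.fileno())      # OS cache -> storage device (as far as the OS can tell)

    # Sync the directory entry too, so the file's existence is recorded,
    # not just its contents.
    dir_path = os.path.dirname(os.path.abspath(path))
    dir_fd = os.open(dir_path, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Even with all three steps, this only requests durability from the operating system; whether the drive itself honours the request is a separate question. None of this is needed merely to have another process, or a later `shutil.copyfile`, see the complete file: a proper close is enough for that.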
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-04-13 13:05 +1000 |
| Message-ID | <mailman.543.1365822365.3114.python-list@python.org> |
| In reply to | #43496 |
On Sat, Apr 13, 2013 at 12:33 PM, Nobody <nobody@nowhere.com> wrote:

> But most of the time, there's no point. If you actually care about what
> happens in the event of an unclean shutdown, you typically also need to
> sync the directory, otherwise the file's contents will get sync'd but the
> file's very existence might not be.

Or just store your content in a PostgreSQL database, and let it worry about all the platform-specific details of how to fsync reliably.

ChrisA
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2013-04-13 03:17 +0000 |
| Subject | [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)] |
| Message-ID | <5168ce64$0$29977$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #43496 |
On Sat, 13 Apr 2013 03:33:29 +0100, Nobody wrote:

> On Fri, 12 Apr 2013 00:06:21 +0000, Steven D'Aprano wrote:
>
>>> The close method is defined as flushing and closing a file, so it
>>> should not return until that's done.
>>
>> But note that "done" in this case means "the file system thinks it is
>> done", not *actually* done. Hard drives, especially the cheaper ones,
>> lie. They can say the file is written when in fact the data is still in
>> the hard drive's internal cache and not written to the disk platter.
>> Also, in my experience, hardware RAID controllers will eat your data,
>> and then your brains when you try to diagnose the problem.
>
> None of which is likely to be relevant here,

Since we've actually identified the bug (the OP was using file.close without actually calling it), that's certainly the case :-)

[...]

> If you want to wait for the data written to be written to the physical
> disk (in order to obtain specific behaviour with respect to an unclean
> shutdown), use f.flush() followed by os.fsync(f.fileno()).

If only it were that simple. It has been documented that some disks will lie, even when told to sync. When I say "some", I mean *most*. There's probably nothing you can do about it, apart from not using that model or brand of disk, so you have to just live with the risk.

http://queue.acm.org/detail.cfm?id=2367378

USB sticks are especially nasty. I've got quite a few USB thumb drives where the "write" light keeps flickering for anything up to five minutes after the OS reports that the drive has been unmounted and is safe to unplug. I corrupted the data on these quite a few times until I noticed the light. And let's not even mention the drives that have no light at all...

But my favourite example of lying hard drives of all time is this:

http://blog.jitbit.com/2011/04/chinese-magic-drive.html

I want one of those!

--
Steven
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-04-13 13:43 +1000 |
| Subject | Re: [OT] Lying hard drives [was Re: shutil.copyfile is incomplete (truncated)] |
| Message-ID | <mailman.544.1365824646.3114.python-list@python.org> |
| In reply to | #43498 |
On Sat, Apr 13, 2013 at 1:17 PM, Steven D'Aprano <steve+comp.lang.python@pearwood.info> wrote:

> On Sat, 13 Apr 2013 03:33:29 +0100, Nobody wrote:
>> If you want to wait for the data written to be written to the physical
>> disk (in order to obtain specific behaviour with respect to an unclean
>> shutdown), use f.flush() followed by os.fsync(f.fileno()).
>
> If only it were that simple. It has been documented that some disks will
> lie, even when told to sync. When I say "some", I mean *most*. There's
> probably nothing you can do about it, apart from not using that model or
> brand of disk, so you have to just live with the risk.

It's often close to that simple. With most hard disks, you can make them 100% reliable, but you may have to check some disk parameters (on Linux, that's just a matter of writing to something in /proc somewhere, don't remember the details but it's easy to check). The worst offenders I've met are SSDs...

> USB sticks are especially nasty. I've got quite a few USB thumb drives
> where the "write" light keeps flickering for anything up to five minutes
> after the OS reports that the drive has been unmounted and is safe to
> unplug. I corrupted the data on these quite a few times until I noticed
> the light. And let's not even mention the drives that have no light at
> all...

... but you've met worse.

> But my favourite example of lying hard drives of all time is this:
>
> http://blog.jitbit.com/2011/04/chinese-magic-drive.html
>
> I want one of those!

Awesome! It's the new version of DoubleSpace / DriveSpace!

http://en.wikipedia.org/wiki/DriveSpace

(And its problems, according to that Wikipedia article, actually had the same root cause - write caching that the user wasn't aware of. Great.)

ChrisA
| From | Rob Schneider <rmschne@gmail.com> |
|---|---|
| Date | 2013-04-11 23:25 -0700 |
| Message-ID | <d457256e-a195-4c14-ac7f-0afb6b45dc15@googlegroups.com> |
| In reply to | #43402 |
> The close method is defined as flushing and closing a file, so
> it should not return until that's done.
>
> What command are you using to create the temp file?

re command to write the file:

    f=open(fn,'w')
    ... then create HTML text in a string
    f.write(html)
    f.close
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2013-04-12 17:32 +1000 |
| Message-ID | <mailman.508.1365751943.3114.python-list@python.org> |
| In reply to | #43422 |
On Fri, Apr 12, 2013 at 4:25 PM, Rob Schneider <rmschne@gmail.com> wrote:

>> The close method is defined as flushing and closing a file, so
>> it should not return until that's done.
>>
>> What command are you using to create the temp file?
>
> re command to write the file:
> f=open(fn,'w')
> ... then create HTML text in a string
> f.write(html)
> f.close

Hold it one moment... You're not actually calling close. The file's still open. Is that a copy/paste problem, or is that your actual code?

In Python, a function call ALWAYS has parentheses after it. Evaluating a function's name like that returns the function (or method) object, which you then do nothing with. (You could assign it someplace, for instance, and call it later.) Try adding empty parens:

    f.close()

and see if that solves the problem. Alternatively, look into the 'with' statement and the block syntax that it can give to I/O operations.

ChrisA
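The difference between `f.close` and `f.close()` can be seen in a small experiment. The sketch below is illustrative only and does not reproduce Rob's program: the file names, the temporary directory and the `html` test string are all assumptions. It copies a file once while it is still open (the buggy pattern) and once after a `with` block has closed it, then prints the sizes. How much data is missing from the first copy depends on the Python version and its buffer size, so no particular output is claimed. Rob's 45,056-byte copy is exactly 11 × 4096 bytes, which is consistent with (though not proof of) whole buffers having reached the file while the final partial buffer was still pending.

```python
import os
import shutil
import tempfile

html = "<p>row</p>" * 5000               # 50,000 ASCII characters, so 50,000 bytes
workdir = tempfile.mkdtemp()

# Buggy pattern from the thread: `f.close` only looks the method up.
src_buggy = os.path.join(workdir, "buggy.html")
copy_buggy = os.path.join(workdir, "buggy_copy.html")
f = open(src_buggy, "w")
f.write(html)
f.close                                   # no parentheses, so nothing is called
shutil.copyfile(src_buggy, copy_buggy)    # may run before the last buffer is flushed
buggy_size = os.path.getsize(copy_buggy)
f.close()                                 # now the file really is closed

# Fixed pattern: the with block flushes and closes the file on exit.
src_fixed = os.path.join(workdir, "fixed.html")
copy_fixed = os.path.join(workdir, "fixed_copy.html")
with open(src_fixed, "w") as f:
    f.write(html)
shutil.copyfile(src_fixed, copy_fixed)
fixed_size = os.path.getsize(copy_fixed)

print("expected:", len(html), "buggy copy:", buggy_size, "fixed copy:", fixed_size)
```

The `with` form has the further advantage that the file is closed even if `f.write` raises, so the copy step never sees a half-written file by accident.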
| From | Terry Jan Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2013-04-12 10:48 -0400 |
| Message-ID | <mailman.523.1365778206.3114.python-list@python.org> |
| In reply to | #43422 |
On 4/12/2013 3:32 AM, Chris Angelico wrote:

> On Fri, Apr 12, 2013 at 4:25 PM, Rob Schneider <rmschne@gmail.com> wrote:
>>> The close method is defined as flushing and closing a file, so
>>> it should not return until that's done.
>>>
>>> What command are you using to create the temp file?
>>
>> re command to write the file:
>> f=open(fn,'w')
>> ... then create HTML text in a string
>> f.write(html)
>> f.close
>
> Hold it one moment... You're not actually calling close. The file's
> still open. Is that a copy/paste problem, or is that your actual code?
>
> In Python, a function call ALWAYS has parentheses after it. Evaluating
> a function's name like that returns the function (or method) object,
> which you then do nothing with. (You could assign it someplace, for
> instance, and call it later.) Try adding empty parens:
>
> f.close()
>
> and see if that solves the problem. Alternatively, look into the
> 'with' statement and the block syntax that it can give to I/O
> operations.

I say *definitely* use a 'with' statement. Part of its purpose is to avoid close bugs.