| Started by | zoom <zoom@yahoo.com> |
|---|---|
| First post | 2014-03-12 13:29 +0100 |
| Last post | 2014-03-13 11:22 +1100 |
| Articles | 8 — 8 participants |
Save to a file, but avoid overwriting an existing file zoom <zoom@yahoo.com> - 2014-03-12 13:29 +0100
Re: Save to a file, but avoid overwriting an existing file Skip Montanaro <skip@pobox.com> - 2014-03-12 07:37 -0500
Re: Save to a file, but avoid overwriting an existing file Tim Chase <python.list@tim.thechases.com> - 2014-03-12 08:33 -0500
Re:Save to a file, but avoid overwriting an existing file Dave Angel <davea@davea.name> - 2014-03-12 14:22 -0400
Re: Save to a file, but avoid overwriting an existing file Emile van Sebille <emile@fenx.com> - 2014-03-12 12:38 -0700
Re: Save to a file, but avoid overwriting an existing file Cameron Simpson <cs@zip.com.au> - 2014-03-13 09:19 +1100
Re: Save to a file, but avoid overwriting an existing file Mark Lawrence <breamoreboy@yahoo.co.uk> - 2014-03-12 23:04 +0000
Re: Save to a file, but avoid overwriting an existing file Ben Finney <ben+python@benfinney.id.au> - 2014-03-13 11:22 +1100
| From | zoom <zoom@yahoo.com> |
|---|---|
| Date | 2014-03-12 13:29 +0100 |
| Subject | Save to a file, but avoid overwriting an existing file |
| Message-ID | <lfpjv9$tki$1@news1.carnet.hr> |
Hi!
I would like to ensure that when writing to a file I do not overwrite an
existing file, but I'm unsure of the best way to approach this
problem. As far as I can see, there are at least two possibilities:
1. I could use fd = os.open("x", os.O_WRONLY | os.O_CREAT | os.O_EXCL)
which will fail if the file exists. However, I would prefer that the
program try to save under a different name in this case, instead of
discarding all the calculation done until now, but I'm not too good with
catching exceptions.
2. Alternatively, a unique string could be generated to ensure that no
file with the same name exists. One approach I can see is to include the
date and time in the file name. But this seems a bit clumsy to me, and is
not unique, i.e. it could happen (at least in theory) that two processes
finish in the same second.
Any suggestions, please?
| From | Skip Montanaro <skip@pobox.com> |
|---|---|
| Date | 2014-03-12 07:37 -0500 |
| Message-ID | <mailman.8086.1394627874.18130.python-list@python.org> |
| In reply to | #68276 |
This seems to be an application-level decision. If so, in your
application, why not just check to see if the file exists, and
implement whatever workaround you deem correct for your needs? For
example (to choose a simple, but rather silly, file naming strategy):
    import os
    import random
    fname = "x"
    while os.path.exists(fname):
        fname = "%s.%f" % (fname, random.random())
    fd = open(fname, "w")
It's clearly not going to be safe from race conditions, but I leave
solving that problem as an exercise for the reader.
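One possible way to close that race, offered as a sketch rather than as the intended answer to the exercise: keep the same random-suffix naming, but let os.open with O_CREAT | O_EXCL do the existence check and the creation in a single step. The helper name open_unique is made up for illustration.

```python
import errno
import os
import random


def open_unique(base):
    """Create a new file named base, or base.<random>, and return (name, file).

    Hypothetical helper. O_CREAT | O_EXCL makes "check that it does not
    exist" and "create it" one atomic operation, so no other process can
    slip in between the two.
    """
    fname = base
    while True:
        try:
            fd = os.open(fname, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise  # e.g. missing directory or no write permission
            # Name already taken: pick another candidate and retry.
            fname = "%s.%f" % (base, random.random())
        else:
            return fname, os.fdopen(fd, "w")
```

The caller gets back the name that was actually used, plus an ordinary file object to write to and close.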
Skip
| From | Tim Chase <python.list@tim.thechases.com> |
|---|---|
| Date | 2014-03-12 08:33 -0500 |
| Message-ID | <mailman.8089.1394631173.18130.python-list@python.org> |
| In reply to | #68276 |
On 2014-03-12 13:29, zoom wrote:
> 2. Alternatively, a unique string could be generated to assure that
> no same file exists. I can see one approach to this is to include
> date and time in the file name. But this seems to me a bit clumsy,
> and is not unique, i.e. it could happen (at least in theory) that
> two processes finish in the same second.
Python offers a "tempfile" module that gives this (and a whole lot
more) to you out of the box.
-tkc
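A minimal sketch of what that could look like, assuming the results should end up in a directory of your choosing. tempfile.mkstemp creates and opens a new file under a name that does not already exist; the wrapper save_unique and its prefix/suffix defaults are invented for this example.

```python
import os
import tempfile


def save_unique(text, directory, prefix="results-", suffix=".txt"):
    """Write text to a brand-new file in directory and return its path.

    Hypothetical wrapper around tempfile.mkstemp, which picks an unused
    name and creates the file. The file is not deleted automatically.
    """
    fd, path = tempfile.mkstemp(prefix=prefix, suffix=suffix, dir=directory)
    with os.fdopen(fd, "w") as f:
        f.write(text)
    return path
```

The trade-off is that the generated name contains a random component, so it is unique but not pretty.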
| From | Dave Angel <davea@davea.name> |
|---|---|
| Date | 2014-03-12 14:22 -0400 |
| Message-ID | <mailman.8097.1394648316.18130.python-list@python.org> |
| In reply to | #68276 |
zoom <zoom@yahoo.com> wrote in message:
> Hi!
>
> I would like to ensure that when writing to a file I do not overwrite an
> existing file, but I'm unsure of the best way to approach this
> problem. As far as I can see, there are at least two possibilities:
>
> 1. I could use fd = os.open("x", os.O_WRONLY | os.O_CREAT | os.O_EXCL)
> which will fail if the file exists. However, I would prefer that the
> program try to save under a different name in this case, instead of
> discarding all the calculation done until now, but I'm not too good with
> catching exceptions.
>
The tempfile module is your best answer, but if you really need
to keep the file afterwards, you'll have the same problem when
you rename it later.
I suggest you learn about try/except. For simple cases it's not
that tough, though if you want to ask about it, you'll need to
specify your Python version.
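For a flavour of the simple case, here is a small sketch that tries one name and falls back to a second one only when the first already exists. The function create_or_fallback and both file names are hypothetical; the "except OSError as e" spelling is assumed to be available (Python 2.6 or later, and all of Python 3).

```python
import errno
import os


def create_or_fallback(name, fallback):
    """Create name exclusively; if it already exists, create fallback instead.

    Returns (name_used, file_descriptor). Any error other than "file
    exists" is re-raised, as is a failure to create the fallback.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
    try:
        return name, os.open(name, flags)
    except OSError as e:
        if e.errno != errno.EEXIST:
            raise  # a real problem, e.g. no permission: do not hide it
    return fallback, os.open(fallback, flags)
```

The try block holds only the call that may fail, and the except clause handles exactly one kind of failure, which is usually all a simple case needs.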
--
DaveA
| From | Emile van Sebille <emile@fenx.com> |
|---|---|
| Date | 2014-03-12 12:38 -0700 |
| Message-ID | <mailman.8099.1394653162.18130.python-list@python.org> |
| In reply to | #68276 |
On 3/12/2014 5:29 AM, zoom wrote:
> 2. Alternatively, a unique string could be generated to assure that no
> same file exists. I can see one approach to this is to include date and
> time in the file name. But this seems to me a bit clumsy, and is not
> unique, i.e. it could happen (at least in theory) that two processes
> finish in the same second.
I tend to use this method -- prepending the job name or targeting
different directories per job precludes duplication.
Unless you're running the same job at the same time, in which case
tempfile is the way to go (which I use for archiving spooled print files
which can occur simultaneously.)
Emile
| From | Cameron Simpson <cs@zip.com.au> |
|---|---|
| Date | 2014-03-13 09:19 +1100 |
| Message-ID | <mailman.8109.1394664190.18130.python-list@python.org> |
| In reply to | #68276 |
On 12Mar2014 13:29, zoom <zoom@yahoo.com> wrote:
> I would like to ensure that when writing to a file I do not
> overwrite an existing file, but I'm unsure of the best way to
> approach this problem. As far as I can see, there are at least two
> possibilities:
>
> 1. I could use fd = os.open("x", os.O_WRONLY | os.O_CREAT | os.O_EXCL)
> which will fail if the file exists. However, I would prefer that the
> program try to save under a different name in this case, instead
> of discarding all the calculation done until now, but I'm not too
> good with catching exceptions.
Others have mentioned tempfile, though of course you have the same collision
issue when you come to rename the temp file if you are keeping it.
I would run with option 1 for your task.
Just iterate until os.open succeeds.
However, you need to distinguish _why_ an open fails. For example,
if you were trying to make files in a directory to which you do not
have write permission, or just a directory that did not exist,
os.open would fail no matter what name you used, so your loop would
run forever.
Therefore you need to continue _only_ if you get EEXIST. Otherwise abort.
So you'd have some code like this (totally untested):
    # at top of script
    import errno
    import os
    # where you make the file
    def open_new(primary_name):
        # first choice: the name exactly as given
        try:
            fd = os.open(primary_name, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        except OSError as e:
            if e.errno != errno.EEXIST:
                raise
        else:
            return primary_name, fd
        # name taken: try primary_name.1, primary_name.2, ...
        n = 1
        while True:
            secondary_name = "%s.%d" % (primary_name, n)
            try:
                fd = os.open(secondary_name, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
            except OSError as e:
                if e.errno != errno.EEXIST:
                    raise
            else:
                return secondary_name, fd
            n += 1
    # where you need the file
    path, fd = open_new("x")
That gets you a function you can reuse which returns the file's
name and the file descriptor.
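Since os.open hands back a raw descriptor rather than a file object, one way to write to it (a sketch; the helper name write_text_to_fd is invented here) is to wrap it with os.fdopen:

```python
import os


def write_text_to_fd(fd, text):
    """Wrap a descriptor from os.open in a file object and write text to it."""
    # Closing the file object also closes the underlying descriptor.
    with os.fdopen(fd, "w") as f:
        f.write(text)
```

With the function above that would be along the lines of: path, fd = open_new("x"), then write_text_to_fd(fd, "results\n").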
Cheers,
--
Cameron Simpson <cs@zip.com.au>
Reason #173 to fear technology:
o o o o o o <o <o>
^|\ ^|^ v|^ v|v |/v |X| \| |
/\ >\ /< >\ /< >\ /< >\
o> o o o o o o o
\ x </ <|> </> <\> <)> |\
/< >\ /< >\ /< >\ >> L
Mr. email does the Macarena.
| From | Mark Lawrence <breamoreboy@yahoo.co.uk> |
|---|---|
| Date | 2014-03-12 23:04 +0000 |
| Message-ID | <mailman.8111.1394665486.18130.python-list@python.org> |
| In reply to | #68276 |
On 12/03/2014 22:19, Cameron Simpson wrote:
> [...]
> Therefore you need to continue _only_ if you get EEXIST. Otherwise abort.
> [...]
I haven't looked but would things be easier if the new exception
hierarchy were used
http://docs.python.org/3.3/whatsnew/3.3.html#pep-3151-reworking-the-os-and-io-exception-hierarchy
?
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2014-03-13 11:22 +1100 |
| Message-ID | <mailman.8115.1394670188.18130.python-list@python.org> |
| In reply to | #68276 |
Cameron Simpson <cs@zip.com.au> writes:
> Therefore you need to continue _only_ if you get EEXIST. Otherwise
> abort.
If you target Python 3.3 or later, you can catch “FileExistsError”
<URL:http://docs.python.org/3/library/exceptions.html#FileExistsError>
which is far simpler than messing around with ‘errno’ values.
-- 
 \     “I know you believe you understood what you think I said, but I |
  `\    am not sure you realize that what you heard is not what I |
_o__)   meant.” - Robert J. McCloskey |
Ben Finney
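As a rough illustration of that suggestion (a sketch assuming Python 3.3 or later, not a tested drop-in), the numbered-suffix loop from earlier in the thread could be written with the "x" (exclusive creation) mode of the built-in open and FileExistsError. The function name open_new is reused from that earlier post; here it returns a file object instead of a descriptor.

```python
import itertools


def open_new(primary_name):
    """Return (name, file object) for a newly created file (Python 3.3+).

    Mode "x" means exclusive creation: open() raises FileExistsError if
    the name is taken. Other errors (permissions, missing directory)
    are not caught and propagate to the caller.
    """
    candidates = itertools.chain(
        [primary_name],
        ("%s.%d" % (primary_name, n) for n in itertools.count(1)),
    )
    for name in candidates:
        try:
            return name, open(name, "x")
        except FileExistsError:
            continue  # taken, move on to the next candidate name
```

Because only FileExistsError is caught, a bad directory or a permission problem ends the loop immediately instead of spinning forever.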