Groups > comp.lang.python > #102925 > unrolled thread
| Started by | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| First post | 2016-02-15 11:08 +1100 |
| Last post | 2016-02-14 20:48 -0800 |
| Articles | 14 — 9 participants |
This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by
below is the oldest one visible, not the original post.
Re: Make a unique filesystem path, without creating the file Ben Finney <ben+python@benfinney.id.au> - 2016-02-15 11:08 +1100
Re: Make a unique filesystem path, without creating the file Dan Sommers <dan@tombstonezero.net> - 2016-02-15 01:07 +0000
Re: Make a unique filesystem path, without creating the file Ben Finney <ben+python@benfinney.id.au> - 2016-02-15 12:19 +1100
Re: Make a unique filesystem path, without creating the file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-02-15 15:54 +1100
Re: Make a unique filesystem path, without creating the file Ben Finney <ben+python@benfinney.id.au> - 2016-02-15 16:25 +1100
Re: Make a unique filesystem path, without creating the file Rick Johnson <rantingrickjohnson@gmail.com> - 2016-02-15 18:26 -0800
Re: Make a unique filesystem path, without creating the file Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2016-02-15 21:00 +1300
Re: Make a unique filesystem path, without creating the file Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2016-02-16 01:18 +0100
Re: Make a unique filesystem path, without creating the file Grant Edwards <invalid@invalid.invalid> - 2016-02-15 15:49 +0000
Re: Make a unique filesystem path, without creating the file Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2016-02-15 15:06 +1100
Re: Make a unique filesystem path, without creating the file Ben Finney <ben+python@benfinney.id.au> - 2016-02-15 15:28 +1100
Re: Make a unique filesystem path, without creating the file Gregory Ewing <greg.ewing@canterbury.ac.nz> - 2016-02-15 21:11 +1300
Re: Make a unique filesystem path, without creating the file Nobody <nobody@nowhere.invalid> - 2016-02-16 02:14 +0000
Re: Make a unique filesystem path, without creating the file "Martin A. Brown" <martin@linux-ip.net> - 2016-02-14 20:48 -0800
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-02-15 11:08 +1100 |
| Subject | Re: Make a unique filesystem path, without creating the file |
| Message-ID | <mailman.121.1455494940.22075.python-list@python.org> |
Matt Wheeler <m@funkyhat.org> writes:

> On 14 Feb 2016 21:46, "Ben Finney" <ben+python@benfinney.id.au> wrote:
> > What standard library function should I be using to generate
> > ‘tempfile.mktemp’-like unique paths, and *not* ever create a real
> > file by that path?
>
> Could you use tempfile.TemporaryDirectory and then just use a
> consistent name within that directory.

That fails because it touches the filesystem. I want to avoid using a
real file or a real directory.

> It's guaranteed not to exist

I am unconcerned with whether there is a real filesystem entry of that
name; the goal entails having no filesystem activity for this. I want a
valid unique filesystem path, without touching the filesystem.

--
 \      “I believe our future depends powerfully on how well we |
  `\    understand this cosmos, in which we float like a mote of dust |
_o__)   in the morning sky.” —Carl Sagan, _Cosmos_, 1980 |
Ben Finney
| From | Dan Sommers <dan@tombstonezero.net> |
|---|---|
| Date | 2016-02-15 01:07 +0000 |
| Message-ID | <n9r8bk$evf$1@dont-email.me> |
| In reply to | #102925 |
On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:

> I am unconcerned with whether there is a real filesystem entry of that
> name; the goal entails having no filesystem activity for this. I want
> a valid unique filesystem path, without touching the filesystem.

That's an odd use case.

If it's really just one valid filesystem path (your original post said
*paths*, plural), then how about __file__? or os.__file__?
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-02-15 12:19 +1100 |
| Message-ID | <mailman.126.1455499198.22075.python-list@python.org> |
| In reply to | #102931 |
Dan Sommers <dan@tombstonezero.net> writes:

> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>
> > I am unconcerned with whether there is a real filesystem entry of
> > that name; the goal entails having no filesystem activity for this.
> > I want a valid unique filesystem path, without touching the
> > filesystem.
>
> That's an odd use case.

It's very common to want filesystem paths divorced from accessing a
filesystem entry.

For example: test paths in a unit test. Filesystem access is orders of
magnitude slower than accessing fake files in memory only, it is more
complex and prone to irrelevant failures. So in such a test case
filesystem access should be avoided as unnecessary.

> If it's really just one valid filesystem path (your original post said
> *paths*, plural), then how about __file__? or os.__file__?

One valid filesystem path each time it's accessed. That is, behaviour
equivalent to ‘tempfile.mktemp’.

My question is because the standard library clearly has this useful
functionality implemented, but simultaneously warns strongly against its
use.

I'm looking for how to get at that functionality in a non-deprecated
way, without re-implementing it myself.

--
 \      “The most common way people give up their power is by thinking |
  `\    they don't have any.” —Alice Walker |
_o__)   |
Ben Finney
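
The unit-test use case described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `count_lines` is a hypothetical function under test and the path is made up; `unittest.mock.mock_open` is the standard-library helper that stands in for the real `open`, so nothing is read from or written to disk.

```python
from unittest import mock


def count_lines(path):
    """Hypothetical function under test: count the lines in a text file."""
    with open(path) as f:
        return len(f.read().splitlines())


# A made-up path; nothing is created or looked up on disk.
fake_path = "/nonexistent/example.txt"

# mock_open() supplies an in-memory stand-in for the file object.
with mock.patch("builtins.open", mock.mock_open(read_data="a\nb\n")) as fake_open:
    result = count_lines(fake_path)

# The code under test received exactly the path string we handed it.
fake_open.assert_called_once_with(fake_path)
```

In a test like this the path only needs to be a plausible string that differs from test to test; whether it exists on disk never comes into play.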
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2016-02-15 15:54 +1100 |
| Message-ID | <56c15a25$0$1622$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #102932 |
On Monday 15 February 2016 12:19, Ben Finney wrote:

> One valid filesystem path each time it's accessed. That is, behaviour
> equivalent to ‘tempfile.mktemp’.
>
> My question is because the standard library clearly has this useful
> functionality implemented, but simultaneously warns strongly against its
> use.

If you can absolutely guarantee that this string will never actually be
used on a real filesystem, then go right ahead and use it. There's nothing
wrong with (for instance) calling mktemp to generate *strings* that merely
*look* like pathnames.

If you want to guarantee that these faux pathnames can't leak out of your
test suite and touch the file system, prepend an ASCII NUL to them. That
will make it an illegal path on all file systems that I'm aware of.

> I'm looking for how to get at that functionality in a non-deprecated
> way, without re-implementing it myself.

You probably can't, not if you want to future-proof your code against the
day when tempfile.mktemp is removed.

But you can simply fork that module, delete all the irrelevant bits, and
make the mktemp function a private utility in your test suite.

--
Steve
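
A minimal sketch of the NUL-prefix idea, assuming CPython 3: the helper name `poison` is hypothetical, and the exact exception type is an assumption about the interpreter (current CPython 3 raises `ValueError` for an embedded null; older interpreters raised `TypeError`).

```python
def poison(path):
    """Prefix a NUL character so the string cannot be used as a real path."""
    return "\0" + path


faux = poison("/tmp/example.txt")

try:
    open(faux)
except (ValueError, TypeError):
    # The interpreter rejects the embedded NUL before any file is opened.
    rejected = True
else:
    rejected = False
```

The string still looks like a path to code that only compares or logs it, but any accidental attempt to open it fails immediately.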
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-02-15 16:25 +1100 |
| Message-ID | <mailman.129.1455513940.22075.python-list@python.org> |
| In reply to | #102938 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:

> If you can absolutely guarantee that this string will never actually
> be used on a real filesystem, then go right ahead and use it.

I'm giving advice in examples in documentation. It's not enough to have
some private usage that I know is good, I am looking for a standard API
that when the reader looks it up will not be laden with big scary
warnings.

Currently I can write about the public API ‘tempfile.mktemp’ in
documentation, but the conscientious reader will be correct to have
concerns when the examples I give are sternly deprecated in the standard
library documentation.

Or I can write about the private API ‘tempfile._RandomNameSequence’ in
the documentation, and the conscientious reader will be correct to have
concerns about use of an undocumented private-use API.

I'm looking for a way to give examples that use that standard library
functionality, with an API that is both public and not discouraged.

> > I'm looking for how to get at that functionality in a non-deprecated
> > way, without re-implementing it myself.
>
> You probably can't, not if you want to future-proof your code against
> the day when tempfile.mktemp is removed.

That's disappointing. It is already implemented and well-tested, it is
useful as is. Forking and duplicating it is poor practice if it can
simply be used in a standard place.

I have reported <URL:https://bugs.python.org/issue26362> for this
request.

--
 \      “Nothing worth saying is inoffensive to everyone. Nothing worth |
  `\    saying will fail to make you enemies. And nothing worth saying |
_o__)   will not produce a confrontation.” —Johann Hari, 2011 |
Ben Finney
| From | Rick Johnson <rantingrickjohnson@gmail.com> |
|---|---|
| Date | 2016-02-15 18:26 -0800 |
| Message-ID | <e60f2e38-8ab2-45ce-ab36-d76f11cb5a80@googlegroups.com> |
| In reply to | #102938 |
On Sunday, February 14, 2016 at 10:55:11 PM UTC-6, Steven D'Aprano wrote:

> If you want to guarantee that these faux pathnames can't
> leak out of your test suite and touch the file system,
> prepend an ASCII NUL to them. That will make it an illegal
> path on all file systems that I'm aware of.

Hmm, the unfounded fears in this thread are beginning to
remind me of a famous Black Sabbath song.

Finished with "py tempfile", 'cause it,
couldn't help to, ease my mind.
People think i'm insane, because,
i want "faux paths", all the time.
All day long i think of ways,
but nothing seems to, satisfy.
Think i'll lose my mind, if i don't,
find a py-module to, pacify.
CAN YOU HELP ME?
MAKE "FAUX PATHS" TODAAAAY,
OH YEAH...
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2016-02-15 21:00 +1300 |
| Message-ID | <didet7Ft04gU1@mid.individual.net> |
| In reply to | #102932 |
Ben Finney wrote:

> One valid filesystem path each time it's accessed. That is, behaviour
> equivalent to ‘tempfile.mktemp’.
>
> My question is because the standard library clearly has this useful
> functionality implemented, but simultaneously warns strongly against its
> use.

But it *doesn't*, if your requirement is truly to not touch the
filesystem at all, because tempfile.mktemp() *reads* the file system to
make sure the name it's returning isn't in use.

What's more, because you're *not* creating the file, mktemp() would be
within its rights to return the same file name the second time you call
it.

If you want something that really doesn't go near the file system and/or
is guaranteed to produce multiple different non-existing file names,
you'll have to write it yourself.

--
Greg
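
One way to "write it yourself", shown here as a rough sketch rather than a recommendation: a generator that numbers its names, so every value it yields is different and the filesystem is never consulted. The function name, the default directory and the name format are all assumptions made for illustration.

```python
import itertools
import posixpath


def fake_paths(directory="/fake", prefix="tmp", suffix=""):
    """Yield distinct path strings; never consults the filesystem."""
    for n in itertools.count():
        # The counter guarantees that no two yielded names are equal.
        yield posixpath.join(directory, "{}{:06d}{}".format(prefix, n, suffix))


paths = fake_paths(suffix=".txt")
first = next(paths)
second = next(paths)
```

The guarantee only covers names drawn from the same generator object; two separate generators with the same arguments would repeat each other.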
| From | Thomas 'PointedEars' Lahn <PointedEars@web.de> |
|---|---|
| Date | 2016-02-16 01:18 +0100 |
| Message-ID | <2015485.VjBY4A5gp9@PointedEars.de> |
| In reply to | #102947 |
Gregory Ewing wrote:

> Ben Finney wrote:
>> One valid filesystem path each time it's accessed. That is, behaviour
>> equivalent to ‘tempfile.mktemp’.
>>
>> My question is because the standard library clearly has this useful
>> functionality implemented, but simultaneously warns strongly against its
>> use.
>
> But it *doesn't*,

Yes, it does.

> if your requirement is truly to not touch the filesystem at all, because
> tempfile.mktemp() *reads* the file system to make sure the name it's
> returning isn't in use.

But there is a race condition occurring between the moment that the
filesystem has been read and is being written to by another user. Hence
the deprecation in favor of tempfile.mkstemp() which also *creates* the
file instead, and the warning about the security hole if tempfile.mktemp()
is used anyway.

You can use tempfile.mktemp() only as long as it is irrelevant if a file
with that name already exists, or exists later but was not created by you.

--
PointedEars

Twitter: @PointedEars2
Please do not cc me. / Bitte keine Kopien per E-Mail.
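
For contrast, a short sketch of the tempfile.mkstemp() pattern mentioned above. Unlike what Ben is asking for, this does touch the filesystem: the file is created and opened in a single step, which is what closes the race window. The suffix and the data written are arbitrary choices for the example.

```python
import os
import tempfile

# mkstemp() creates the file and opens it in one step, so no other
# process can claim the name in between.
fd, path = tempfile.mkstemp(suffix=".txt")
try:
    with os.fdopen(fd, "w") as f:
        f.write("scratch data\n")
    existed = os.path.exists(path)
finally:
    os.remove(path)  # mkstemp() leaves cleanup to the caller
```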
| From | Grant Edwards <invalid@invalid.invalid> |
|---|---|
| Date | 2016-02-15 15:49 +0000 |
| Message-ID | <n9ss2g$51v$2@reader1.panix.com> |
| In reply to | #102932 |
On 2016-02-15, Ben Finney <ben+python@benfinney.id.au> wrote:
> Dan Sommers <dan@tombstonezero.net> writes:
>
>> On Mon, 15 Feb 2016 11:08:52 +1100, Ben Finney wrote:
>>
>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> That's an odd use case.
>
> It's very common to want filesystem paths divorced from accessing a
> filesystem entry.
If the filesystem paths are not associated with a filesystem, what do
you mean by "unique"? You want to make sure that path <whatever>
which doesn't exist in some filesystem is different from all other
paths that don't exist in some filesystem?
> For example: test paths in a unit test. Filesystem access is orders
> of magnitude slower than accessing fake files in memory only,
How is "fake files in memory" not a filesystem?
--
Grant Edwards               grant.b.edwards        Yow! The Korean War must
                                  at               have been fun.
                              gmail.com
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2016-02-15 15:06 +1100 |
| Message-ID | <56c14ed7$0$11089$c3e8da3@news.astraweb.com> |
| In reply to | #102925 |
On Monday 15 February 2016 11:08, Ben Finney wrote:
> I am unconcerned with whether there is a real filesystem entry of that
> name; the goal entails having no filesystem activity for this. I want a
> valid unique filesystem path, without touching the filesystem.
Your phrasing is ambiguous.
If you are unconcerned whether or not a file of that name exists, then just
pick a name and use that:
    unique_path = "/tmp/foo"
is guaranteed to be valid on POSIX systems and unique, and it may or may not
exist.
If you actually do care that /tmp/foo *doesn't* exist, then you have a
problem: whatever name you pick *now* may no longer "not exist" a
millisecond later. In general there's no way to create a valid pathname
which doesn't exist *now* and is guaranteed to continue to not exist unless
you touch the file system.
But if you explain in more detail why you want this filename, perhaps we can
come up with some ideas that will help.
--
Steve
| From | Ben Finney <ben+python@benfinney.id.au> |
|---|---|
| Date | 2016-02-15 15:28 +1100 |
| Message-ID | <mailman.127.1455510515.22075.python-list@python.org> |
| In reply to | #102935 |
Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:

> On Monday 15 February 2016 11:08, Ben Finney wrote:
>
> > I am unconcerned with whether there is a real filesystem entry of
> > that name; the goal entails having no filesystem activity for this.
> > I want a valid unique filesystem path, without touching the
> > filesystem.
>
> Your phrasing is ambiguous.

The existing behaviour of ‘tempfile.mktemp’ – actually of its internal
class ‘tempfile._RandomNameSequence’ – is to generate unpredictable,
unique, valid filesystem paths that are different each time.

That's the behaviour I want, in a public API that exposes what
‘tempfile’ already has implemented, documented in a way that doesn't
create a scare about security.

> But if you explain in more detail why you want this filename, perhaps
> we can come up with some ideas that will help.

The behaviour is already implemented in the standard library. What I'm
looking for is a way to use it (not re-implement it) that is public API
and isn't scolded by the library documentation.

--
 \      “Try adding “as long as you don't breach the terms of service – |
  `\    according to our sole judgement” to the end of any cloud |
_o__)   computing pitch.” —Simon Phipps, 2010-12-11 |
Ben Finney
| From | Gregory Ewing <greg.ewing@canterbury.ac.nz> |
|---|---|
| Date | 2016-02-15 21:11 +1300 |
| Message-ID | <didfgnFt6ivU1@mid.individual.net> |
| In reply to | #102936 |
Ben Finney wrote:
> The existing behaviour of ‘tempfile.mktemp’ – actually of its internal
> class ‘tempfile._RandomNameSequence’ – is to generate unpredictable,
> unique, valid filesystem paths that are different each time.
But that's not documented behaviour, so even if mktemp()
weren't marked as deprecated, you'd still be relying on
undocumented and potentially changeable behaviour.
> What I'm
> looking for is a way to use it (not re-implement it) that is public API
> and isn't scolded by the library documentation.
Then you're looking for something that doesn't exist,
I'm sorry to say, and it's unlikely you'll persuade
anyone to make it exist.
If you want to leverage stdlib functionality for this,
I'd suggest something along the lines of:
    import os
    import uuid

    def fakefilename(dir, ext):
        return os.path.join(dir, str(uuid.uuid4())) + ext
--
Greg
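
A usage sketch of the suggestion above, repeated here so the block runs on its own. One assumption differs from the post: `posixpath` is used in place of `os.path` so that the separator is "/" on every platform; the directory and extension are made up.

```python
import posixpath
import uuid


def fakefilename(dir, ext):
    # posixpath keeps the separator "/" on every platform (an assumption
    # made here for predictable results; the post uses os.path).
    return posixpath.join(dir, str(uuid.uuid4())) + ext


a = fakefilename("/fake/dir", ".txt")
b = fakefilename("/fake/dir", ".txt")
```

Each call produces a new random UUID, so `a` and `b` differ for all practical purposes, and neither call looks at the filesystem.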
| From | Nobody <nobody@nowhere.invalid> |
|---|---|
| Date | 2016-02-16 02:14 +0000 |
| Message-ID | <pan.2016.02.16.02.14.08.635000@nowhere.invalid> |
| In reply to | #102936 |
On Mon, 15 Feb 2016 15:28:27 +1100, Ben Finney wrote:

> The behaviour is already implemented in the standard library. What I'm
> looking for is a way to use it (not re-implement it) that is public API
> and isn't scolded by the library documentation.

So, basically you want (essentially) the exact behaviour of
tempfile.mktemp(), except without any mention of the (genuine) risks that
such a function presents?

I suspect that you'll have to settle for either

a) using that function and simply documenting the reasons why it isn't an
   issue in this particular case, or
b) re-implementing it (so that you can choose to avoid mentioning the
   issue in its documentation).

At the outside, you *might* have a third option:

c) persuade the maintainers to tweak the documentation to further clarify
   that the risk arises from creating a file with the returned name, not
   from simply calling the function. But actually it's already fairly
   clear if you actually read it.

If it's the bold-face "Warning:" and the red background that you don't
like, I wouldn't expect those to go away either for mktemp() or for any
other function with similar behaviour (i.e. something which someone
*might* try to use to actually create temporary files). The simple fact
that it might get used that way is enough to warrant a prominent warning.
| From | "Martin A. Brown" <martin@linux-ip.net> |
|---|---|
| Date | 2016-02-14 20:48 -0800 |
| Message-ID | <mailman.128.1455511750.22075.python-list@python.org> |
| In reply to | #102935 |
Good evening/morning Ben,
>> > I am unconcerned with whether there is a real filesystem entry of
>> > that name; the goal entails having no filesystem activity for this.
>> > I want a valid unique filesystem path, without touching the
>> > filesystem.
>>
>> Your phrasing is ambiguous.
>
>The existing behaviour of ‘tempfile.mktemp’ – actually of its
>internal class ‘tempfile._RandomNameSequence’ – is to generate
>unpredictable, unique, valid filesystem paths that are different
>each time.
>
>That's the behaviour I want, in a public API that exposes what
>‘tempfile’ already has implemented, documented in a way that
>doesn't create a scare about security.
If your code is not actually touching the filesystem, then it will
not be affected by the race condition identified in the
tempfile.mktemp() warning anyway. So, I'm unsure of your worry.
>> But if you explain in more detail why you want this filename, perhaps
>> we can come up with some ideas that will help.
>
>The behaviour is already implemented in the standard library. What
>I'm looking for is a way to use it (not re-implement it) that is
>public API and isn't scolded by the library documentation.
I might also suggest the (bound) method _create_tmp() on class
mailbox.Maildir, which achieves roughly the same goals, but for a
permanent file.
Of course, that particular method also touches the filesystem. The
Maildir naming approach is based on the assumptions* that time is
monotonically increasing, that system nodes never share the same
name and that you don't need more than 1 uniquely named file per
directory per millisecond.
If so, then you can use the 9 or 10 lines of that method.
Good luck,
-Martin
* I was tempted to joke about these two guarantees, but I think
that undermines my basic message. To wit, you can probably rely
on this naming technique about as much as you can rely on your
system clock. I'll assume that you aren't naming all of your
nodes 'franklin.p.gundersnip'.
--
Martin A. Brown
http://linux-ip.net/
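
The naming approach described in this post (clock, process, per-process counter and host name) might look roughly like the sketch below. It is loosely modelled on that description and is not a copy of mailbox.Maildir._create_tmp(); the function name, the format string and the microsecond resolution are assumptions. It builds a string only and creates no file.

```python
import itertools
import os
import socket
import time

_counter = itertools.count(1)  # per-process sequence number


def maildir_style_name():
    """Build a Maildir-like unique name from clock, PID, counter and host."""
    now = time.time()
    # Escape characters that would be awkward inside a file name.
    host = socket.gethostname().replace("/", r"\057").replace(":", r"\072")
    return "{}.M{}P{}Q{}.{}".format(
        int(now), int(now % 1 * 1e6), os.getpid(), next(_counter), host)


name_a = maildir_style_name()
name_b = maildir_style_name()
```

As the post notes, the result is only as trustworthy as the system clock and the uniqueness of host names; the counter covers repeated calls within one process.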