| Started by | ruck <john.ruckstuhl@gmail.com> |
|---|---|
| First post | 2012-09-10 10:25 -0700 |
| Last post | 2012-11-09 16:42 -0800 |
| Articles | 11 — 7 participants |
how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-10 10:25 -0700
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-10 20:16 +0000
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-10 15:22 -0700
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-11 03:46 +0000
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Tim Golden <mail@timgolden.me.uk> - 2012-09-11 08:20 +0100
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-11 12:13 -0700
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Chris Angelico <rosuav@gmail.com> - 2012-09-12 08:50 +1000
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Dave Angel <d@davea.name> - 2012-09-11 18:57 -0400
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-11 12:13 -0700
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2012-09-12 08:17 +0200
Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py aahz@pythoncraft.com (Aahz) - 2012-11-09 16:42 -0800
| From | ruck <john.ruckstuhl@gmail.com> |
|---|---|
| Date | 2012-09-10 10:25 -0700 |
| Subject | how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py |
| Message-ID | <a4a2e63b-d25d-4487-bff8-67d4b7c40cbc@googlegroups.com> |
In Python 2.7.2 on Windows 7,
os.walk() uses isdir(),
which comes from os.path,
which really comes from ntpath.py,
which really comes from genericpath.py
I want os.walk() to use a modified isdir() on my Windows 7.
Not knowing any better, it seems to me like ntpath.py would be a good place to intercept.
When os.py does "import ntpath as path",
how can I get python to process my customized ntpath.py
instead of Lib/ntpath.py ?
Thanks for any comments.
John
BTW, here's my mod to ntpath.py:
$ diff ntpath.py.standard ntpath.py
14c14,19
< from genericpath import *
---
> from genericpath import *
>
> def isdir(s):
>     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
> def isfile(s):
>     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))
Why? Because the genericpath implementation relies on os.stat(), which
uses a Windows API function that presumes or enforces some naming
conventions, like "doesn't end with a space or a period".
But NTFS actually supports such file and directory names, and some software
(like cygwin) lets users create them without restriction.
So cygwin users like me may have a file 'voo...\\doo' which os.walk()
cannot ordinarily walk. That is, isdir('voo...') returns False
because the underlying os.stat() is assessing 'voo' instead of 'voo...'.
The workaround is to pass os.stat() a full pathname that is prefixed
with r'\\?\' so the Windows API recognizes that you do NOT want the
name filtered.
Better said by Microsoft:
"For file I/O, the "\\?\" prefix to a path string tells
the Windows APIs to disable all string parsing and to
send the string that follows it straight to the file
system. For example, if the file system supports large
paths and file names, you can exceed the MAX_PATH limits
that are otherwise enforced by the Windows APIs."
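A minimal sketch of that prefixing step, assuming the caller already holds an absolute, backslash-separated Windows path. The helper name `to_extended_path` is hypothetical (it is not part of ntpath or os.path). It only manipulates strings, so it does not check that the path exists, and device paths such as `\\.\` are not handled.

```python
EXTENDED_PREFIX = u"\\\\?\\"        # the four characters \\?\
UNC_PREFIX = u"\\\\?\\UNC\\"        # \\?\UNC\ is the form used for \\server\share paths


def to_extended_path(abs_path):
    """Return abs_path with the extended-length prefix, adding it only if absent.

    Hypothetical helper: abs_path is assumed to be absolute already.
    """
    if abs_path.startswith(EXTENDED_PREFIX):
        return abs_path                      # already prefixed, leave untouched
    if abs_path.startswith(u"\\\\"):
        # \\server\share\dir  ->  \\?\UNC\server\share\dir
        return UNC_PREFIX + abs_path[2:]
    return EXTENDED_PREFIX + abs_path        # C:\dir  ->  \\?\C:\dir


print(to_extended_path(u"C:\\goo\\voo..."))
```

The literals carry the `u` prefix because, as the later replies in this thread point out, Python 2.7 only honors the prefix when the path is a unicode object.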
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-09-10 20:16 +0000 |
| Message-ID | <504e4a8d$0$29981$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #28836 |
On Mon, 10 Sep 2012 10:25:29 -0700, ruck wrote:
> In Python 2.7.2 on Windows 7,
>
> os.walk() uses isdir(),
> which comes from os.path,
> which really comes from ntpath.py,
> which really comes from genericpath.py
>
> I want os.walk() to use a modified isdir() on my Windows 7. Not knowing
> any better, it seems to me like ntpath.py would be a good place to
> intercept.
>
> When os.py does "import ntpath as path", how can I get python to process
> my customized ntpath.py instead of Lib/ntpath.py ?
import os
os.path.isdir = my_isdir
ought to do it.
This general technique is called "monkey-patching". The Ruby community is
addicted to it. Everybody else -- and a goodly number of the more
sensible Ruby crowd -- consider it a risky, dirty hack that 99 times out
of 100 will lead to blindness, moral degeneracy and subtle, hard-to-fix
bugs.
They are right to be suspicious of it. As a general rule, monkey-patching
is not for production code. You have been warned.
http://www.codinghorror.com/blog/2008/07/monkeypatching-for-humans.html
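One way to limit the risk is to keep a reference to the original function and put it back when done. The following is only a sketch: `my_isdir` here is a hypothetical stand-in that always answers True, and the path name is made up and assumed not to exist.

```python
import os


def my_isdir(path):
    # Hypothetical replacement; a real one would apply the \\?\ prefix.
    return True


original_isdir = os.path.isdir           # keep a handle on the real function
os.path.isdir = my_isdir                 # the monkey-patch itself
try:
    # Any code that looks up os.path.isdir now gets my_isdir.
    patched_result = os.path.isdir("no_such_dir_for_this_sketch")
finally:
    os.path.isdir = original_isdir       # always undo the patch

restored_result = os.path.isdir("no_such_dir_for_this_sketch")
print(patched_result)
print(restored_result)
```

This relies on the caller looking up `os.path.isdir` at call time, as os.walk() does in Python 2.7. On Python 3.5 and later os.walk() is built on os.scandir(), so this particular patch would not change its behavior there.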
[...]
> Why? Because the genericpath implementation relies on os.stat() which
> uses Windows API function that presumes or enforces some naming
> conventions like "doesn't end with a space or a period". But the NTFS
> actually supports such filenames and dirnames, and some sw (like cygwin)
> lets users make files & dirs without restricting. So, cygwin users like
> me may have file 'voo...\\doo' which os.walk() cannot ordinarily walk.
> That is, the isdir('voo...') returns false because the underlying
> os.stat is assessing 'voo' instead of 'voo...' .
Please consider submitting a patch that adds support for cygwin paths to
the standard library. You'll need to target 3.4 though, 2.7 is now a
maintenance release with no new features allowed.
> The workaround is to
> pass os.stat a fullpathname that is prefixed with r'\\?\' so the Windows
> API recognizes that you do NOT want the name filtered.
>
> Better said by Microsoft:
> "For file I/O, the "\\?\" prefix to a path string tells the Windows APIs
> to disable all string parsing and to send the string that follows it
> straight to the file system.
That's not so much a workaround as the officially supported API for
dealing with the situation you are in. Why don't you just prepend a '?'
to paths like they tell you to?
--
Steven
| From | ruck <john.ruckstuhl@gmail.com> |
|---|---|
| Date | 2012-09-10 15:22 -0700 |
| Message-ID | <83af64e3-bc26-4217-8afa-e4f6d45b604d@googlegroups.com> |
| In reply to | #28854 |
On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
> On Mon, 10 Sep 2012 10:25:29 -0700, ruck wrote:
>
> > In Python 2.7.2 on Windows 7,
> >
> > os.walk() uses isdir(),
> > which comes from os.path,
> > which really comes from ntpath.py,
> > which really comes from genericpath.py
> >
> > I want os.walk() to use a modified isdir() on my Windows 7. Not knowing
> > any better, it seems to me like ntpath.py would be a good place to
> > intercept.
> >
> > When os.py does "import ntpath as path", how can I get python to process
> > my customized ntpath.py instead of Lib/ntpath.py ?
>
> import os
> os.path.isdir = my_isdir
>
> ought to do it.
>
> This general technique is called "monkey-patching". The Ruby community is
> addicted to it. Everybody else -- and a goodly number of the more
> sensible Ruby crowd -- consider it a risky, dirty hack that 99 times out
> of 100 will lead to blindness, moral degeneracy and subtle, hard-to-fix
> bugs.
>
> They are right to be suspicious of it. As a general rule, monkey-patching
> is not for production code. You have been warned.
>
> http://www.codinghorror.com/blog/2008/07/monkeypatching-for-humans.html
>
> [...]
>
> > Why? Because the genericpath implementation relies on os.stat() which
> > uses Windows API function that presumes or enforces some naming
> > conventions like "doesn't end with a space or a period". But the NTFS
> > actually supports such filenames and dirnames, and some sw (like cygwin)
> > lets users make files & dirs without restricting. So, cygwin users like
> > me may have file 'voo...\\doo' which os.walk() cannot ordinarily walk.
> > That is, the isdir('voo...') returns false because the underlying
> > os.stat is assessing 'voo' instead of 'voo...' .
>
> Please consider submitting a patch that adds support for cygwin paths to
> the standard library. You'll need to target 3.4 though, 2.7 is now a
> maintenance release with no new features allowed.
>
> > The workaround is to
> > pass os.stat a fullpathname that is prefixed with r'\\?\' so the Windows
> > API recognizes that you do NOT want the name filtered.
> >
> > Better said by Microsoft:
> > "For file I/O, the "\\?\" prefix to a path string tells the Windows APIs
> > to disable all string parsing and to send the string that follows it
> > straight to the file system.
>
> That's not so much a workaround as the officially supported API for
> dealing with the situation you are in. Why don't you just prepend a '?'
> to paths like they tell you to?
>
> --
> Steven
Steven says:
That's not so much a workaround as the officially supported API for
dealing with the situation you are in. Why don't you just prepend a '?'
to paths like they tell you to?
Good idea, but the first thing os.walk() does is a listdir(), and os.listdir() does not like the r'\\?\' prefix. In other words,
os.walk(r'\\?\C:Users\john\Desktop\sandbox\goo')
does not work.
Also, your recipe worked for me --
I'm walking 'goo' which contains 'voo.../doo'
import os
import genericpath

def my_isdir(s):
    # Prefix the absolute path with \\?\ so the Windows API does not filter the name.
    return genericpath.isdir('\\\\?\\' + os.path.abspath(s + '\\'))

print 'os.walk(\'goo\') with standard isdir()'
for root, dirs, files in os.walk('goo'):
    print root, dirs, files

print 'os.walk(\'goo\') with modified isdir()'
os.path.isdir = my_isdir
for root, dirs, files in os.walk('goo'):
    print root, dirs, files
yields
os.walk('goo') with standard isdir()
goo [] ['voo...']
os.walk('goo') with modified isdir()
goo ['voo...'] []
goo\voo... [] ['doo']
About monkeypatching, generally -- thanks for the pointer to that discussion. That sounded like a lot of wisdom and lessons learned being shared.
About me suggesting a patch -- I'll sleep on that :)
Thanks Steven!
John
| From | Steven D'Aprano <steve+comp.lang.python@pearwood.info> |
|---|---|
| Date | 2012-09-11 03:46 +0000 |
| Message-ID | <504eb3fc$0$29890$c3e8da3$5496439d@news.astraweb.com> |
| In reply to | #28862 |
On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:
> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
[...]
> > That's not so much a workaround as the officially supported API for
> > dealing with the situation you are in. Why don't you just prepend a
> > '?' to paths like they tell you to?
>
> Good idea, but the first thing os.walk() does is a listdir(), and
> os.listdir() does not like the r'\\?\' prefix. In other words,
> os.walk(r'\\?\C:Users\john\Desktop\sandbox\goo') does not work.
Now that sounds like a bug to me. If Microsoft officially support leading
? in file names, then so should Python on Windows.
> Also, your recipe worked for me --
> I'm walking 'goo' which contains 'voo.../doo'
Good for you. (Sorry, that comes across as more condescending than it is
intended as.) Monkey-patching often gets used for quick scripts and tiny
pieces of code because it works.
Just beware that if you extend that technique to larger bodies of code,
say when using a large framework, or multiple libraries, your experience
may not be quite so good. Especially if *they* are monkey-patching too,
as some very large frameworks sometimes do. (Or so I am led to believe.)
The point is not that monkey-patching is dangerous and should never be
used, but that it is risky and should be used with caution.
--
Steven
| From | Tim Golden <mail@timgolden.me.uk> |
|---|---|
| Date | 2012-09-11 08:20 +0100 |
| Message-ID | <mailman.489.1347348083.27098.python-list@python.org> |
| In reply to | #28868 |
On 11/09/2012 04:46, Steven D'Aprano wrote:
> On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:
>
>> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
> [...]
>>> That's not so much a workaround as the officially supported API for
>>> dealing with the situation you are in. Why don't you just prepend a
>>> '?' to paths like they tell you to?
>>
>> Good idea, but the first thing os.walk() does is a listdir(), and
>> os.listdir() does not like the r'\\?\' prefix. In other words,
>> os.walk(r'\\?\C:Users\john\Desktop\sandbox\goo') does not work.
>
> Now that sounds like a bug to me. If Microsoft officially support
> leading ? in file names, then so should Python on Windows.
And so it does, but you'll notice from the MSDN docs that the \\?
syntax must be supplied as a Unicode string, which os.listdir
will do if you pass it a Python unicode object and not otherwise:
import os
os.listdir(u"\\\\?\\c:\\users")
# and consequently
for p, ds, fs in os.walk(u"\\\\?\\c:\\users"):
    print p
TJG
| From | ruck <john.ruckstuhl@gmail.com> |
|---|---|
| Date | 2012-09-11 12:13 -0700 |
| Message-ID | <9ef20f77-487f-4250-91af-5d7d2491da05@googlegroups.com> |
| In reply to | #28877 |
On Tuesday, September 11, 2012 12:21:24 AM UTC-7, Tim Golden wrote:
> And so it does, but you'll notice from the MSDN docs that the \\?
> syntax must be supplied as a Unicode string, which os.listdir
> will do if you pass it a Python unicode object and not otherwise:
I was saying os.listdir doesn't like the r'\\?\' prefix.
But Tim corrects me -- so yes, Steven's earlier suggestion "Why don't you just prepend a '?' to paths like they tell you to?" does work, when I supply the path as a unicode string.
Good:
>>> os.listdir(u'\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
[u'voo...']
Bad:
>>> os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
Traceback (most recent call last):
File "<pyshell#3>", line 1, in <module>
os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo/*.*'
Thanks to both of you for taking the time to teach.
BTW, when I posted the original, I was trying to supply my own customized ntpath module, and I was really puzzled as to why it wasn't getting picked up! According to sys.path I expected my custom ntpath.py to be chosen, instead of the standard Lib/ntpath.py.
Now I guess I understand why. I moved Lib/ntpath.* out of the way, and learned that during initialization, Python is importing "site" module, which is importing "os" which is importing "ntpath" -- before my dir is added to sys.path. So later when I import os, it and ntpath have already been imported, so Python doesn't attempt a fresh import.
To get my custom ntpath.py honored, I need to RELOAD it, like:
import os
import ntpath
# ntpath was already imported during startup, so reload it to pick up the custom copy
reload(ntpath)
print 'os.walk(\'goo\') with isdir override in custom ntpath'
for root, dirs, files in os.walk('goo'):
    print root, dirs, files
where the diff between the standard ntpath.py and my ntpath.py is:
14c14,19
< from genericpath import *
---
> from genericpath import *
>
> def isdir(s):
>     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
> def isfile(s):
>     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))
I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.
Thanks again for the help.
John
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2012-09-12 08:50 +1000 |
| Message-ID | <mailman.525.1347403815.27098.python-list@python.org> |
| In reply to | #28898 |
On Wed, Sep 12, 2012 at 5:13 AM, ruck <john.ruckstuhl@gmail.com> wrote:
> I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.
One way to find out is to peek at the cache.
>>> import sys
>>> sys.modules
There are quite a few of them in the 3.2 interactive that I just tried this in.
ChrisA
| From | Dave Angel <d@davea.name> |
|---|---|
| Date | 2012-09-11 18:57 -0400 |
| Message-ID | <mailman.527.1347404243.27098.python-list@python.org> |
| In reply to | #28898 |
On 09/11/2012 03:13 PM, ruck wrote:
> <snip>
>
> I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.
>
import sys
print sys.modules
--
DaveA
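The sys.modules suggestion can be narrowed to a single module. This is a sketch only: `loaded_from` is a hypothetical helper name, and whether a given module reports a `__file__` depends on how the interpreter loaded it.

```python
import importlib
import sys


def loaded_from(module_name):
    """Return the file a cached module was loaded from, or None."""
    module = sys.modules.get(module_name)
    if module is None:
        return None                            # never imported in this process
    return getattr(module, "__file__", None)   # built-in modules have no __file__


import ntpath   # served from sys.modules if startup already imported it

# A second import does not re-read the file; it hands back the cached object.
same_object = importlib.import_module("ntpath") is sys.modules["ntpath"]
print(same_object)
print(loaded_from("ntpath"))
```

If the path printed points into Lib rather than the current directory, the standard copy was cached before the custom one could be found, which matches the behavior described earlier in the thread.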
| From | Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> |
|---|---|
| Date | 2012-09-12 08:17 +0200 |
| Message-ID | <k2p9da$ktu$1@r03.glglgl.gl> |
| In reply to | #28868 |
On 11.09.2012 at 05:46, Steven D'Aprano wrote:
> Good for you. (Sorry, that comes across as more condescending than it is
> intended as.) Monkey-patching often gets used for quick scripts and tiny
> pieces of code because it works.
>
> Just beware that if you extend that technique to larger bodies of code,
> say when using a large framework, or multiple libraries, your experience
> may not be quite so good. Especially if *they* are monkey-patching too,
> as some very large frameworks sometimes do. (Or so I am led to believe.)
This sounds like a good use case for a context manager, like the one in
decimal.Context.get_manager().
First shot:
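import contextlib
import os

@contextlib.contextmanager
def changed_os_path(**k):
    old = {}
    try:
        for name, func in k.items():
            # Remember the original attribute before replacing it.
            old[name] = getattr(os.path, name)
            setattr(os.path, name, func)
        yield None
    finally:
        # Restore only the attributes that were actually replaced.
        for name, func in old.items():
            setattr(os.path, name, func)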
and so for your code you can use
print 'os.walk(\'goo\') with modified isdir()'
with changed_os_path(isdir=my_isdir):
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files
so the change is only effective as long as you are in the relevant code
part and is reverted as soon as you leave it.
Thomas
| From | aahz@pythoncraft.com (Aahz) |
|---|---|
| Date | 2012-11-09 16:42 -0800 |
| Message-ID | <k7k7u3$rr0$1@panix5.panix.com> |
| In reply to | #28932 |
In article <k2p9da$ktu$1@r03.glglgl.gl>,
Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> wrote:
>On 11.09.2012 at 05:46, Steven D'Aprano wrote:
>>
>> Good for you. (Sorry, that comes across as more condescending than it is
>> intended as.) Monkey-patching often gets used for quick scripts and tiny
>> pieces of code because it works.
>>
>> Just beware that if you extend that technique to larger bodies of code,
>> say when using a large framework, or multiple libraries, your experience
>> may not be quite so good. Especially if *they* are monkey-patching too,
>> as some very large frameworks sometimes do. (Or so I am led to believe.)
>
>This sounds like a good use case for a context manager, like the one in
>decimal.Context.get_manager().
Note that because get_manager() applies to a specific Context instance it
is safe in a threaded application, which is NOT true for monkey-patching
modules even with a context manager.
--
Aahz (aahz@pythoncraft.com) <*> http://www.pythoncraft.com/
"....Normal is what cuts off your sixth finger and your tail..." --Siobhan
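One way to sidestep the threading concern is to leave os.path untouched and pass the directory test in as an argument. The sketch below is not os.walk() itself: `walk_with` is a hypothetical, simplified top-down walker with no error handling, no bottom-up mode, and no special treatment of symbolic links.

```python
import os


def walk_with(top, isdir=os.path.isdir):
    """Yield (root, dirs, files) top-down, using a caller-supplied isdir()."""
    dirs, files = [], []
    for name in os.listdir(top):
        # The injected isdir decides what counts as a directory.
        if isdir(os.path.join(top, name)):
            dirs.append(name)
        else:
            files.append(name)
    yield top, dirs, files
    for name in dirs:
        # Recurse with the same isdir so the whole tree uses one rule.
        for entry in walk_with(os.path.join(top, name), isdir):
            yield entry
```

With the my_isdir function from earlier in the thread, the call would be `walk_with('goo', isdir=my_isdir)`. Because nothing global is modified, other threads calling os.path.isdir() are unaffected.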