Groups > comp.lang.python > #28836 > unrolled thread

how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py

Started by: ruck <john.ruckstuhl@gmail.com>
First post: 2012-09-10 10:25 -0700
Last post: 2012-11-09 16:42 -0800
Articles: 11 (7 participants)



Contents

  how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-10 10:25 -0700
    Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-10 20:16 +0000
      Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-10 15:22 -0700
        Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-09-11 03:46 +0000
          Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Tim Golden <mail@timgolden.me.uk> - 2012-09-11 08:20 +0100
            Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-11 12:13 -0700
              Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Chris Angelico <rosuav@gmail.com> - 2012-09-12 08:50 +1000
              Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Dave Angel <d@davea.name> - 2012-09-11 18:57 -0400
            Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py ruck <john.ruckstuhl@gmail.com> - 2012-09-11 12:13 -0700
          Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> - 2012-09-12 08:17 +0200
            Re: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py aahz@pythoncraft.com (Aahz) - 2012-11-09 16:42 -0800

#28836 — how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py

From: ruck <john.ruckstuhl@gmail.com>
Date: 2012-09-10 10:25 -0700
Subject: how to get os.py to use an ./ntpath.py instead of Lib/ntpath.py
Message-ID: <a4a2e63b-d25d-4487-bff8-67d4b7c40cbc@googlegroups.com>

In Python 2.7.2 on Windows 7,

os.walk() uses isdir(),
which comes from os.path,
which really comes from ntpath.py,
which really comes from genericpath.py

I want os.walk() to use a modified isdir() on my Windows 7.
Not knowing any better, it seems to me like ntpath.py would be a good place to intercept.

When os.py does "import ntpath as path",
how can I get python to process my customized ntpath.py 
instead of Lib/ntpath.py ?

Thanks for any comments.
John

BTW, here's my mod to ntpath.py:
    $ diff ntpath.py.standard ntpath.py
    14c14,19
    < from genericpath import *
    ---
    > from genericpath import *
    >
    > def isdir(s):
    >     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
    > def isfile(s):
    >     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))

Why?  Because the genericpath implementation relies on os.stat(), which
uses a Windows API function that presumes or enforces some naming
conventions like "doesn't end with a space or a period".
But NTFS actually supports such filenames and dirnames, and some software
(like cygwin) lets users create files and dirs without that restriction.
So, cygwin users like me may have a file 'voo...\\doo' which os.walk()
cannot ordinarily walk.  That is, isdir('voo...') returns False
because the underlying os.stat is assessing 'voo' instead of 'voo...'.
The workaround is to pass os.stat a full pathname that is prefixed
with "\\?\" so the Windows API recognizes that you do NOT want the
name filtered.

Better said by Microsoft:
"For file I/O, the "\\?\" prefix to a path string tells 
the Windows APIs to disable all string parsing and to 
send the string that follows it straight to the file 
system. For example, if the file system supports large 
paths and file names, you can exceed the MAX_PATH limits 
that are otherwise enforced by the Windows APIs." 
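
A minimal sketch of the prefixing step described above, using plain string handling only. The helper name with_extended_prefix is hypothetical (it is not part of ntpath or os.path), it assumes the caller already has an absolute drive-letter path, and it does not handle UNC paths, which Microsoft documents as needing a "\\?\UNC\" form instead.

```python
# Hypothetical helper, not part of ntpath or os.path.
# The prefix is: backslash, backslash, question mark, backslash.
EXTENDED_PREFIX = '\\\\?\\'


def with_extended_prefix(abs_path):
    # Assumes abs_path is already absolute, e.g. 'C:\\goo\\voo...'.
    # Leave the path alone if the prefix is already present.
    if abs_path.startswith(EXTENDED_PREFIX):
        return abs_path
    return EXTENDED_PREFIX + abs_path


print(with_extended_prefix('C:\\goo\\voo...'))
```

Keeping the prefixing in one small function means the isdir() and isfile() overrides can share it instead of repeating the escaped literal.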



#28854

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-09-10 20:16 +0000
Message-ID: <504e4a8d$0$29981$c3e8da3$5496439d@news.astraweb.com>
In reply to: #28836

On Mon, 10 Sep 2012 10:25:29 -0700, ruck wrote:

> In Python 2.7.2 on Windows 7,
> 
> os.walk() uses isdir(),
> which comes from os.path,
> which really comes from ntpath.py,
> which really comes from genericpath.py
> 
> I want os.walk() to use a modified isdir() on my Windows 7. Not knowing
> any better, it seems to me like ntpath.py would be a good place to
> intercept.
> 
> When os.py does "import ntpath as path", how can I get python to process
> my customized ntpath.py instead of Lib/ntpath.py ?

import os
os.path.isdir = my_isdir

ought to do it.

This general technique is called "monkey-patching". The Ruby community is 
addicted to it. Everybody else -- and a goodly number of the more 
sensible Ruby crowd -- consider it a risky, dirty hack that 99 times out 
of 100 will lead to blindness, moral degeneracy and subtle, hard-to-fix 
bugs.

They are right to be suspicious of it. As a general rule, monkey-patching 
is not for production code. You have been warned.

http://www.codinghorror.com/blog/2008/07/monkeypatching-for-humans.html


[...]
> Why?  Because the genericpath implementation relies on os.stat() which
> uses Windows API function that presumes or enforces some naming
> conventions like "doesn't end with a space or a period". But the NTFS
> actually supports such filenames and dirnames, and some sw (like cygwin)
> lets users make files & dirs without restricting. So, cygwin users like
> me may have file 'voo...\\doo' which os.walk() cannot ordinarily walk. 
> That is, the isdir('voo...') returns false because the underlying
> os.stat is assessing 'voo' instead of 'voo...' . 

Please consider submitting a patch that adds support for cygwin paths to 
the standard library. You'll need to target 3.4 though, 2.7 is now a 
maintenance release with no new features allowed.


> The workaround is to
> pass os.stat a fullpathname that is prefixed with r'\\?\' so the Windows
> API recognizes that you do NOT want the name filtered.
> 
> Better said by Microsoft:
> "For file I/O, the "\\?\" prefix to a path string tells the Windows APIs
> to disable all string parsing and to send the string that follows it
> straight to the file system.

That's not so much a workaround as the officially supported API for 
dealing with the situation you are in. Why don't you just prepend a '?' 
to paths like they tell you to?


-- 
Steven



#28862

From: ruck <john.ruckstuhl@gmail.com>
Date: 2012-09-10 15:22 -0700
Message-ID: <83af64e3-bc26-4217-8afa-e4f6d45b604d@googlegroups.com>
In reply to: #28854

On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
> On Mon, 10 Sep 2012 10:25:29 -0700, ruck wrote:
>
> > In Python 2.7.2 on Windows 7,
> >
> > os.walk() uses isdir(),
> > which comes from os.path,
> > which really comes from ntpath.py,
> > which really comes from genericpath.py
> >
> > I want os.walk() to use a modified isdir() on my Windows 7. Not knowing
> > any better, it seems to me like ntpath.py would be a good place to
> > intercept.
> >
> > When os.py does "import ntpath as path", how can I get python to process
> > my customized ntpath.py instead of Lib/ntpath.py ?
>
> import os
> os.path.isdir = my_isdir
>
> ought to do it.
>
> This general technique is called "monkey-patching". The Ruby community is
> addicted to it. Everybody else -- and a goodly number of the more
> sensible Ruby crowd -- consider it a risky, dirty hack that 99 times out
> of 100 will lead to blindness, moral degeneracy and subtle, hard-to-fix
> bugs.
>
> They are right to be suspicious of it. As a general rule, monkey-patching
> is not for production code. You have been warned.
>
> http://www.codinghorror.com/blog/2008/07/monkeypatching-for-humans.html
>
> [...]
> > Why?  Because the genericpath implementation relies on os.stat() which
> > uses Windows API function that presumes or enforces some naming
> > conventions like "doesn't end with a space or a period". But the NTFS
> > actually supports such filenames and dirnames, and some sw (like cygwin)
> > lets users make files & dirs without restricting. So, cygwin users like
> > me may have file 'voo...\\doo' which os.walk() cannot ordinarily walk.
> > That is, the isdir('voo...') returns false because the underlying
> > os.stat is assessing 'voo' instead of 'voo...' .
>
> Please consider submitting a patch that adds support for cygwin paths to
> the standard library. You'll need to target 3.4 though, 2.7 is now a
> maintenance release with no new features allowed.
>
> > The workaround is to
> > pass os.stat a fullpathname that is prefixed with r'\\?\' so the Windows
> > API recognizes that you do NOT want the name filtered.
> >
> > Better said by Microsoft:
> > "For file I/O, the "\\?\" prefix to a path string tells the Windows APIs
> > to disable all string parsing and to send the string that follows it
> > straight to the file system.
>
> That's not so much a workaround as the officially supported API for
> dealing with the situation you are in. Why don't you just prepend a '?'
> to paths like they tell you to?
>
> --
> Steven

Steven says:
 That's not so much a workaround as the officially supported API for 
 dealing with the situation you are in. Why don't you just prepend a '?' 
 to paths like they tell you to?

Good idea, but the first thing os.walk() does is a listdir(), and os.listdir() does not like the r'\\?\' prefix.  In other words, 
os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') 
does not work.

Also, your recipe worked for me -- 
I'm walking 'goo' which contains 'voo.../doo'

    import os

    import genericpath
    def my_isdir(s):
        return genericpath.isdir('\\\\?\\' + os.path.abspath(s + '\\'))

    print 'os.walk(\'goo\') with standard isdir()'
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files

    print 'os.walk(\'goo\') with modified isdir()'
    os.path.isdir = my_isdir
    for root, dirs, files in os.walk('goo'):
        print root, dirs, files

yields

    os.walk('goo') with standard isdir()
    goo [] ['voo...']
    os.walk('goo') with modified isdir()
    goo ['voo...'] []
    goo\voo... [] ['doo']

About monkeypatching, generally -- thanks for the pointer to that discussion.  That sounded like a lot of wisdom and lessons learned being shared.
About me suggesting a patch -- I'll sleep on that :)

Thanks Steven!
John



#28868

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-09-11 03:46 +0000
Message-ID: <504eb3fc$0$29890$c3e8da3$5496439d@news.astraweb.com>
In reply to: #28862

On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:

> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
[...]
> > That's not so much a workaround as the officially supported API for
> > dealing with the situation you are in. Why don't you just prepend a 
> > '?' to paths like they tell you to?
> 
> Good idea, but the first thing os.walk() does is a listdir(), and
> os.listdir() does not like the r'\\?\' prefix.  In other words,
> os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') does not work.

Now that sounds like a bug to me. If Microsoft officially support 
leading ? in file names, then so should Python on Windows.


> Also, your recipe worked for me --
> I'm walking 'goo' which contains 'voo.../doo'

Good for you. (Sorry, that comes across as more condescending than it is 
intended as.) Monkey-patching often gets used for quick scripts and tiny 
pieces of code because it works.

Just beware that if you extend that technique to larger bodies of code, 
say when using a large framework, or multiple libraries, your experience 
may not be quite so good. Especially if *they* are monkey-patching too, 
as some very large frameworks sometimes do. (Or so I am led to believe.)

The point is not that monkey-patching is dangerous and should never be 
used, but that it is risky and should be used with caution.



-- 
Steven



#28877

From: Tim Golden <mail@timgolden.me.uk>
Date: 2012-09-11 08:20 +0100
Message-ID: <mailman.489.1347348083.27098.python-list@python.org>
In reply to: #28868

On 11/09/2012 04:46, Steven D'Aprano wrote:
> On Mon, 10 Sep 2012 15:22:05 -0700, ruck wrote:
> 
>> On Monday, September 10, 2012 1:16:13 PM UTC-7, Steven D'Aprano wrote:
> [...]
>>> That's not so much a workaround as the officially supported API for
>>> dealing with the situation you are in. Why don't you just prepend a 
>>> '?' to paths like they tell you to?
>>
>> Good idea, but the first thing os.walk() does is a listdir(), and
>> os.listdir() does not like the r'\\?\' prefix.  In other words,
>> os.walk(r'\\?\C:\Users\john\Desktop\sandbox\goo') does not work.
> 
> Now that sounds like a bug to me. If Microsoft officially support 
> leading ? in file names, then so should Python on Windows.

And so it does, but you'll notice from the MSDN docs that the \\?\
syntax must be supplied as a Unicode string. os.listdir will do that
if you pass it a Python unicode object, and not otherwise:

import os
os.listdir(u"\\\\?\\c:\\users")

# and consequently

for p, ds, fs in os.walk(u"\\\\?\\c:\\users"):
  print p


TJG
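
When the path arrives as a byte string, it has to become a unicode object before the calls above will take the Unicode route. A minimal sketch, assuming the bytes are in the filesystem encoding; the helper name as_text_path is hypothetical. The code is written to run on both Python 2 and Python 3, where every str is already unicode.

```python
import sys


def as_text_path(path):
    # Hypothetical helper: decode byte strings, pass text through unchanged.
    # On Python 2, bytes is the plain str type; on Python 3 it is bytes.
    if isinstance(path, bytes):
        encoding = sys.getfilesystemencoding() or 'utf-8'
        return path.decode(encoding)
    return path


text_path = as_text_path(b'C:\\Users')
print(type(text_path).__name__)
```

The result can then be combined with the u"\\\\?\\" prefix shown in the post.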



#28898

From: ruck <john.ruckstuhl@gmail.com>
Date: 2012-09-11 12:13 -0700
Message-ID: <9ef20f77-487f-4250-91af-5d7d2491da05@googlegroups.com>
In reply to: #28877

On Tuesday, September 11, 2012 12:21:24 AM UTC-7, Tim Golden wrote:
> And so it does, but you'll notice from the MSDN docs that the \\?
> syntax must be supplied as a Unicode string, which os.listdir
> will do if you pass it a Python unicode object and not otherwise:

I was saying os.listdir doesn't like the r'\\?\' prefix.
But Tim corrects me -- so yes, Steven's earlier suggestion "Why don't you just prepend a '?' to paths like they tell you to?" does work, when I supply it in unicode.
Good:
    >>> os.listdir(u'\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
   [u'voo...']
Bad:
    >>> os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')

    Traceback (most recent call last):
      File "<pyshell#3>", line 1, in <module>
        os.listdir('\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo')
    WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect: '\\\\?\\C:\\Users\\john\\Desktop\\sandbox\\goo/*.*'

Thanks to both of you for taking the time to teach.

BTW, when I posted the original, I was trying to supply my own customized ntpath module, and I was really puzzled as to why it wasn't getting picked up! According to sys.path I expected my custom ntpath.py to be chosen, instead of the standard Lib/ntpath.py.

Now I guess I understand why.  I moved Lib/ntpath.* out of the way, and learned that during initialization, Python is importing "site" module, which is importing "os" which is importing "ntpath" -- before my dir is added to sys.path.  So later when I import os, it and ntpath have already been imported, so Python doesn't attempt a fresh import.

To get my custom ntpath.py honored, I need to reload it, like:
  import os
  import ntpath
  reload(ntpath)
  print 'os.walk(\'goo\') with isdir override in custom ntpath'
  for root, dirs, files in os.walk('goo'):
      print root, dirs, files

where the diff between the standard ntpath.py and my ntpath.py is:
  14c14,19
  < from genericpath import *
  ---
  > from genericpath import *
  >
  > def isdir(s):
  >     return genericpath.isdir('\\\\?\\' + abspath(s + '\\'))
  > def isfile(s):
  >     return genericpath.isfile('\\\\?\\' + abspath(s + '\\'))

I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.

Thanks again for the help.
John
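
The reload step in the post above relies on reload() re-executing a module inside its existing module object. A small sketch of that behaviour, written so it also runs on Python 3, where reload() lives in importlib rather than the builtins. It reloads whichever ntpath is found first on sys.path; the names module_before and module_after are illustrative.

```python
try:
    from importlib import reload  # Python 3.4 and later
except ImportError:
    pass  # Python 2: reload() is a builtin

import os
import ntpath

module_before = ntpath
module_after = reload(ntpath)

# reload() re-runs the module's source in the same module object, so any
# existing reference to that object sees the reloaded definitions.
print(module_after is module_before)

# Shows which module backs os.path on this platform (ntpath on Windows).
print(os.path.__name__)
```

This is why the reload trick reaches os.walk() on Windows: os.path and ntpath are the same object there, so nothing needs to be rebound.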



#28909

From: Chris Angelico <rosuav@gmail.com>
Date: 2012-09-12 08:50 +1000
Message-ID: <mailman.525.1347403815.27098.python-list@python.org>
In reply to: #28898

On Wed, Sep 12, 2012 at 5:13 AM, ruck <john.ruckstuhl@gmail.com> wrote:
> I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.

One way to find out is to peek at the cache.

>>> import sys
>>> sys.modules

There are quite a few of them in the 3.2 interactive that I just tried this in.

ChrisA
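
Rather than reading the whole cache, a script can ask about one module. A minimal sketch; the helper name already_imported is hypothetical, and whether ntpath is cached at startup depends on the platform and Python version.

```python
import sys


def already_imported(name):
    # True if a module of that name is already in the import cache.
    return name in sys.modules


# Check before importing anything ourselves; the answer varies by platform.
was_cached_at_start = already_imported('ntpath')

import ntpath  # after this line the module is certainly cached

print(was_cached_at_start, already_imported('ntpath'))
# A later "import ntpath" just returns this cached object.
print(sys.modules['ntpath'] is ntpath)
```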



#28911

From: Dave Angel <d@davea.name>
Date: 2012-09-11 18:57 -0400
Message-ID: <mailman.527.1347404243.27098.python-list@python.org>
In reply to: #28898

On 09/11/2012 03:13 PM, ruck wrote:
> <snip>
>
> I'm not sure how I could have known that ntpath was already imported, since *I* didn't import it, but that was the key to my confusion.
>

import sys
print sys.modules



-- 

DaveA



#28932

From: Thomas Rachel <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de>
Date: 2012-09-12 08:17 +0200
Message-ID: <k2p9da$ktu$1@r03.glglgl.gl>
In reply to: #28868

Am 11.09.2012 05:46 schrieb Steven D'Aprano:

> Good for you. (Sorry, that comes across as more condescending than it is
> intended as.) Monkey-patching often gets used for quick scripts and tiny
> pieces of code because it works.
>
> Just beware that if you extend that technique to larger bodies of code,
> say when using a large framework, or multiple libraries, your experience
> may not be quite so good. Especially if *they* are monkey-patching too,
> as some very large frameworks sometimes do. (Or so I am led to believe.)

This sounds like a good use case for a context manager, like the one in 
decimal.Context.get_manager().

First shot:

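import contextlib
import os

@contextlib.contextmanager
def changed_os_path(**k):
     old = {}
     try:
         # iterate over the attribute names, not (name, value) pairs
         for name in k:
             old[name] = getattr(os.path, name)
             setattr(os.path, name, k[name])
         yield None
     finally:
         # restore only the attributes that were actually replaced
         for name in old:
             setattr(os.path, name, old[name])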

and so for your code you can use

     print 'os.walk(\'goo\') with modified isdir()'
     with changed_os_path(isdir=my_isdir):
         for root, dirs, files in os.walk('goo'):
             print root, dirs, files

so the change is only effective as long as you are in the relevant code 
part and is reverted as soon as you leave it.


Thomas
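
A self-contained variant of the same idea, generalised to any object and paired with a check that the original attribute comes back afterwards. This is a sketch under assumptions: patched_attrs and always_false are hypothetical names, and the code is written to run on Python 2.7 and Python 3 alike.

```python
import contextlib
import os


@contextlib.contextmanager
def patched_attrs(target, **replacements):
    # Temporarily replace attributes on target, restoring them on exit.
    saved = {}
    try:
        for name, value in replacements.items():
            saved[name] = getattr(target, name)
            setattr(target, name, value)
        yield target
    finally:
        # Put back only the attributes that were actually replaced.
        for name, value in saved.items():
            setattr(target, name, value)


def always_false(path):
    # Stand-in replacement so the effect of the patch is easy to see.
    return False


original = os.path.isdir
with patched_attrs(os.path, isdir=always_false):
    inside = os.path.isdir(os.curdir)   # answered by always_false
outside = os.path.isdir(os.curdir)      # answered by the real isdir again

print(inside, outside)
```

Because the restore happens in a finally clause, the original function also comes back if the body of the with block raises.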



#33066

From: aahz@pythoncraft.com (Aahz)
Date: 2012-11-09 16:42 -0800
Message-ID: <k7k7u3$rr0$1@panix5.panix.com>
In reply to: #28932

In article <k2p9da$ktu$1@r03.glglgl.gl>,
Thomas Rachel  <nutznetz-0c1b6768-bfa9-48d5-a470-7603bd3aa915@spamschutz.glglgl.de> wrote:
>Am 11.09.2012 05:46 schrieb Steven D'Aprano:
>>
>> Good for you. (Sorry, that comes across as more condescending than it is
>> intended as.) Monkey-patching often gets used for quick scripts and tiny
>> pieces of code because it works.
>>
>> Just beware that if you extend that technique to larger bodies of code,
>> say when using a large framework, or multiple libraries, your experience
>> may not be quite so good. Especially if *they* are monkey-patching too,
>> as some very large frameworks sometimes do. (Or so I am led to believe.)
>
>This sounds like a good use case for a context manager, like the one in 
>decimal.Context.get_manager().

Note that because get_manager() applies to a specific Context instance it
is safe in a threaded application, which is NOT true for monkey-patching
modules even with a context manager.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

"....Normal is what cuts off your sixth finger and your tail..."  --Siobhan
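
One way to sidestep the threading concern raised above is not to patch os.path at all, and instead pass the directory test in as an argument. The following is a sketch of that alternative, not something proposed in the thread: walk_with is a hypothetical top-down walker built on os.listdir(), and unlike os.walk() it offers no followlinks or bottom-up options.

```python
import os


def walk_with(top, isdir=os.path.isdir):
    # The directory test is a parameter, so no module attribute is modified
    # and other threads keep seeing the standard os.path.isdir.
    names = sorted(os.listdir(top))
    dirs = [n for n in names if isdir(os.path.join(top, n))]
    dir_set = set(dirs)
    files = [n for n in names if n not in dir_set]
    yield top, dirs, files
    for name in dirs:
        # Recurse with the same predicate; written as a loop so the code
        # also runs on Python 2, which lacks "yield from".
        for triple in walk_with(os.path.join(top, name), isdir):
            yield triple


for root, dirs, files in walk_with(os.curdir):
    print(root, dirs, files)
    break  # show only the top level
```

A caller with unusual names on Windows could pass a predicate such as the my_isdir function from earlier in the thread as the isdir argument.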


