


Re: Unicode Chars in Windows Path

References (1 earlier) <533cc967$0$2909$c3e8da3$76491128@news.astraweb.com> <mailman.8825.1396494621.18130.python-list@python.org> <87fvluss86.fsf@elektro.pacujo.net> <1396533471.32018.102326165.14B5BB43@webmail.messagingengine.com> <CAPTjJmpQhneBQF2070AS=dfFFCU6E5_O8OFiM2rVf40duLCB-A@mail.gmail.com>
Date 2014-04-04 11:15 +1100
Subject Re: Unicode Chars in Windows Path
From David <bouncingcats@gmail.com>
Newsgroups comp.lang.python
Message-ID <mailman.8868.1396570565.18130.python-list@python.org>



On 4 April 2014 01:17, Chris Angelico <rosuav@gmail.com> wrote:
>
> -- Get info on all .pyc files in a directory and all its subdirectories --
> C:\>dir some_directory\*.pyc /s
> $ ls -l `find some_directory -name \*.pyc`
>
> Except that the ls version there can't handle names with spaces in
> them, so you need to faff around with null termination and stuff.

Nooo, that stinks! There's no need to abuse 'find' like that, unless
the version you have is truly ancient. Null termination is only
necessary to pass 'find' results *via the shell*. Instead, ask 'find'
to invoke the task itself.

The simplest way is:

    find some_directory -name '*.pyc' -ls

'find' is the tool to use for *finding* things, not 'ls', which is
intended for terminal display of directory information.

If you require a particular feature of 'ls', or any other command, you
can ask 'find' to invoke it directly (not via a shell):

    find some_directory -name '*.pyc' -exec ls -l {} \;
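A small sketch of the same idea, run against a throwaway directory so it can be tried safely. The directory and file names are made up for illustration. It also shows the standard '+' terminator, which is not mentioned above: it hands many names to one 'ls' instead of starting one 'ls' per file.

```shell
# Throwaway tree; all names here are hypothetical.
demo=$(mktemp -d)
mkdir -p "$demo/sub dir"
touch "$demo/a.pyc" "$demo/sub dir/b c.pyc" "$demo/keep.py"

# One 'ls' process per matching file.
find "$demo" -name '*.pyc' -exec ls -l {} \;

# Ending with '+' instead of '\;' passes many names to a single 'ls'.
find "$demo" -name '*.pyc' -exec ls -l {} +

# Each name arrives as one argument, spaces included, so 'ls' lists 2 files.
count=$(find "$demo" -name '*.pyc' -exec ls {} + | wc -l | tr -d ' ')
echo "matched: $count"

rm -rf "$demo"
```

Because 'find' passes each name to 'ls' directly, the space in "b c.pyc" never goes through shell word splitting.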

'find' is widely under-utilised and poorly understood because its
command line syntax is extremely confusing compared to other tools,
and its documentation compounds the confusion. For anyone interested,
I offer these key insights:

Most important to understand is that the -name, -exec and -ls that I
used above (for example) are *not* command-line "options". Even though
they look like command-line options, they aren't. They are part of an
*expression* in 'find' syntax. And the crucial difference is that the
expression is order-dependent: it is evaluated left to right, with an
implied "and" between adjacent terms. So unlike most other commands,
it is a mistake to put them in arbitrary order.
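To illustrate the order dependence, here is a minimal sketch using a scratch directory (the file names are hypothetical). Swapping the test and the action changes what gets printed:

```shell
# Scratch directory with one matching and one non-matching file.
demo=$(mktemp -d)
touch "$demo/a.pyc" "$demo/b.py"

# Test first, action second: -print only runs where -name matched.
filtered=$(find "$demo" -name '*.pyc' -print | wc -l | tr -d ' ')

# Action first: -print runs for every entry (the directory itself and
# both files) before -name is even evaluated.
unfiltered=$(find "$demo" -print -name '*.pyc' | wc -l | tr -d ' ')

echo "filtered=$filtered unfiltered=$unfiltered"
# prints: filtered=1 unfiltered=3

rm -rf "$demo"
```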

Also annoyingly, the -exec syntax utilises characters that must be
escaped from shell processing. This is more arcane knowledge that just
frustrates people when they are in a rush to get something done.
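As a rough sketch of what needs protecting from the shell (file names are made up for the example): the semicolon that ends -exec, the wildcard pattern given to -name, and the parentheses used for grouping. A backslash or single quotes both work.

```shell
# Scratch directory; names are hypothetical.
demo=$(mktemp -d)
touch "$demo/a.pyc" "$demo/b.pyo" "$demo/c.py"

# The terminating semicolon must reach 'find' intact: \; and ';' are equivalent.
n1=$(find "$demo" -name '*.pyc' -exec echo {} \; | wc -l | tr -d ' ')
n2=$(find "$demo" -name '*.pyc' -exec echo {} ';' | wc -l | tr -d ' ')

# Grouping parentheses need the same protection, and each pattern is
# quoted so the shell does not expand it before 'find' sees it.
n3=$(find "$demo" \( -name '*.pyc' -o -name '*.pyo' \) -print | wc -l | tr -d ' ')

echo "n1=$n1 n2=$n2 n3=$n3"
# prints: n1=1 n2=1 n3=2

rm -rf "$demo"
```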

In fact, the only command-line *options* that GNU 'find' takes are
-H -L -P -D and -O (POSIX specifies only -H and -L), but these are
rarely used. They come *before* the directory name(s). Everything that
comes after the directory name(s) is part of a 'find' expression.
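For example, -L (follow symbolic links) is a real command-line option, so it goes before the starting directory. A minimal sketch, assuming a filesystem that supports symlinks; the directory layout is invented for the illustration:

```shell
# 'tree' contains only a symlink pointing at 'real', which holds the file.
demo=$(mktemp -d)
mkdir "$demo/real" "$demo/tree"
touch "$demo/real/a.pyc"
ln -s "$demo/real" "$demo/tree/link"

# Default behaviour (same as -P): the symlink is not followed.
plain=$(find "$demo/tree" -name '*.pyc' | wc -l | tr -d ' ')

# -L comes before the directory name and makes 'find' descend through the link.
followed=$(find -L "$demo/tree" -name '*.pyc' | wc -l | tr -d ' ')

echo "plain=$plain followed=$followed"
# prints: plain=0 followed=1

rm -rf "$demo"
```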

But the most confusing thing of all is that, in the 'find' documentation,
expressions are composed of tests, actions, and ... options! These
so-called options are expression-options, not command-line-options. No
wonder everyone's confused, when one word describes two
similar-looking but behaviourally different things!
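One such expression-option is -maxdepth, which appears after the directory name yet affects the whole search. The sketch below assumes GNU find (BSD find also accepts -maxdepth); the file names are hypothetical.

```shell
# One .pyc at the top level, one in a subdirectory.
demo=$(mktemp -d)
mkdir "$demo/sub"
touch "$demo/top.pyc" "$demo/sub/deep.pyc"

# No depth limit: both files are found.
all=$(find "$demo" -name '*.pyc' | wc -l | tr -d ' ')

# -maxdepth 1 is an expression-option: it is written inside the
# expression, but it limits how far 'find' descends for the whole run.
shallow=$(find "$demo" -maxdepth 1 -name '*.pyc' | wc -l | tr -d ' ')

echo "all=$all shallow=$shallow"
# prints: all=2 shallow=1

rm -rf "$demo"
```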

So 'info find' must be read very carefully indeed. But it is
worthwhile, because in the model of "do one thing and do it well",
'find' is the tool intended for such tasks, rather than expecting
these capabilities to be built into all other command line utilities.

I know this is off-topic but because I learn so much from the
countless terrific contributions to this list from Chris (and others)
with wide expertise, I am motivated to give something back when I can.
And given that in the past I spent a little time and effort and
eventually understood this, I summarise it here hoping it helps
someone else. The unix-style tools are far more capable than the
Microsoft shell when used as intended.

There is good documentation on find at: http://mywiki.wooledge.org/UsingFind


