


Unicode Chars in Windows Path

Started by: Steve <sreisscruz@gmail.com>
First post: 2014-04-02 16:27 -0700
Last post: 2014-04-03 08:35 +0000
Articles: 14 (10 participants)



Contents

  Unicode Chars in Windows Path Steve <sreisscruz@gmail.com> - 2014-04-02 16:27 -0700
    Re: Unicode Chars in Windows Path Steven D'Aprano <steve@pearwood.info> - 2014-04-03 02:37 +0000
      Re: Unicode Chars in Windows Path Chris Angelico <rosuav@gmail.com> - 2014-04-03 14:10 +1100
        Re: Unicode Chars in Windows Path Marko Rauhamaa <marko@pacujo.net> - 2014-04-03 12:00 +0300
          Re: Unicode Chars in Windows Path Peter Otten <__peter__@web.de> - 2014-04-03 15:09 +0200
          Re: Unicode Chars in Windows Path random832@fastmail.us - 2014-04-03 09:57 -0400
          Re: Unicode Chars in Windows Path Chris Angelico <rosuav@gmail.com> - 2014-04-04 01:17 +1100
          Re: Unicode Chars in Windows Path David <bouncingcats@gmail.com> - 2014-04-04 11:15 +1100
          Re: Unicode Chars in Windows Path Chris Angelico <rosuav@gmail.com> - 2014-04-04 12:16 +1100
          Re: Unicode Chars in Windows Path David <bouncingcats@gmail.com> - 2014-04-04 13:02 +1100
      Re: Unicode Chars in Windows Path Terry Reedy <tjreedy@udel.edu> - 2014-04-03 14:41 -0400
      Re: Unicode Chars in Windows Path Chris Angelico <rosuav@gmail.com> - 2014-04-04 09:06 +1100
      Re: Unicode Chars in Windows Path Lele Gaifax <lele@metapensiero.it> - 2014-04-04 09:07 +0200
    Re: Unicode Chars in Windows Path alister <alister.nospam.ware@ntlworld.com> - 2014-04-03 08:35 +0000

#69559 — Unicode Chars in Windows Path

From: Steve <sreisscruz@gmail.com>
Date: 2014-04-02 16:27 -0700
Subject: Unicode Chars in Windows Path
Message-ID: <f3b4238a-6bf4-478e-9326-1ba239d5237f@googlegroups.com>

Hi All,

I'm in need of some encoding/decoding help for a situation for a Windows Path that contains Unicode characters in it.

---- CODE ----

import os.path
import codecs
import sys

All_Tests = [u"c:\automation_common\Python\TestCases\list_dir_script.txt"]


for curr_test in All_Tests:                             
  print("\n raw : " + repr(curr_test) + "\n")
  print("\n encode : %s \n\n" ) %  os.path.normpath(codecs.encode(curr_test, "ascii"))
  print("\n decode : %s \n\n" ) %  curr_test.decode('string_escape')

---- CODE ----


Screen Output : 

 raw : u'c:\x07utomation_common\\Python\\TestCases\\list_dir_script.txt'

 encode : c:utomation_common\Python\TestCases\list_dir_script.txt

 decode : c:utomation_common\Python\TestCases\list_dir_script.txt


My goal is to have the properly formatting path in the output :

 
c:\automation_common\Python\TestCases\list_dir_script.txt


What is the "magic" encode/decode sequence here??

Thanks!

Steve



#69565

From: Steven D'Aprano <steve@pearwood.info>
Date: 2014-04-03 02:37 +0000
Message-ID: <533cc967$0$2909$c3e8da3$76491128@news.astraweb.com>
In reply to: #69559

On Wed, 02 Apr 2014 16:27:04 -0700, Steve wrote:

> Hi All,
> 
> I'm in need of some encoding/decoding help for a situation for a Windows
> Path that contains Unicode characters in it.
> 
> ---- CODE ----
> 
> import os.path
> import codecs
> import sys
> 
> All_Tests =
> [u"c:\automation_common\Python\TestCases\list_dir_script.txt"]

I don't think this has anything to do with Unicode encoding or decoding. 
In Python string literals, the backslash makes the next character 
special. So \n makes a newline, \t makes a tab, and so forth. Only if the 
character being backslashed has no special meaning does Python give you a 
literal backslash:

py> print("x\tx")
x	x
py> print("x\Tx")
x\Tx


In this case, \a has special meaning, and is converted to the ASCII BEL 
control character:

py> u"...\automation"
u'...\x07utomation'


When working with Windows paths, you should make a habit of either 
escaping every backslash:

    u"c:\\automation_common\\Python\\TestCases\\list_dir_script.txt"

using a raw-string:

    ur"c:\automation_common\Python\TestCases\list_dir_script.txt"

or just use forward slashes:

    u"c:/automation_common/Python/TestCases/list_dir_script.txt"


Windows accepts both forward and backslashes in file names.


If you fix that issue, I expect your problem will go away.
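
A minimal sketch of the three spellings above, written for Python 3 (where the u prefix is optional and is dropped here). It checks that all three name the same path and that an unescaped \a really does turn into BEL. The ntpath module is used only so the comparison behaves the same on any platform; that choice is an assumption of this sketch, not something from the original post.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

escaped = "c:\\automation_common\\Python\\TestCases\\list_dir_script.txt"
raw = r"c:\automation_common\Python\TestCases\list_dir_script.txt"
forward = "c:/automation_common/Python/TestCases/list_dir_script.txt"

# Escaped and raw literals produce the identical string.
assert escaped == raw

# Forward slashes differ as text but normalise to the same Windows path.
assert ntpath.normpath(forward) == raw

# Without escaping, "\a" is read as the BEL control character (0x07).
broken = "c:\automation"
assert broken == "c:\x07utomation"
```
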



-- 
Steven



#69567

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-03 14:10 +1100
Message-ID: <mailman.8825.1396494621.18130.python-list@python.org>
In reply to: #69565

On Thu, Apr 3, 2014 at 1:37 PM, Steven D'Aprano <steve@pearwood.info> wrote:
> Windows accepts both forward and backslashes in file names.

Small clarification: The Windows *API* accepts both types of slash
(you can open a file using forward slashes, for instance), but not all
Windows *applications* are aware of this (generally only
cross-platform ones take notice of this), and most Windows *users*
prefer backslashes. So when you come to display a Windows path, you
may want to convert to backslashes. But that's for display.
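
One possible way to do that display-only conversion in Python 3 is sketched below. The helper name for_display is hypothetical; pathlib.PureWindowsPath is standard library and never touches the file system, so this runs on any platform.

```python
from pathlib import PureWindowsPath


def for_display(path):
    """Return `path` spelled with backslashes, as Windows users expect."""
    return str(PureWindowsPath(path))


print(for_display("c:/automation_common/Python/TestCases/list_dir_script.txt"))
# c:\automation_common\Python\TestCases\list_dir_script.txt
```

The program can keep using forward slashes internally and call something like this only at the point where a path is shown to the user.
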

ChrisA



#69573

From: Marko Rauhamaa <marko@pacujo.net>
Date: 2014-04-03 12:00 +0300
Message-ID: <87fvluss86.fsf@elektro.pacujo.net>
In reply to: #69567

Chris Angelico <rosuav@gmail.com>:

> Small clarification: The Windows *API* accepts both types of slash
> (you can open a file using forward slashes, for instance), but not all
> Windows *applications* are aware of this (generally only
> cross-platform ones take notice of this), and most Windows *users*
> prefer backslashes. So when you come to display a Windows path, you
> may want to convert to backslashes. But that's for display.

Didn't know that. More importantly, I had thought forward slashes were
valid file basename characters, but Windows is surprisingly strict about
that:

   < > : " / \ | ? * NUL

are not allowed in basenames. Unix/linux disallows only:

  / NUL

In fact, proper dealing with punctuation in pathnames is one of the main
reasons to migrate to Python from bash. Even if it is often possible to
write bash scripts that handle arbitrary pathnames correctly, few script
writers are pedantic enough to do it properly. For example, newlines in
filenames are bound to confuse 99.9% of bash scripts.
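
As a rough illustration of the two character lists above, here is a small Python sketch. The names WINDOWS_FORBIDDEN, POSIX_FORBIDDEN and forbidden_chars are made up for this example, and it models only the characters listed in this post; Windows has further rules (reserved device names, control characters) that are not covered here.

```python
# Characters listed in the post as forbidden in Windows basenames.
WINDOWS_FORBIDDEN = set('<>:"/\\|?*\0')
# Unix/Linux forbids only the separator and NUL.
POSIX_FORBIDDEN = set('/\0')


def forbidden_chars(basename, forbidden):
    """Return the sorted forbidden characters that occur in `basename`."""
    return sorted(set(basename) & forbidden)


name = "notes: draft?\n.txt"
print(forbidden_chars(name, WINDOWS_FORBIDDEN))  # [':', '?']
print(forbidden_chars(name, POSIX_FORBIDDEN))    # []
```

Note that the embedded newline is perfectly legal under the Unix rule, which is exactly the case that trips up shell scripts.
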


Marko



#69584

From: Peter Otten <__peter__@web.de>
Date: 2014-04-03 15:09 +0200
Message-ID: <mailman.8837.1396533381.18130.python-list@python.org>
In reply to: #69573

Marko Rauhamaa wrote:

> Chris Angelico <rosuav@gmail.com>:
> 
>> Small clarification: The Windows *API* accepts both types of slash
>> (you can open a file using forward slashes, for instance), but not all
>> Windows *applications* are aware of this (generally only
>> cross-platform ones take notice of this), and most Windows *users*
>> prefer backslashes. So when you come to display a Windows path, you
>> may want to convert to backslashes. But that's for display.
> 
> Didn't know that. More importantly, I had thought forward slashes were
> valid file basename characters, but Windows is surprisingly strict about
> that:
> 
>    < > : " / \ | ? * NUL
> 
> are not allowed in basenames. Unix/linux disallows only:
> 
>   / NUL
> 
> In fact, proper dealing with punctuation in pathnames is one of the main
> reasons to migrate to Python from bash. Even if it is often possible to
> write bash scripts that handle arbitrary pathnames correctly, few script
> writers are pedantic enough to do it properly. For example, newlines in
> filenames are bound to confuse 99.9% of bash scripts.

That doesn't bother me much as 99.8% of all bash scripts are already 
confused by ordinary space chars ;)



#69585

From: random832@fastmail.us
Date: 2014-04-03 09:57 -0400
Message-ID: <mailman.8838.1396533473.18130.python-list@python.org>
In reply to: #69573

On Thu, Apr 3, 2014, at 5:00, Marko Rauhamaa wrote:
> In fact, proper dealing with punctuation in pathnames is one of the main
> reasons to migrate to Python from bash. Even if it is often possible to
> write bash scripts that handle arbitrary pathnames correctly, few script
> writers are pedantic enough to do it properly. For example, newlines in
> filenames are bound to confuse 99.9% of bash scripts.

Incidentally, these rules mean there are different norms about how
command line arguments are parsed on windows. Since * and ? are not
allowed in filenames, you don't have to care whether they were quoted.
An argument [in a position where a list of filenames is expected] with *
or ? in it _always_ gets globbed, so "C:\dir with spaces\*.txt" can be
used. This is part of the reason the program is responsible for globbing
rather than the shell - because only the program knows if it expects a
list of filenames in that position vs a text string for some other
purpose.

This is unfortunate, because it means that most python programs do not
handle filename patterns at all (expecting the shell to do it for them)
- it would be nice if there was a cross-platform way to do this.

Native windows wildcards are also weird in a number of ways not emulated
by the glob module. Most of these are not expected by users, but some
users may expect, for example, *.htm to match files ending in .html; *.*
to match files with no dot in them, and *. to match _only_ files with no
dot in them. The latter two are guaranteed by the windows API, the first
is merely common due to default shortname settings. Also, native windows
wildcards do not support [character classes].
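
A Python program that wants to accept patterns on Windows can expand them itself. The following is only a sketch using the standard glob module; the helper name expand_args is hypothetical, and the matching follows glob's Unix-style rules rather than the native Windows wildcard behaviour described above.

```python
import glob


def expand_args(args):
    """Expand wildcard arguments the way a Unix shell would have."""
    expanded = []
    for arg in args:
        if any(ch in arg for ch in "*?["):
            matches = sorted(glob.glob(arg))
            # Like most shells, keep the pattern itself if nothing matches.
            expanded.extend(matches or [arg])
        else:
            expanded.append(arg)
    return expanded
```

A script would typically call it as expand_args(sys.argv[1:]). On a Unix shell the arguments have usually been expanded already, so a second pass like this is mostly harmless but could re-expand a file whose name literally contains a wildcard character.
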



#69589

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-04 01:17 +1100
Message-ID: <mailman.8842.1396534649.18130.python-list@python.org>
In reply to: #69573

On Fri, Apr 4, 2014 at 12:57 AM,  <random832@fastmail.us> wrote:
> An argument [in a position where a list of filenames is expected] with *
> or ? in it _always_ gets globbed, so "C:\dir with spaces\*.txt" can be
> used. This is part of the reason the program is responsible for globbing
> rather than the shell - because only the program knows if it expects a
> list of filenames in that position vs a text string for some other
> purpose.

Which, I might mention, is part of why the old DOS way (still
applicable under Windows, but I first met it with MS-DOS) of searching
for files was more convenient than it can be with Unix tools. Compare:

-- Get info on all .pyc files in a directory --
C:\>dir some_directory\*.pyc
$ ls -l some_directory/*.pyc

So far, so good.

-- Get info on all .pyc files in a directory and all its subdirectories --
C:\>dir some_directory\*.pyc /s
$ ls -l `find some_directory -name \*.pyc`

Except that the ls version there can't handle names with spaces in
them, so you need to faff around with null termination and stuff. With
bash, you can use 'shopt -s globstar; ls -l **/*.pyc', but that's not a
default-active option (at least, it's not active on any of the systems
I use, but they're all Debians and Ubuntus; it might be active by
default on others), and I suspect a lot of people don't even know it
exists; I know of it, but don't always think of it, and often end up
doing the above flawed version.
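
For comparison, the same recursive search can be sketched in Python 3 with pathlib, where file names are passed around as objects and never re-parsed by a shell, so spaces and newlines in names are not a problem. The helper name find_files is hypothetical.

```python
from pathlib import Path


def find_files(root, pattern):
    """Return every file under `root` (recursively) whose name matches `pattern`."""
    return sorted(p for p in Path(root).rglob(pattern) if p.is_file())
```

Something like `for p in find_files("some_directory", "*.pyc"): print(p.stat().st_size, p)` would then give a rough equivalent of the listing commands above.
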

On the flip side, having the shell handle it does mean you
automatically get this on *any* command. You can go and delete all
those .pyc files by just changing "ls -l" into "rm" or "dir" into
"del", but that's only because del happens to support /s; other DOS
programs may well not.

ChrisA



#69623

From: David <bouncingcats@gmail.com>
Date: 2014-04-04 11:15 +1100
Message-ID: <mailman.8868.1396570565.18130.python-list@python.org>
In reply to: #69573

On 4 April 2014 01:17, Chris Angelico <rosuav@gmail.com> wrote:
>
> -- Get info on all .pyc files in a directory and all its subdirectories --
> C:\>dir some_directory\*.pyc /s
> $ ls -l `find some_directory -name \*.pyc`
>
> Except that the ls version there can't handle names with spaces in
> them, so you need to faff around with null termination and stuff.

Nooo, that stinks! There's no need to abuse 'find' like that, unless
the version you have is truly ancient. Null termination is only
necessary to pass 'find' results *via the shell*. Instead, ask 'find'
to invoke the task itself.

The simplest way is:

    find some_directory -name '*.pyc' -ls

'find' is the tool to use for *finding* things, not 'ls', which is
intended for terminal display of directory information.

If you require a particular feature of 'ls', or any other command, you
can ask 'find' to invoke it directly (not via a shell):

    find some_directory -name '*.pyc' -exec ls -l {} \;

'find' is widely under-utilised and poorly understood because its
command-line syntax is extremely confusing compared to other tools,
and its documentation compounds the confusion. For anyone interested,
I offer these key insights:

Most important to understand is that the -name, -exec and -ls that I
used above (for example) are *not* command-line "options". Even though
they look like command-line options, they aren't. They are part of an
*expression* in 'find' syntax. And the crucial difference is that the
expression is order-dependent. So unlike most other commands, it is a
mistake to put them in arbitrary order.

Also annoyingly, the -exec syntax utilises characters that must be
escaped from shell processing. This is more arcane knowledge that just
frustrates people when they are in a rush to get something done.

In fact, the only command-line *options* that 'find' takes are -H -L
-P -D and -O, but these are rarely used. They come *before* the
directory name(s). Everything that comes after the directory name is
part of a 'find' expression.

But, the most confusing thing of all, in the 'find' documentation,
expressions are composed of tests, actions, and ... options! These
so-called options are expression-options, not command-line-options. No
wonder everyone's confused, when one word describes two
similar-looking but behaviourally different things!

So 'info find' must be read very carefully indeed. But it is
worthwhile, because in the model of "do one thing and do it well",
'find' is the tool intended for such tasks, rather than expecting
these capabilities to be built into all other command line utilities.

I know this is off-topic but because I learn so much from the
countless terrific contributions to this list from Chris (and others)
with wide expertise, I am motivated to give something back when I can.
And given that in the past I spent a little time and effort and
eventually understood this, I summarise it here hoping it helps
someone else. The unix-style tools are far more capable than the
Microsoft shell when used as intended.

There is good documentation on find at: http://mywiki.wooledge.org/UsingFind



#69626

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-04 12:16 +1100
Message-ID: <mailman.8871.1396574228.18130.python-list@python.org>
In reply to: #69573

On Fri, Apr 4, 2014 at 11:15 AM, David <bouncingcats@gmail.com> wrote:
> On 4 April 2014 01:17, Chris Angelico <rosuav@gmail.com> wrote:
>>
>> -- Get info on all .pyc files in a directory and all its subdirectories --
>> C:\>dir some_directory\*.pyc /s
>> $ ls -l `find some_directory -name \*.pyc`
>>
>> Except that the ls version there can't handle names with spaces in
>> them, so you need to faff around with null termination and stuff.
>
> Nooo, that stinks! There's no need to abuse 'find' like that, unless
> the version you have is truly ancient. Null termination is only
> necessary to pass 'find' results *via the shell*. Instead, ask 'find'
> to invoke the task itself.
>
> The simplest way is:
>
>     find some_directory -name '*.pyc' -ls
>
> 'find' is the tool to use for *finding* things, not 'ls', which is
> intended for terminal display of directory information.

I used ls only as a first example, and then picked up an extremely
common next example (deleting files). It so happens that find can
'-delete' its found files, but my point is that on DOS/Windows, every
command has to explicitly support subdirectories. If, instead, the
'find' command has to explicitly support everything you might want to
do to files, that's even worse! So we need an execution form...

> If you require a particular feature of 'ls', or any other command, you
> can ask 'find' to invoke it directly (not via a shell):
>
>     find some_directory -name '*.pyc' -exec ls -l {} \;

... which this looks like, but it's not equivalent. That will execute
'ls -l' once for each file. You can tell, because the columns aren't
aligned; for anything more complicated than simply 'ls -l', you
potentially destroy any chance at bulk operations. No, to be
equivalent it *must* pass all the args to a single invocation of the
program. You need to instead use xargs if you want it to be
equivalent, and it's now getting to be quite an incantation:

find some_directory -name \*.pyc -print0|xargs -0 ls -l

And *that* is equivalent to the original, but it's way *way* longer
and less convenient, which was my point. Plus, it's extremely tempting
to shorten that, because this will almost always work:

find some_directory -name \*.pyc|xargs ls -l

But it'll fail if you have newlines in file names. It'd probably work
every time you try it, and then you'll put that in a script and boom,
it stops working. (That's what I meant by "faffing about with null
termination". You have to go through an extra level of indirection,
making the command fairly unwieldy.)
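
The delimiter problem can be shown in a few lines of plain Python, with no shell involved. This sketch models only the separator issue (real xargs without -0 also splits on blanks and interprets quotes, which is not modelled here); the file names are invented for the example.

```python
names = ["plain.pyc", "with space.pyc", "with\nnewline.pyc"]

# Newline-delimited output, as `find` prints by default: the third name
# is indistinguishable from two separate names once it is split again.
by_newline = "\n".join(names).split("\n")

# NUL-delimited output, as with `-print0`: NUL cannot occur in a file
# name, so the round trip is lossless.
by_nul = "\0".join(names).split("\0")

print(len(by_newline))  # 4
print(by_nul == names)  # True
```
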

> I know this is off-topic but because I learn so much from the
> countless terrific contributions to this list from Chris (and others)
> with wide expertise, I am motivated to give something back when I can.

Definitely! This is how we all learn :) And thank you, glad to hear that.

> And given that in the past I spent a little time and effort and
> eventually understood this, I summarise it here hoping it helps
> someone else. The unix-style tools are far more capable than the
> Microsoft shell when used as intended.

More specifically, the Unix model ("each tool should do one thing and
do it well") tends to make for more combinable tools. The DOS style
requires every program to reimplement the same directory-search
functionality, and then requires the user to figure out how it's been
written in this form ("zip -r" (or is it "zip -R"...), "dir /s", "del
/s", etc, etc). The Unix style requires applications to accept
arbitrary numbers of arguments (which they probably would anyway), and
then requires the user to learn some incantations that will then work
anywhere. If you're writing a script, you should probably use the
-print0|xargs -0 method (unless you already require bash for some
other reason); interactively, you more likely want to enable globstar
and use the much shorter double-star notation. Either way, it works
for any program, and that is indeed "far more capable".

ChrisA



#69629

From: David <bouncingcats@gmail.com>
Date: 2014-04-04 13:02 +1100
Message-ID: <mailman.8872.1396576970.18130.python-list@python.org>
In reply to: #69573

On 4 April 2014 12:16, Chris Angelico <rosuav@gmail.com> wrote:
> On Fri, Apr 4, 2014 at 11:15 AM, David <bouncingcats@gmail.com> wrote:
>> On 4 April 2014 01:17, Chris Angelico <rosuav@gmail.com> wrote:
>>>
>>> -- Get info on all .pyc files in a directory and all its subdirectories --
>>> C:\>dir some_directory\*.pyc /s
>>> $ ls -l `find some_directory -name \*.pyc`
>>>
>>> Except that the ls version there can't handle names with spaces in
>>> them, so you need to faff around with null termination and stuff.
>>
>> Nooo, that stinks! There's no need to abuse 'find' like that, unless
>> the version you have is truly ancient. Null termination is only
>> necessary to pass 'find' results *via the shell*. Instead, ask 'find'
>> to invoke the task itself.
>>
>> The simplest way is:
>>
>>     find some_directory -name '*.pyc' -ls
>>
>> 'find' is the tool to use for *finding* things, not 'ls', which is
>> intended for terminal display of directory information.
>
> I used ls only as a first example, and then picked up an extremely
> common next example (deleting files). It so happens that find can
> '-delete' its found files, but my point is that on DOS/Windows, every
> command has to explicitly support subdirectories. If, instead, the
> 'find' command has to explicitly support everything you might want to
> do to files, that's even worse! So we need an execution form...
>
>> If you require a particular feature of 'ls', or any other command, you
>> can ask 'find' to invoke it directly (not via a shell):
>>
>>     find some_directory -name '*.pyc' -exec ls -l {} \;
>
> ... which this looks like, but it's not equivalent.

> That will execute
> 'ls -l' once for each file. You can tell, because the columns aren't
> aligned; for anything more complicated than simply 'ls -l', you
> potentially destroy any chance at bulk operations.

Thanks for elaborating that point. But still ...

> equivalent it *must* pass all the args to a single invocation of the
> program. You need to instead use xargs if you want it to be
> equivalent, and it's now getting to be quite an incantation:
>
> find some_directory -name \*.pyc -print0|xargs -0 ls -l
>
> And *that* is equivalent to the original, but it's way *way* longer
> and less convenient, which was my point.

If you are not already aware, it might interest you that 'find' in
GNU findutils 4.4.2 has

 -- Action: -execdir command {} +
     This works as for `-execdir command ;', except that the `{}' at
     the end of the command is expanded to a list of names of matching
     files.  This expansion is done in such a way as to avoid exceeding
     the maximum command line length available on the system.  Only one
     `{}' is allowed within the command, and it must appear at the end,
     immediately before the `+'.  A `+' appearing in any position other
     than immediately after `{}' is not considered to be special (that
     is, it does not terminate the command).

I believe that achieves the goal, without involving the shell.

It also has an -exec equivalent that works the same way, but it has an
unrelated security issue and is not recommended.

But if that '+' instead of ';' feature is not available on the
target system, then as far as I am aware it would be necessary
to use xargs as you say.
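
The batching idea behind '{} +' (and xargs) can be sketched in Python as below. This is an illustration of the concept only, not how findutils implements it; batch_args is a hypothetical helper and max_chars is a stand-in for the system's real command-line limit.

```python
def batch_args(names, max_chars):
    """Yield lists of names whose combined length stays within max_chars.

    Each name is charged its length plus one separator character. A single
    name longer than max_chars still gets a batch of its own.
    """
    batch, size = [], 0
    for name in names:
        cost = len(name) + 1
        if batch and size + cost > max_chars:
            yield batch
            batch, size = [], 0
        batch.append(name)
        size += cost
    if batch:
        yield batch


print(list(batch_args(["aa", "bb", "cc"], 6)))  # [['aa', 'bb'], ['cc']]
```

Each yielded batch would then be handed to one invocation of the command, which is what keeps the 'ls -l' columns aligned within a batch.
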

Anyway, the two points I wished to contribute are:

1) It is preferable to avoid shell command substitutions (the
backticks in the first example) and expansions where possible.

2) My observations on 'find' syntax, for anyone interested.

Cheers,
David



#69603

From: Terry Reedy <tjreedy@udel.edu>
Date: 2014-04-03 14:41 -0400
Message-ID: <mailman.8851.1396550508.18130.python-list@python.org>
In reply to: #69565

On 4/2/2014 11:10 PM, Chris Angelico wrote:
> On Thu, Apr 3, 2014 at 1:37 PM, Steven D'Aprano <steve@pearwood.info> wrote:
>> Windows accepts both forward and backslashes in file names.
>
> Small clarification: The Windows *API* accepts both types of slash

To me, that is what Steven said.

> (you can open a file using forward slashes, for instance), but not all
> Windows *applications* are aware of this (generally only
> cross-platform ones take notice of this), and most Windows *users*
> prefer backslashes.

Do you have a source for that?

> So when you come to display a Windows path, you
> may want to convert to backslashes. But that's for display.

-- 
Terry Jan Reedy



#69614

From: Chris Angelico <rosuav@gmail.com>
Date: 2014-04-04 09:06 +1100
Message-ID: <mailman.8859.1396562772.18130.python-list@python.org>
In reply to: #69565

On Fri, Apr 4, 2014 at 5:41 AM, Terry Reedy <tjreedy@udel.edu> wrote:
> On 4/2/2014 11:10 PM, Chris Angelico wrote:
>>
>> On Thu, Apr 3, 2014 at 1:37 PM, Steven D'Aprano <steve@pearwood.info>
>> wrote:
>>>
>>> Windows accepts both forward and backslashes in file names.
>>
>>
>> Small clarification: The Windows *API* accepts both types of slash
>
>
> To me, that is what Steven said.

Yes, which is why I said "clarification" not "correction".

>> (you can open a file using forward slashes, for instance), but not all
>> Windows *applications* are aware of this (generally only
>> cross-platform ones take notice of this), and most Windows *users*
>> prefer backslashes.
>
>
> Do you have a source for that?

Hardly need one for the first point - it's proven by a single Windows
application that parses a path name by dividing it on backslashes.
Even if there isn't one today, I could simply write one, and prove my
own point trivially (albeit not usefully). Anything that simply passes
its arguments to an API (eg it just opens the file) won't need to take
notice of slash type, but otherwise, it's very *VERY* common for a
Windows program to assume that it can split paths manually.

The second point would be better sourced, yes, but all I can say is
that I've written programs that use and display slashes, and had
non-programmers express surprise at it; similarly when you see certain
programs that take one part of a path literally, and then build on it
with either type of slash, like zip and unzip - if you say "zip -r
C:\Foo\Bar", it'll tell you that it's archiving
C:\Foo\Bar/Quux/Asdf.txt and so on. Definitely inspires surprise in
non-programmers.
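
To make the "split paths manually" failure concrete, here is a small Python 3 sketch comparing a naive backslash split with pathlib.PureWindowsPath on the mixed-slash path from the zip example. The variable names are arbitrary.

```python
from pathlib import PureWindowsPath

mixed = r"C:\Foo\Bar/Quux/Asdf.txt"

# A program that assumes backslashes only sees three components.
naive = mixed.split("\\")
print(naive)   # ['C:', 'Foo', 'Bar/Quux/Asdf.txt']

# A parser that knows both separators recovers all five.
parts = PureWindowsPath(mixed).parts
print(parts)   # ('C:\\', 'Foo', 'Bar', 'Quux', 'Asdf.txt')
```
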

ChrisA



#69648

From: Lele Gaifax <lele@metapensiero.it>
Date: 2014-04-04 09:07 +0200
Message-ID: <mailman.8880.1396595277.18130.python-list@python.org>
In reply to: #69565

Steven D'Aprano <steve@pearwood.info> writes:

> When working with Windows paths, you should make a habit of either 
> escaping every backslash:
>
>     u"c:\\automation_common\\Python\\TestCases\\list_dir_script.txt"
>
> using a raw-string:
>
>     ur"c:\automation_common\Python\TestCases\list_dir_script.txt"
>
> or just use forward slashes:
>
>     u"c:/automation_common/Python/TestCases/list_dir_script.txt"

The latter should be preferred, in case Python3 compatibility is a goal.
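
The compatibility concern can be checked directly: Python 3 does not accept the ur prefix at all, while the other two spellings compile on Python 3.3 or later. A small sketch, where the helper name compiles is hypothetical and the paths are shortened:

```python
def compiles(source):
    """Return True if `source` is a valid expression for the running Python."""
    try:
        compile(source, "<literal>", "eval")
    except SyntaxError:
        return False
    return True


# Each source string below is the text of one literal, shortened.
print(compiles(r'u"c:\\automation_common"'))  # escaped backslashes
print(compiles(r'u"c:/automation_common"'))   # forward slashes
print(compiles(r'ur"c:\automation_common"'))  # raw + unicode prefix
```

On Python 3.3 or later this should print True, True, False; on Python 2.7 all three are accepted.
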

ciao, lele.
-- 
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@metapensiero.it  |                 -- Fortunato Depero, 1929.



#69572

From: alister <alister.nospam.ware@ntlworld.com>
Date: 2014-04-03 08:35 +0000
Message-ID: <u39%u.262024$cZ2.232691@fx22.am4>
In reply to: #69559

On Wed, 02 Apr 2014 16:27:04 -0700, Steve wrote:

> Hi All,
> 
> I'm in need of some encoding/decoding help for a situation for a Windows
> Path that contains Unicode characters in it.
> 
> ---- CODE ----
> 
> import os.path
> import codecs
> import sys
> 
> All_Tests =
> [u"c:\automation_common\Python\TestCases\list_dir_script.txt"]
> 
> 
> for curr_test in All_Tests:
>   print("\n raw : " + repr(curr_test) + "\n")
>   print("\n encode : %s \n\n" ) %  os.path.normpath(codecs.encode(curr_test, "ascii"))
>   print("\n decode : %s \n\n" ) %  curr_test.decode('string_escape')
> 
> ---- CODE ----
> 
> 
> Screen Output :
> 
>  raw : u'c:\x07utomation_common\\Python\\TestCases\\list_dir_script.txt'
> 
>  encode : c:utomation_common\Python\TestCases\list_dir_script.txt
> 
>  decode : c:utomation_common\Python\TestCases\list_dir_script.txt
> 
> 
> My goal is to have the properly formatting path in the output :
> 
> 
> c:\automation_common\Python\TestCases\list_dir_script.txt
> 
> 
> What is the "magic" encode/decode sequence here??
> 
> Thanks!
> 
> Steve

You have already imported os.path; it contains a number of functions
that make this task easier*

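import os.path

root = os.path.sep      # '\\' on Windows, '/' on POSIX
drive = 'c:'
# Components only: no hand-written separators, so no backslash escapes
path = [root,
        'automation_common', 'Python',
        'TestCases', 'list_dir_script.txt']
all_tests = os.path.join(drive, *path)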

This works happily even on Linux (the unnecessary drive letter simply
gets stripped).

*easier in the sense of maintaining cross-platform compatibility
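
Both behaviours can be observed on a single machine, because the standard library ships the Windows and POSIX flavours of os.path as ntpath and posixpath. A short Python 3 sketch, using the path from the original question:

```python
import ntpath
import posixpath

parts = ["automation_common", "Python", "TestCases", "list_dir_script.txt"]

# What os.path.join does on Windows: the drive is kept.
windows = ntpath.join("c:", ntpath.sep, *parts)
# What os.path.join does on Linux: the absolute "/" discards "c:".
posix = posixpath.join("c:", posixpath.sep, *parts)

print(windows)  # c:\automation_common\Python\TestCases\list_dir_script.txt
print(posix)    # /automation_common/Python/TestCases/list_dir_script.txt
```
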


