| Started by | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| First post | 2015-12-14 16:24 +0000 |
| Last post | 2015-12-18 01:37 -0500 |
| Articles | 19 — 8 participants |
cannot open file with non-ASCII filename Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-12-14 16:24 +0000
Re: cannot open file with non-ASCII filename Terry Reedy <tjreedy@udel.edu> - 2015-12-14 13:34 -0500
Re: cannot open file with non-ASCII filename wxjmfauth@gmail.com - 2015-12-14 11:07 -0800
Re: cannot open file with non-ASCII filename eryk sun <eryksun@gmail.com> - 2015-12-14 12:45 -0600
Re: cannot open file with non-ASCII filename Laura Creighton <lac@openend.se> - 2015-12-14 19:51 +0100
Re: cannot open file with non-ASCII filename Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-12-14 22:11 +0000
Re: cannot open file with non-ASCII filename Thomas 'PointedEars' Lahn <PointedEars@web.de> - 2015-12-14 23:41 +0100
Re: cannot open file with non-ASCII filename Laura Creighton <lac@openend.se> - 2015-12-15 01:07 +0100
Re: cannot open file with non-ASCII filename eryk sun <eryksun@gmail.com> - 2015-12-14 21:20 -0600
Re: cannot open file with non-ASCII filename eryk sun <eryksun@gmail.com> - 2015-12-14 17:55 -0600
Re: cannot open file with non-ASCII filename Laura Creighton <lac@openend.se> - 2015-12-15 01:13 +0100
Re: cannot open file with non-ASCII filename Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-12-15 08:26 +0000
Re: cannot open file with non-ASCII filename Laura Creighton <lac@openend.se> - 2015-12-15 15:09 +0100
Re: cannot open file with non-ASCII filename eryk sun <eryksun@gmail.com> - 2015-12-15 09:34 -0600
Re: cannot open file with non-ASCII filename Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-12-15 17:04 +0000
Re: cannot open file with non-ASCII filename eryk sun <eryksun@gmail.com> - 2015-12-16 21:39 -0600
Re: cannot open file with non-ASCII filename smap <askme.first@thankyouverymuch.invalid> - 2015-12-18 21:15 +0000
cannot open file with non-ASCII filename bearmingo <bearmingo@gmail.com> - 2015-12-17 21:12 -0800
Re: cannot open file with non-ASCII filename Terry Reedy <tjreedy@udel.edu> - 2015-12-18 01:37 -0500
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-12-14 16:24 +0000 |
| Subject | cannot open file with non-ASCII filename |
| Message-ID | <n4mqgm$v1d$1@news2.informatik.uni-stuttgart.de> |
With Python 2.7.11 on Windows 7 my users cannot open/read files with
non-ASCII filenames. They use the Windows explorer to drag&drop files into
a console window running the Python program.
os.path.exists() does not detect such a file and an open() fails, too.
My code:
    print("\nDrag&drop files or directories into this window.")
    system('explorer "%s"' % HOME)
    file = get_paste()
    if not(os.path.exists(file)): die('"%s" does not exist' % file)

    def get_paste():
        import msvcrt
        while True:
            c = msvcrt.getch()
            if c == '\t': return ''
            if c == '\003' or c == '\004': return None
            if not (c == '\n' or c == '\r'): break
        paste = c
        while msvcrt.kbhit():
            c = msvcrt.getch()
            if c == '\n' or c == '\r': break
            paste += c
        if match(r'\s',paste): paste = subst('^"(.+)"$',r'\1',paste)
        return paste
--
Ullrich Horlacher Server und Virtualisierung
Rechenzentrum IZUS/TIK E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart Tel: ++49-711-68565868
Allmandring 30a Fax: ++49-711-682357
70550 Stuttgart (Germany) WWW: http://www.tik.uni-stuttgart.de/
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-12-14 13:34 -0500 |
| Message-ID | <mailman.12.1450118127.14916.python-list@python.org> |
| In reply to | #100414 |
On 12/14/2015 11:24 AM, Ulli Horlacher wrote:
> With Python 2.7.11 on Windows 7 my users cannot open/read files with
> non-ASCII filenames.

Right. They should either restrict themselves to ascii (or possibly latin-1) filenames or use current 3.x. This is one of the (known) unicode problems fixed in 3.x by making unicode the core text class, replacing the implementation of unicode, and performing further work with the new implementation.

--
Terry Jan Reedy
| From | wxjmfauth@gmail.com |
|---|---|
| Date | 2015-12-14 11:07 -0800 |
| Message-ID | <f5a442d3-a17e-4f67-b866-5d77605c4462@googlegroups.com> |
| In reply to | #100417 |
On Monday, 14 December 2015 at 19:35:49 UTC+1, Terry Reedy wrote:
> On 12/14/2015 11:24 AM, Ulli Horlacher wrote:
> > With Python 2.7.11 on Windows 7 my users cannot open/read files with
> > non-ASCII filenames.
>
> Right. They should either restrict themselves to ascii (or possibly
> latin-1) filenames or use current 3.x. This is one of the (known)
> unicode problems fixed in 3.x by making unicode the core text class,
> replacing the implementation of unicode, and performing further work
> with the new implementation.
>
> --
> Terry Jan Reedy

Sorry, but no.

    >>> sys.version
    '2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]'
    >>> with open(r'd:\éüÄñoe.txt', 'r') as f:
    ...     r = f.read()
    ...
    >>> print r, len(r)
    éabcéoe EURO z 9
    >>>
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-14 12:45 -0600 |
| Message-ID | <mailman.14.1450118773.14916.python-list@python.org> |
| In reply to | #100414 |
On Mon, Dec 14, 2015 at 10:24 AM, Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:
> With Python 2.7.11 on Windows 7 my users cannot open/read files with
> non-ASCII filenames.
[...]
> c = msvcrt.getch()

This isn't an issue with Python per se, and the same problem exists in Python 3, using either getch or getwch. Microsoft's getwch function isn't designed to handle the variety of ways the console host (conhost.exe) encodes Unicode keyboard events. Their implementation calls ReadConsoleInput and looks for a KEY_EVENT. If bKeyDown is set it grabs the UnicodeChar field. In an ideal world it would be that simple.

However, the console literally supports the alt+numpad sequences that allow entering characters by code. So the input event sequence, for example, could be +VK_MENU, +VK_NUMPAD7, -VK_NUMPAD7, +VK_NUMPAD6, -VK_NUMPAD6, -VK_MENU, which is an "L". (Denoting "+" as key down and "-" as key up.) This may just be the closest approximation in the system locale's codepage (ANSI). That doesn't matter because the actual Unicode codepoint is set in the last event's UnicodeChar field.

Try using the pyreadline module. IIRC, it does a better job decoding the events from ReadConsoleInput.
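Editor's sketch of the decoding rule described above, not code from the thread. It assumes the console events have already been read (for example with ReadConsoleInputW) and reduced to hypothetical (key_down, virtual_key, code_unit) tuples, and that the numpad events inside an alt+numpad sequence carry a zero UnicodeChar. The function name decode_key_events is made up for illustration.

```python
import struct

VK_MENU = 0x12  # virtual-key code of the Alt key


def decode_key_events(events):
    """Turn (key_down, virtual_key, code_unit) tuples into text.

    Ordinary characters are taken from key-down events. For an alt+numpad
    sequence the character is taken from the key-up event of VK_MENU, which
    is where the post above says the real code point is stored.
    """
    units = []
    for key_down, virtual_key, code_unit in events:
        if code_unit == 0:
            continue  # no character attached to this event
        if key_down and virtual_key != VK_MENU:
            units.append(code_unit)  # ordinary key press
        elif not key_down and virtual_key == VK_MENU:
            units.append(code_unit)  # Alt released after a numpad sequence
    # The units are UTF-16 code units, so decode them together; this also
    # joins surrogate pairs correctly.
    return struct.pack("<%dH" % len(units), *units).decode("utf-16-le")


# The "+VK_MENU, +VK_NUMPAD7, ..., -VK_MENU" sequence from the post.
alt_76 = [
    (True, 0x12, 0), (True, 0x67, 0), (False, 0x67, 0),
    (True, 0x66, 0), (False, 0x66, 0), (False, 0x12, 0x4C),
]
print(decode_key_events(alt_76))
```

A reader that only looks at key-down events would drop every character in that sequence, which matches the failure reported for getch and getwch.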
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-12-14 19:51 +0100 |
| Message-ID | <mailman.15.1450119109.14916.python-list@python.org> |
| In reply to | #100414 |
In a message of Mon, 14 Dec 2015 13:34:56 -0500, Terry Reedy writes:
>On 12/14/2015 11:24 AM, Ulli Horlacher wrote:
>> With Python 2.7.11 on Windows 7 my users cannot open/read files with
>> non-ASCII filenames.
>
>Right. They should either restrict themselves to ascii (or possibly
>latin-1) filenames or use current 3.x. This is one of the (known)
>unicode problems fixed in 3.x by making unicode the core text class,
>replacing the implementation of unicode, and performing further work
>with the new implementation.
>
>--
>Terry Jan Reedy

Given that Ulli is in Germany, latin-1 is likely to work fine for him.

And you do it like this:

    # -*- coding: latin-1 -*-
    from Tkinter import *
    root = Tk()
    s = 'Välkommen till Göteborg' # Welcome to Gothenburg (where I live)
    u = unicode(s, 'iso8859-1')
    Label(root, text=u).pack()
    root.mainloop()

Laura
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-12-14 22:11 +0000 |
| Message-ID | <n4nepp$4nb$1@news2.informatik.uni-stuttgart.de> |
| In reply to | #100420 |
Laura Creighton <lac@openend.se> wrote:

> Given that Ulli is in Germany, latin-1 is likely to work fine for him.

For me, but not for my users. We have people from about 100 nations at our university.

> And you do it like this:
>
> # -*- coding: latin-1 -*-
> from Tkinter import *
> root = Tk()
> s = 'Välkommen till Göteborg' # Welcome to Gothenburg (where I live)
> u = unicode(s, 'iso8859-1')
> Label(root, text=u).pack()

The problem is the input of these filenames.
| From | Thomas 'PointedEars' Lahn <PointedEars@web.de> |
|---|---|
| Date | 2015-12-14 23:41 +0100 |
| Message-ID | <4412672.gUIyRlH8Kf@PointedEars.de> |
| In reply to | #100431 |
Ulli Horlacher wrote:
> Laura Creighton <lac@openend.se> wrote:
>> Given that Ulli is in Germany, latin-1 is likely to work fine for him.
>
> For me, but not for my users. We have people from about 100 nations at our
> university.
> […]
> The problem is the input of these filenames.

Why do you have to use msvcrt?

I would use curses for user input, but:

,-<https://docs.python.org/2/howto/curses.html?highlight=user%20input>
,-<https://docs.python.org/3.2/howto/curses.html?highlight=user%20input>
|
| No one has made a Windows port of the curses module. On a Windows
| platform, try the Console module written by Fredrik Lundh. The Console
| module provides cursor-addressable text output, plus full support for
| mouse and keyboard input, and is available from
| http://effbot.org/zone/console-index.htm.

So you should try that instead.

--
PointedEars

Twitter: @PointedEars2
Please do not cc me.
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-12-15 01:07 +0100 |
| Message-ID | <mailman.7.1450138083.22044.python-list@python.org> |
| In reply to | #100432 |
In a message of Mon, 14 Dec 2015 23:41:21 +0100, "Thomas 'PointedEars' Lahn" writes:
>Why do you have to use msvcrt?
>
>I would use curses for user input, but:
>
>,-<https://docs.python.org/2/howto/curses.html?highlight=user%20input>
>,-<https://docs.python.org/3.2/howto/curses.html?highlight=user%20input>
>|
>| No one has made a Windows port of the curses module. On a Windows
>| platform, try the Console module written by Fredrik Lundh. The Console
>| module provides cursor-addressable text output, plus full support for
>| mouse and keyboard input, and is available from
>| http://effbot.org/zone/console-index.htm.
>
>So you should try that instead.

If going for curses, I'd try this instead:
http://pdcurses.sourceforge.net/

Laura
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-14 21:20 -0600 |
| Message-ID | <mailman.10.1450149661.22044.python-list@python.org> |
| In reply to | #100432 |
On Mon, Dec 14, 2015 at 6:07 PM, Laura Creighton <lac@openend.se> wrote:
> In a message of Mon, 14 Dec 2015 23:41:21 +0100, "Thomas 'PointedEars' Lahn" writes:
>
>>Why do you have to use msvcrt?
>>
>>I would use curses for user input, but:
>>
>>,-<https://docs.python.org/2/howto/curses.html?highlight=user%20input>
>>,-<https://docs.python.org/3.2/howto/curses.html?highlight=user%20input>
>>|
>>| No one has made a Windows port of the curses module. On a Windows
>>| platform, try the Console module written by Fredrik Lundh. The Console
>>| module provides cursor-addressable text output, plus full support for
>>| mouse and keyboard input, and is available from
>>| http://effbot.org/zone/console-index.htm.
>>
>>So you should try that instead.
>
> If going for curses, I'd try this instead:
> http://pdcurses.sourceforge.net/

Christoph Gohlke has an extension module based on PDCurses [1]. The good news for Python 3 users is that it uses the [W]ide-character console API, such as ReadConsoleInputW. Also, its _get_key_count [2] function is designed to support the alt numpad event sequences that the system creates for the input filepath when dragging a file into the console. In my limited testing, dragging filepaths from Explorer worked without a hitch using a random Latin-1 name "¨°¸ÀÈÐØàèðø" and a Latin Extended-B name "ƠƨưƸǀLjǐǘǠǨǰǸ".

Unfortunately the Python 2.7 version is linked against the [A]NSI API, which maps each Unicode character to either the closest matching character in the console's codepage or "?". Moreover the PDCurses code has a bug in narrow builds in that it returns the UnicodeChar from the KEY_EVENT_RECORD [3] instead of the AsciiChar (the name is a misnomer). In this case the high byte is junk. You can mask it out using a bitwise & with 0xFF.

That said, IIRC, the OP wants to avoid using any frameworks such as curses or a GUI toolkit.

[1]: http://www.lfd.uci.edu/~gohlke/pythonlibs/#curses
[2]: https://github.com/wmcbrine/PDCurses/blob/PDCurses_3_4/win32/pdckbd.c#L259
[3]: https://msdn.microsoft.com/en-us/library/ms684166
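Editor's sketch of the masking workaround mentioned above, not code from the thread. The helper name narrow_key_to_text and the default codepage "cp1252" are assumptions for illustration; the right codepage is whatever the console actually uses (for example the value reported by the Windows GetConsoleCP function).

```python
def narrow_key_to_text(key, codepage="cp1252"):
    """Drop the junk high byte of a key value and decode the low byte.

    `key` stands for the value returned by the narrow build described
    above; only its low 8 bits are assumed to be meaningful.
    """
    byte_value = key & 0xFF  # keep the low byte, discard the high byte
    return bytes(bytearray([byte_value])).decode(codepage)


# 0x12E9 has junk 0x12 in the high byte and 0xE9 in the low byte.
print(repr(narrow_key_to_text(0x12E9)))
```

Characters that the codepage cannot represent have already been replaced by "?" before this point, so the mask only repairs the junk byte, not the lossy mapping.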
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-14 17:55 -0600 |
| Message-ID | <mailman.6.1450137352.22044.python-list@python.org> |
| In reply to | #100414 |
On Mon, Dec 14, 2015 at 4:17 PM, Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:
>
> ImportError: No module named pyreadline
>
> Is it a python 3.x module?
>
> I am limited to Python 2.7

pyreadline is available for 2.7-3.5 on PyPI. Anyway, I tried it to no avail. When dropping a file path into the console it ignores the alt-numpad sequences that get queued for non-ASCII characters, just like msvcrt.getwch.

If you decide to roll your own getwch via ctypes or PyWin32, I suggest starting a new topic on the ctypes list or Windows list.
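Editor's sketch of what a ctypes starting point for such a reader might look like, not code from the thread. The structure layouts follow the documented KEY_EVENT_RECORD and INPUT_RECORD definitions; the function name read_key_events and the tuple shape it returns are assumptions. The reading function can only be called on Windows and has not been tested against drag-and-drop input.

```python
import ctypes

KEY_EVENT = 0x0001             # INPUT_RECORD.EventType for keyboard events
STD_INPUT_HANDLE = 0xFFFFFFF6  # (DWORD)-10, the console input handle id


class KEY_EVENT_RECORD(ctypes.Structure):
    _fields_ = [
        ("bKeyDown", ctypes.c_int),          # BOOL, nonzero on key press
        ("wRepeatCount", ctypes.c_ushort),
        ("wVirtualKeyCode", ctypes.c_ushort),
        ("wVirtualScanCode", ctypes.c_ushort),
        ("UnicodeChar", ctypes.c_ushort),    # wide member of the uChar union
        ("dwControlKeyState", ctypes.c_uint),
    ]


class _EVENT_UNION(ctypes.Union):
    # Only key events are decoded here; the padding member keeps the union
    # 16 bytes wide regardless of which event type Windows stores in it.
    _fields_ = [("KeyEvent", KEY_EVENT_RECORD), ("_pad", ctypes.c_byte * 16)]


class INPUT_RECORD(ctypes.Structure):
    _fields_ = [("EventType", ctypes.c_ushort), ("Event", _EVENT_UNION)]


def read_key_events(max_events=64):
    """Block until console input arrives, then return the key events as
    (key_down, virtual_key, code_unit) tuples. Windows only."""
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [ctypes.c_uint]
    kernel32.GetStdHandle.restype = ctypes.c_void_p
    kernel32.ReadConsoleInputW.argtypes = [
        ctypes.c_void_p, ctypes.POINTER(INPUT_RECORD),
        ctypes.c_uint, ctypes.POINTER(ctypes.c_uint)]
    kernel32.ReadConsoleInputW.restype = ctypes.c_int

    handle = kernel32.GetStdHandle(STD_INPUT_HANDLE)
    records = (INPUT_RECORD * max_events)()
    count = ctypes.c_uint(0)
    if not kernel32.ReadConsoleInputW(handle, records, max_events,
                                      ctypes.byref(count)):
        raise ctypes.WinError(ctypes.get_last_error())

    events = []
    for record in records[:count.value]:
        if record.EventType == KEY_EVENT:  # skip mouse, resize, focus events
            key = record.Event.KeyEvent
            events.append((bool(key.bKeyDown), key.wVirtualKeyCode,
                           key.UnicodeChar))
    return events
```

The raw tuples keep both key-down and key-up events on purpose, since the alt+numpad sequences discussed in this thread put the character on a key-up event.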
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-12-15 01:13 +0100 |
| Message-ID | <mailman.8.1450138408.22044.python-list@python.org> |
| In reply to | #100414 |
In a message of Mon, 14 Dec 2015 17:55:04 -0600, eryk sun writes:
>On Mon, Dec 14, 2015 at 4:17 PM, Ulli Horlacher
><framstag@rus.uni-stuttgart.de> wrote:
>>
>> ImportError: No module named pyreadline
>>
>> Is it a python 3.x module?
>>
>> I am limited to Python 2.7
>
>pyreadline is available for 2.7-3.5 on PyPI. Anyway, I tried it to no
>avail. When dropping a file path into the console it ignores the
>alt-numpad sequences that get queued for non-ASCII characters, just
>like msvcrt.getwch. If you decide to roll your own getwch via ctypes or
>PyWin32, I suggest starting a new topic on the ctypes list or Windows
>list.

PyPy wrote its own pyreadline.
You can get it here. https://bitbucket.org/pypy/pyrepl

And see if it works any better.

Laura
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-12-15 08:26 +0000 |
| Message-ID | <n4oirt$eir$1@news2.informatik.uni-stuttgart.de> |
| In reply to | #100442 |
Laura Creighton <lac@openend.se> wrote:

> PyPy wrote its own pyreadline.
> You can get it here. https://bitbucket.org/pypy/pyrepl

As far as I can see, it has no getkey function.
My users do not hit ENTER after drag&drop or copy&paste files.
I need an input function with a timeout.
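Editor's sketch of an input function that ends on a timeout instead of ENTER, not code from the thread. The name read_burst and its parameters are assumptions; the key-polling functions are passed in so that the logic can be tried without a Windows console. On Windows one would presumably pass msvcrt.kbhit and msvcrt.getwch.

```python
import time


def read_burst(kbhit, getch, idle_timeout=0.1,
               clock=time.time, sleep=time.sleep):
    """Wait for the first key, then collect keys until none has arrived
    for idle_timeout seconds, and return them joined as one string."""
    chars = []
    deadline = None  # no deadline until the first key has been seen
    while True:
        if kbhit():
            chars.append(getch())
            deadline = clock() + idle_timeout  # restart the idle timer
        elif deadline is not None and clock() >= deadline:
            break  # input went quiet, treat the burst as complete
        else:
            sleep(0.01)  # avoid spinning at full CPU while idle
    return "".join(chars)


# Simulated console: a queue of pending keys and a fake clock.
pending = list("C:\\x.txt")
fake_now = [0.0]


def fake_sleep(seconds):
    fake_now[0] += seconds


result = read_burst(lambda: bool(pending), lambda: pending.pop(0),
                    idle_timeout=0.05, clock=lambda: fake_now[0],
                    sleep=fake_sleep)
print(result)
```

This only addresses the missing ENTER. It does not change how non-ASCII characters are delivered, so the alt+numpad problem described elsewhere in the thread remains.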
| From | Laura Creighton <lac@openend.se> |
|---|---|
| Date | 2015-12-15 15:09 +0100 |
| Message-ID | <mailman.21.1450188573.22044.python-list@python.org> |
| In reply to | #100449 |
In a message of Tue, 15 Dec 2015 08:26:37 +0000, Ulli Horlacher writes:
>Laura Creighton <lac@openend.se> wrote:
>
>> PyPy wrote its own pyreadline.
>> You can get it here. https://bitbucket.org/pypy/pyrepl
>
>As far as I can see, it has no getkey function.
>My users do not hit ENTER after drag&drop or copy&paste files.
>I need an input function with a timeout.

Right, then this isn't going to work. Sorry about that.

Laura
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-15 09:34 -0600 |
| Message-ID | <mailman.25.1450193701.22044.python-list@python.org> |
| In reply to | #100449 |
On Tue, Dec 15, 2015 at 2:26 AM, Ulli Horlacher <framstag@rus.uni-stuttgart.de> wrote:
> Laura Creighton <lac@openend.se> wrote:
>
>> PyPy wrote its own pyreadline.
>> You can get it here. https://bitbucket.org/pypy/pyrepl
>
> As far as I can see, it has no getkey function.
> My users do not hit ENTER after drag&drop or copy&paste files.
> I need an input function with a timeout.

pyreadline looked promising for its extensive ctypes implementation of the Windows console API [1], wrapped by high-level methods such as peek, getchar, and getkeypress. It turns out it ignores the event sequences you need for alt+numpad input (used when a file is dragged into the console). You'd have to modify its console and keysyms modules to make it work. It would be a useful enhancement, so probably your patches would be accepted upstream.

AFAICT, pyrepl has no Windows support. Check the TODO [2]:

> + port to windows

[1]: https://github.com/pyreadline/pyreadline/blob/master/pyreadline/console/console.py
[2]: https://bitbucket.org/pypy/pyrepl/src/62f2256014af7b74b97c00827f1a7789e00dd814/TODO?at=v0.8.4
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-12-15 17:04 +0000 |
| Message-ID | <n4ph7n$mc7$1@news2.informatik.uni-stuttgart.de> |
| In reply to | #100464 |
eryk sun <eryksun@gmail.com> wrote:

> pyreadline looked promising for its extensive ctypes implementation of
> the Windows console API [1], wrapped by high-level methods such as
> peek, getchar, and getkeypress. It turns out it ignores the event
> sequences you need for alt+numpad input (used when a file is dragged
> into the console). You'd have to modify its console and keysyms
> modules to make it work. It would be a useful enhancement, so probably
> your patches would be accepted upstream.

Ehhh... I started Python programming some weeks ago and I know nearly nothing about Windows. I am a UNIX and VMS guy :-)

I am far away from delivering patches for Windows system programming.
| From | eryk sun <eryksun@gmail.com> |
|---|---|
| Date | 2015-12-16 21:39 -0600 |
| Message-ID | <mailman.31.1450323631.30845.python-list@python.org> |
| In reply to | #100474 |
On Tue, Dec 15, 2015 at 11:04 AM, Ulli Horlacher
<framstag@rus.uni-stuttgart.de> wrote:
>
> Ehhh... I started Python programming some weeks ago and I know nearly
> nothing about Windows. I am a UNIX and VMS guy :-)
You should feel right at home, then. The Windows NT kernel was
designed and implemented by a team of former DEC engineers led by
David Cutler, who was one of the principal architects of VMS. There's
an old joke that W[indows] NT is VMS + 1. Actually, you'd probably
only notice a slight resemblance if you were coding a driver [1].
Microsoft discourages using the native NT API in user mode.
Windows client DLLs such as kernel32.dll usually implement an API
function in one of three ways, or in combination:
  * using the native runtime library and loader functions
    (Rtl* & Ldr* in ntdll.dll)
  * calling system services such as
        Nt* public APIs (ntdll.dll => ntoskrnl.exe)
        NtUser* & NtGdi* private APIs
            (user32.dll, gdi32.dll => win32k.sys)
  * using a local procedure call (via ALPC or a driver) to a
    subsystem process such as
        csrss.exe    - Windows client/server runtime
        conhost.exe  - console host
        services.exe - service control manager
        lsass.exe    - local security authority
        smss.exe     - session manager
But this is all an implementation detail. The API could be implemented
in a totally different way in a totally different environment, such as
running WINE on Linux.
[1]: http://windowsitpro.com/windows-client/windows-nt-and-vms-rest-story
| From | smap <askme.first@thankyouverymuch.invalid> |
|---|---|
| Date | 2015-12-18 21:15 +0000 |
| Message-ID | <qH_cy.70540$qj6.50232@fx44.am4> |
| In reply to | #100474 |
On Tue, 15 Dec 2015 17:04:55 +0000, Ulli Horlacher wrote: > I am a UNIX If I were you, and I had a choice, I would stay with it. Windoze is a bloody joke. A troll designed it and is probably laughing all the way to the bank. I wish there was a way to go back in time and sneakily roll a condom on Mr. Gates senior's Johnson while he was servicing the Mrs. That's how much I really HATE that so-called "OS" :(
| From | bearmingo <bearmingo@gmail.com> |
|---|---|
| Date | 2015-12-17 21:12 -0800 |
| Message-ID | <4797ae88-3341-49a9-b046-de2a31d6ad40@googlegroups.com> |
| In reply to | #100414 |
Usually I put
#!-*-coding=utf-8-*-
at each py file.
It's ok to open file in local system.
| From | Terry Reedy <tjreedy@udel.edu> |
|---|---|
| Date | 2015-12-18 01:37 -0500 |
| Message-ID | <mailman.44.1450420807.30845.python-list@python.org> |
| In reply to | #100578 |
On 12/18/2015 12:12 AM, bearmingo wrote:
> Usually I put
> #!-*-coding=utf-8-*-
> at each py file.
> It's ok to open file in local system.

That declaration only applies to the content of the file, not its name on the filesystem.

--
Terry Jan Reedy
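Editor's sketch illustrating the distinction drawn above, not code from the thread. It assumes a file system that can store non-ASCII names and is written for Python 3; the function name roundtrip_non_ascii_name and the sample name are made up. The file name is built from escape sequences, so it does not depend on any source coding declaration at all. In Python 2.7 the corresponding idea would be to pass unicode objects rather than byte strings as paths.

```python
import io
import os
import shutil
import tempfile


def roundtrip_non_ascii_name(name=u"\xe9\xfc\xc4\xf1.txt"):
    """Create, detect and read back a file whose name is non-ASCII text."""
    folder = tempfile.mkdtemp()
    try:
        path = os.path.join(folder, name)  # the path is text, not bytes
        with io.open(path, "w", encoding="utf-8") as handle:
            handle.write(u"hello")
        with io.open(path, "r", encoding="utf-8") as handle:
            return os.path.exists(path), handle.read()
    finally:
        shutil.rmtree(folder)  # clean up the temporary folder


print(roundtrip_non_ascii_name())
```

The encoding argument in the open calls concerns the file's content only; the name is handled separately by the operating system interface.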