| Started by | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| First post | 2015-11-18 16:45 +0000 |
| Last post | 2015-11-19 09:16 +0100 |
| Articles | 8 — 3 participants |
handling of non-ASCII filenames? Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-11-18 16:45 +0000
Re: handling of non-ASCII filenames? Chris Angelico <rosuav@gmail.com> - 2015-11-19 03:54 +1100
Re: handling of non-ASCII filenames? Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-11-18 18:09 +0000
Re: handling of non-ASCII filenames? Chris Angelico <rosuav@gmail.com> - 2015-11-19 07:29 +1100
Re: handling of non-ASCII filenames? Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-11-18 22:37 +0000
Re: handling of non-ASCII filenames? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-18 18:22 +0100
Re: handling of non-ASCII filenames? Ulli Horlacher <framstag@rus.uni-stuttgart.de> - 2015-11-19 07:54 +0000
Re: handling of non-ASCII filenames? Christian Gollwitzer <auriocus@gmx.de> - 2015-11-19 09:16 +0100
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-11-18 16:45 +0000 |
| Subject | handling of non-ASCII filenames? |
| Message-ID | <n2i9vt$icf$1@news2.informatik.uni-stuttgart.de> |
I have written a program (Python 2.7) which reads a filename via
tkFileDialog.askopenfilename() (was a good hint here, other thread).
This filename may contain non-ASCII characters (German Umlauts).
In this case my program crashes with:
  File "S:\python\fexit.py", line 1177, in url_encode
    u += '%' + c.encode("hex").upper()
  File "C:\Python27\lib\encodings\hex_codec.py", line 24, in hex_encode
    output = binascii.b2a_hex(input)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0: ordinal not in range(128)
This is my encoding function:
    def url_encode(s):
        u = ''
        for c in list(s):
            if match(r'[_=:,;<>()+.\w\-]',c):
                u += c
            else:
                u += '%' + c.encode("hex").upper()
        return u
As I am a Python newbie, I have not quite understood Python's character
encoding scheme :-}
Where can I find a good introduction to this topic?
I would also appreciate a concrete solution for my problem :-)
--
Ullrich Horlacher Server und Virtualisierung
Rechenzentrum IZUS/TIK E-Mail: horlacher@tik.uni-stuttgart.de
Universitaet Stuttgart Tel: ++49-711-68565868
Allmandring 30a Fax: ++49-711-682357
70550 Stuttgart (Germany) WWW: http://www.tik.uni-stuttgart.de/
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-11-19 03:54 +1100 |
| Message-ID | <mailman.417.1447865700.16136.python-list@python.org> |
| In reply to | #98979 |
On Thu, Nov 19, 2015 at 3:45 AM, Ulli Horlacher
<framstag@rus.uni-stuttgart.de> wrote:
> As I am a Python newbie, I have not quite understood Python's character
> encoding scheme :-}
>
> Where can I find a good introduction to this topic?
Here are a couple of articles on the basics of Unicode:
http://www.joelonsoftware.com/articles/Unicode.html
http://nedbatchelder.com/text/unipain.html
If you can use Python 3, your life will be easier. Otherwise, you'll
need to work some of this stuff out manually.
ChrisA
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-11-18 18:09 +0000 |
| Message-ID | <n2iete$js9$1@news2.informatik.uni-stuttgart.de> |
| In reply to | #98981 |
Chris Angelico <rosuav@gmail.com> wrote:
> > As I am a Python newbie, I have not quite understood Python's character
> > encoding scheme :-}
> >
> > Where can I find a good introduction to this topic?
>
> Here are a couple of articles on the basics of Unicode:
>
> http://www.joelonsoftware.com/articles/Unicode.html
I have been doing Unicode programming for over 20 years, so I do have a
basic understanding of it. I am just new to Python.
> http://nedbatchelder.com/text/unipain.html
Thanks. I will read it.
> If you can use Python 3
I cannot use it, because the Python compiler pyinstaller does not work
with it on Windows:
S:\python>pyinstaller.exe --onefile tk.py
Traceback (most recent call last):
  File "C:\Python35\Scripts\pyinstaller-script.py", line 9, in <module>
    load_entry_point('PyInstaller==3.0', 'console_scripts', 'pyinstaller')()
  File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 558, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 2682, in load_entry_point
    return ep.load()
  File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 2355, in load
    return self.resolve()
  File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 2361, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "c:\python35\lib\site-packages\PyInstaller\__main__.py", line 21, in <module>
    import PyInstaller.building.build_main
  File "c:\python35\lib\site-packages\PyInstaller\building\build_main.py", line 31, in <module>
    from ..depend import bindepend
  File "c:\python35\lib\site-packages\PyInstaller\depend\bindepend.py", line 40, in <module>
    from ..utils.win32.winmanifest import RT_MANIFEST
  File "c:\python35\lib\site-packages\PyInstaller\utils\win32\winmanifest.py", line 97, in <module>
    from PyInstaller.utils.win32 import winresource
  File "c:\python35\lib\site-packages\PyInstaller\utils\win32\winresource.py", line 20, in <module>
    import pywintypes
  File "c:\python35\lib\site-packages\win32\lib\pywintypes.py", line 124, in <module>
    __import_pywin32_system_module__("pywintypes", globals())
  File "c:\python35\lib\site-packages\win32\lib\pywintypes.py", line 64, in __import_pywin32_system_module__
    import _win32sysloader
ImportError: DLL load failed: The specified module could not be found.
With python 2.7, pyinstaller runs without problems.
| From | Chris Angelico <rosuav@gmail.com> |
|---|---|
| Date | 2015-11-19 07:29 +1100 |
| Message-ID | <mailman.425.1447878573.16136.python-list@python.org> |
| In reply to | #98992 |
On Thu, Nov 19, 2015 at 5:09 AM, Ulli Horlacher
<framstag@rus.uni-stuttgart.de> wrote:
> Chris Angelico <rosuav@gmail.com> wrote:
>
>> > As I am a Python newbie, I have not quite understood Python's character
>> > encoding scheme :-}
>> >
>> > Where can I find a good introduction to this topic?
>>
>> Here are a couple of articles on the basics of Unicode:
>>
>> http://www.joelonsoftware.com/articles/Unicode.html
>
> I have been doing Unicode programming for over 20 years, so I do have a
> basic understanding of it. I am just new to Python.
Ah, okay. When you said you didn't understand the character encoding
scheme, I thought you meant Unicode itself. In any case, I make no
apology for recommending those two articles to people... at very worst
you already know that stuff :)
>> http://nedbatchelder.com/text/unipain.html
>
> Thanks. I will read it.
That's a great talk, and has the coolness of having zero pictures on
its slides. I love it.
>> If you can use Python 3
>
> I cannot use it, because the Python compiler pyinstaller does not work
> with it on Windows:
>
> S:\python>pyinstaller.exe --onefile tk.py
> Traceback (most recent call last):
>   File "C:\Python35\Scripts\pyinstaller-script.py", line 9, in <module>
>     load_entry_point('PyInstaller==3.0', 'console_scripts', 'pyinstaller')()
>   File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 558, in load_entry_point
>     return get_distribution(dist).load_entry_point(group, name)
>   File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 2682, in load_entry_point
>     return ep.load()
>   File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 2355, in load
>     return self.resolve()
>   File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 2361, in resolve
>     module = __import__(self.module_name, fromlist=['__name__'], level=0)
>   File "c:\python35\lib\site-packages\PyInstaller\__main__.py", line 21, in <module>
>     import PyInstaller.building.build_main
>   File "c:\python35\lib\site-packages\PyInstaller\building\build_main.py", line 31, in <module>
>     from ..depend import bindepend
>   File "c:\python35\lib\site-packages\PyInstaller\depend\bindepend.py", line 40, in <module>
>     from ..utils.win32.winmanifest import RT_MANIFEST
>   File "c:\python35\lib\site-packages\PyInstaller\utils\win32\winmanifest.py", line 97, in <module>
>     from PyInstaller.utils.win32 import winresource
>   File "c:\python35\lib\site-packages\PyInstaller\utils\win32\winresource.py", line 20, in <module>
>     import pywintypes
>   File "c:\python35\lib\site-packages\win32\lib\pywintypes.py", line 124, in <module>
>     __import_pywin32_system_module__("pywintypes", globals())
>   File "c:\python35\lib\site-packages\win32\lib\pywintypes.py", line 64, in __import_pywin32_system_module__
>     import _win32sysloader
> ImportError: DLL load failed: The specified module could not be found.
>
>
> With python 2.7, pyinstaller runs without problems.
Hmm. That's a separate consideration. If you take a step back and ask
the broader question "How can I package my Python 3.5 script into a
.exe file?", I'm pretty sure there is an answer; maybe pyinstaller
isn't the way to do it. (I'm not an expert on exe file production.)
Does your script run happily in 3.5 if it isn't packaged into an exe?
ChrisA
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-11-18 22:37 +0000 |
| Message-ID | <n2iuk7$o1n$1@news2.informatik.uni-stuttgart.de> |
| In reply to | #98997 |
Chris Angelico <rosuav@gmail.com> wrote:
> >> If you can use Python 3
> >
> > I cannot use it, because the Python compiler pyinstaller does not work
> > with it on Windows:
> >
> > S:\python>pyinstaller.exe --onefile tk.py
> > Traceback (most recent call last):
> >   File "C:\Python35\Scripts\pyinstaller-script.py", line 9, in <module>
> >     load_entry_point('PyInstaller==3.0', 'console_scripts', 'pyinstaller')()
> >   File "c:\python35\lib\site-packages\pkg_resources\__init__.py", line 558, in load_entry_point
(...)
> >
> > With python 2.7, pyinstaller runs without problems.
>
> Hmm. That's a separate consideration. If you take a step back and ask
> the broader question "How can I package my Python 3.5 script into a
> .exe file?", I'm pretty sure there is an answer; maybe pyinstaller
> isn't the way to do it. (I'm not an expert on exe file production.)
pyinstaller works without any problems with Python 2.7.10
> Does your script run happily in 3.5 if it isn't packaged into an exe?
Yes.
| From | Christian Gollwitzer <auriocus@gmx.de> |
|---|---|
| Date | 2015-11-18 18:22 +0100 |
| Message-ID | <n2ibvh$5th$1@dont-email.me> |
| In reply to | #98979 |
Am 18.11.15 um 17:45 schrieb Ulli Horlacher:
> This is my encoding function:
>
> def url_encode(s):
>     u = ''
>     for c in list(s):
>         if match(r'[_=:,;<>()+.\w\-]',c):
>             u += c
>         else:
>             u += '%' + c.encode("hex").upper()
>     return u
>
>
The quoting is applied to a UTF8 string. But I think you shouldn't do it
yourself, use a library function:
import urllib
urllib.quote(yourstring.encode('utf8'))
Christian
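A minimal sketch of the library approach suggested above, written so that it should run on both Python 2.7 and Python 3. The helper name `quote_filename` and the constant `EXTRA_SAFE` are hypothetical, chosen only for illustration; `EXTRA_SAFE` lists the punctuation from the character class in the original `url_encode` that `quote` would otherwise escape.

```python
try:
    from urllib.parse import quote   # Python 3
except ImportError:
    from urllib import quote         # Python 2.7

# Assumption: these are the extra characters the original function left alone.
# Letters, digits, "_", "." and "-" are never escaped by quote().
EXTRA_SAFE = "=:,;<>()+"


def quote_filename(name):
    """Percent-encode a text filename via its UTF-8 bytes (hypothetical helper)."""
    # Encoding explicitly to UTF-8 avoids any implicit ASCII conversion.
    return quote(name.encode("utf-8"), safe=EXTRA_SAFE)


print(quote_filename(u"Gr\u00f6\u00dfe (neu).txt"))
```

Depending on the Python version, `quote` may also leave `~` unescaped, so the result is not guaranteed to match a hand-written encoder character for character.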
| From | Ulli Horlacher <framstag@rus.uni-stuttgart.de> |
|---|---|
| Date | 2015-11-19 07:54 +0000 |
| Message-ID | <n2jv7k$14p$2@news2.informatik.uni-stuttgart.de> |
| In reply to | #98986 |
Christian Gollwitzer <auriocus@gmx.de> wrote:
> Am 18.11.15 um 17:45 schrieb Ulli Horlacher:
> > This is my encoding function:
> >
> > def url_encode(s):
> >     u = ''
> >     for c in list(s):
> >         if match(r'[_=:,;<>()+.\w\-]',c):
> >             u += c
> >         else:
> >             u += '%' + c.encode("hex").upper()
> >     return u
> >
> >
>
> The quoting is applied to a UTF8 string.
encode("hex") works only with binary strings?
How do I convert a UTF8 string to binary?
> But I think you shouldn't do it yourself, use a library function:
>
> import urllib
> urllib.quote(yourstring.encode('utf8'))
It does not encode exactly the same way I need it.
Besides this, I want to understand how Python handles strings and
character encoding.
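The traceback at the start of the thread is consistent with how Python 2.7 treats its two string types: the hex codec expects a byte string, and when it is handed a unicode object Python first attempts an implicit ASCII conversion, which fails for u'\xf6'. The sketch below illustrates the explicit route from text to bytes and back. The variable names are arbitrary, and the code is meant to run on both Python 2.7 and Python 3.

```python
import binascii

text = u"sch\u00f6n"          # text: a sequence of Unicode code points
data = text.encode("utf-8")   # bytes: the UTF-8 encoding of that text

print(len(text))              # number of code points
print(len(data))              # number of bytes; the o-umlaut needs two in UTF-8

# hexlify() works on bytes only, which is why the explicit encode comes first.
hex_form = binascii.hexlify(data).decode("ascii").upper()
print(hex_form)

# decode() is the inverse step: bytes back to text.
print(data.decode("utf-8") == text)
```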
| From | Christian Gollwitzer <auriocus@gmx.de> |
|---|---|
| Date | 2015-11-19 09:16 +0100 |
| Message-ID | <n2k0c3$cjn$2@dont-email.me> |
| In reply to | #99036 |
Am 19.11.15 um 08:54 schrieb Ulli Horlacher:
> Christian Gollwitzer <auriocus@gmx.de> wrote:
>> Am 18.11.15 um 17:45 schrieb Ulli Horlacher:
>>> This is my encoding function:
>>>
>>> def url_encode(s):
>>>     u = ''
>>>     for c in list(s):
>>>         if match(r'[_=:,;<>()+.\w\-]',c):
>>>             u += c
>>>         else:
>>>             u += '%' + c.encode("hex").upper()
>>>     return u
>>>
>>>
>>
>> The quoting is applied to a UTF8 string.
>
> encode("hex") works only with binary strings?
> How do I convert a UTF8 string to binary?
It's right in the other line I showed you:
Apfelkiste:Tests chris$ python
Python 2.7.2 (default, Oct 11 2012, 20:14:37)
[GCC 4.2.1 Compatible Apple Clang 4.0 (tags/Apple/clang-418.0.60)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> u"blablüa".encode('utf8')
'blabl\xc3\xbca'
>>>
Christian
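Putting the pieces together, one possible repair of `url_encode` is sketched below. It is an assumption about the intended behaviour, not code posted in the thread: the text is encoded as UTF-8 first and each byte is then escaped separately, so a two-byte character such as ö becomes %C3%B6. The `\w` of the original pattern is replaced by an explicit ASCII range, and the name `_SAFE` and the example filename are made up for illustration.

```python
import re

# Same safe set as the original function, with \w spelled out as ASCII only.
_SAFE = re.compile(r'[A-Za-z0-9_=:,;<>()+.\-]')


def url_encode(s):
    """Percent-encode s, escaping every byte outside the safe set."""
    if not isinstance(s, bytes):
        s = s.encode('utf-8')      # text -> UTF-8 bytes, done explicitly
    out = []
    for b in bytearray(s):         # bytearray yields ints on Python 2 and 3
        ch = chr(b)
        if b < 128 and _SAFE.match(ch):
            out.append(ch)         # safe ASCII character, keep as is
        else:
            out.append('%%%02X' % b)   # one %XX escape per byte
    return ''.join(out)


print(url_encode(u"B\u00fccher & Hefte.txt"))
```

Escaping byte by byte matters here: hex-encoding a whole multi-byte sequence at once would yield a single `%` followed by four hex digits, which is not a valid percent-escape.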