Re: Encoding conundrum

Date Tue, 20 Nov 2012 17:46:55 -0500
From Dave Angel <d@davea.name>
To Daniel Klein <danielkleinad@gmail.com>
Subject Re: Encoding conundrum
Newsgroups comp.lang.python

On 11/20/2012 04:49 PM, Daniel Klein wrote:
> With the assistance of this group I am understanding unicode encoding
> issues much better; especially when handling special characters that are
> outside of the ASCII range. I've got my application working perfectly now
> :-)
>
> However, I am still confused as to why I can only use one specific encoding.

Who says you can only use one?  You need to use the right encoding for
the device or file you're talking with, and if different devices want
different encodings, then you must use multiple ones.  Only one can be
the default, however, and that's where some problems come about.
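
As a minimal sketch of "the right encoding for each file": the same text can be written to several files, each with its own explicitly named encoding, without relying on the default at all. The sample text, file names, and helper function here are made up for illustration.

```python
import os
import tempfile

TEXT = "f\u00fcr"  # "für": one character outside the ASCII range


def write_and_read_back(text, encodings):
    """Write text once per encoding and return the raw bytes of each file."""
    raw = {}
    with tempfile.TemporaryDirectory() as tmp:
        for enc in encodings:
            path = os.path.join(tmp, "sample_" + enc + ".txt")
            # Name the encoding explicitly instead of trusting the default.
            with open(path, "w", encoding=enc) as f:
                f.write(text)
            with open(path, "rb") as f:
                raw[enc] = f.read()
    return raw


raw = write_and_read_back(TEXT, ["utf-8", "cp1252", "cp437"])
for enc, data in raw.items():
    # The bytes on disk differ, yet each decodes back to the same text.
    print(enc, data, data.decode(enc) == TEXT)
```

The default encoding only matters when no encoding is given, which is why it is the usual source of surprises.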

>
> I've done some research and it appears that I should be able to use any of
> the following codecs with codepoints '\xfc' (chr(252)) '\xfd' (chr(253))
> and '\xfe' (chr(254)) :
>
> ISO-8859-1   [ note that I'm using this codec on my Linux box ]
> cp1252
> cp437
> latin1
> utf-8
>
> If I'm not mistaken, all of these codecs can handle the complete 8bit
> character set.

What 8-bit character set?  There is no single one, so the statement
doesn't mean anything as written.  If you mean all of them can convert an
8-bit byte to SOME Unicode character, then fine (although utf-8 and
cp1252 don't even accept every byte value).  But they won't convert each
such byte to the SAME Unicode character, or they'd be the same encoding.
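
A quick sketch of that point, decoding the single byte 0xFC with each codec from the list above. The helper name is made up for illustration; it returns None where the codec rejects the byte.

```python
def decode_byte(byte_value, encoding):
    """Decode one byte with the given codec, or return None if it is invalid."""
    try:
        return bytes([byte_value]).decode(encoding)
    except UnicodeDecodeError:
        return None


CODECS = ["iso-8859-1", "latin-1", "cp1252", "cp437", "utf-8"]
results = {enc: decode_byte(0xFC, enc) for enc in CODECS}

for enc, ch in results.items():
    if ch is None:
        print(enc, "cannot decode byte 0xFC on its own")
    else:
        print(enc, "U+%04X" % ord(ch))
```

ISO-8859-1 and latin-1 are two names for the same codec, so they necessarily agree; cp437 maps the byte to a different character, and utf-8 treats a lone 0xFC as invalid.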


> However, on Windows 7, I am only able to use 'cp437' to display (print)
> data with those characters in Python. If I use any other encoding, Windows
> laughs at me with this error message:
>
>   File "C:\Python33\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\xfd' in
> position 3: character maps to <undefined>
>
> Furthermore I get this from IDLE:
>
> >>> import locale
> >>> locale.getdefaultlocale()
> ('en_US', 'cp1252')
>
> I also get 'cp1252' when running the same script from a Windows command
> prompt.
>
> So there is a contradiction between the error message and the default
> encoding.
>
> Why am I restricted from using just that one codec? Is this a Windows or
> Python restriction? Please enlighten me.
>
>
>
I don't know much about Windows quirks anymore.  I haven't had to use it
much for years.
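
For what it's worth, here is a small sketch for checking which encoding print() is actually using and which of the three characters each codec can encode. It is only a diagnostic, not a confirmed explanation of the Windows behaviour; the helper name is made up, and the assumption is that the stream's encoding (not the locale's) is what matters for print().

```python
import locale
import sys


def can_encode(text, encoding):
    """Return True if every character in text exists in the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


SAMPLE = "\xfc\xfd\xfe"  # the three characters from the question

# These two values can legitimately differ from each other.
stream_encoding = getattr(sys.stdout, "encoding", None) or "unknown"
print("stdout encoding:  ", stream_encoding)
print("locale preference:", locale.getpreferredencoding(False))

for enc in ("cp437", "cp1252", "latin-1", "utf-8"):
    status = {"U+%04X" % ord(ch): can_encode(ch, enc) for ch in SAMPLE}
    print(enc, status)
```

The traceback quoted above comes from cp437.py, which suggests the output stream was encoding with cp437 even though the locale reported cp1252; cp437 has no '\xfd', so the encode fails.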

-- 

DaveA
