


Re: helping with unicode

Date Mon, 02 Jul 2012 20:14:24 -0500
From Andrew Berg <bahamutzero8825@gmail.com>
To "comp.lang.python" <python-list@python.org>
Subject Re: helping with unicode
Newsgroups comp.lang.python
Message-ID <mailman.1723.1341278107.4697.python-list@python.org> (permalink)



On 7/2/2012 7:49 PM, self.python wrote:
> ----------------------------------------------------------------
> Traceback (most recent call last):
>   File "C:\wrong.py", line 8, in <module>
>     print rf.read().decode('utf-8')
> UnicodeEncodeError: 'cp949' codec can't encode character u'\u1368' in position 5
> 5122: illegal multibyte sequence
> ---------------------------------------------------------------------
> 
> cp949 is the basic codec of sys.stdout and cmd.exe,
> but I have no idea why it doesn't work.
> Printing without decode('utf-8') works fine in IDLE, but in cmd it prints broken characters (the ASCII portion is still fine; the problem is only with the Korean).
Your terminal can't display those characters: its code page (cp949) has no
encoding for them, so print raises UnicodeEncodeError. You could try other
code pages with chcp (a CLI utility that is part of Windows). IDLE is a
GUI, so it does not have to work with code pages.
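
As a rough illustration (Python 3 syntax, not the original poster's script), the sketch below reproduces the failure from the traceback and shows one lossy way around it. The helper name encode_for_console and the sample string are made up for this example; only U+1368 and cp949 come from the traceback above.

```python
# Sketch only: reproduces the cp949 encode failure seen in the traceback
# and shows a lossy workaround. Names here are hypothetical.

def encode_for_console(text, encoding="cp949"):
    """Encode text for a console code page, substituting '?' for any
    character the code page cannot represent."""
    return text.encode(encoding, errors="replace")

# ASCII text, U+1368 (the character named in the traceback), and two
# Hangul syllables that cp949 can represent.
sample = "abc \u1368 \ud55c\uae00"

try:
    # Strict encoding is what fails when printing to a cp949 console.
    sample.encode("cp949")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

safe_bytes = encode_for_console(sample)

print(strict_failed)
print(safe_bytes)  # bytes repr is pure ASCII, so this prints anywhere
```

Replacing characters loses information, so this only helps when readable output matters more than fidelity; switching the console code page is the other route.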

Python 3.3 supports cp65001 (which is the equivalent of UTF-8 for
Windows terminals), but unfortunately, previous versions do not.
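
To see whether a given interpreter knows about cp65001, something like the following should work. The helper codec_available is a name invented for this sketch, and the result for cp65001 depends on the Python version and platform, so no particular output is claimed.

```python
import codecs

def codec_available(name):
    """Return True if this interpreter can look up the named codec."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        # codecs.lookup raises LookupError for unknown encodings.
        return False

# Result varies by Python version and platform.
print(codec_available("cp65001"))
```

If this prints True, running chcp 65001 in the console before starting Python is worth trying.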
-- 
CPython 3.3.0a4 | Windows NT 6.1.7601.17803



Thread

helping with unicode "self.python" <howmuchistoday@gmail.com> - 2012-07-02 17:49 -0700
  Re: helping with unicode Andrew Berg <bahamutzero8825@gmail.com> - 2012-07-02 20:14 -0500
  Re: helping with unicode MRAB <python@mrabarnett.plus.com> - 2012-07-03 02:21 +0100
  Re: helping with unicode Terry Reedy <tjreedy@udel.edu> - 2012-07-02 21:39 -0400
