From: Terry Reedy
Newsgroups: comp.lang.python
Subject: Re: Unicode failure
Date: Sat, 5 Dec 2015 17:03:30 -0500

On 12/5/2015 2:44 PM, Random832 wrote:
> On 2015-12-05, Terry Reedy wrote:
>> On 12/4/2015 10:22 PM, Random832 wrote:
>>> Well, any bar 1200, 1201, 12000, 12001, 65000, 65001, and 54936.
>>
>> Test before you post.
>
> As someone else pointed out, I meant that as a list of codepages
> which support all Unicode codepoints, not a list of codepoints
> not supported by Tk's UCS-2.  Sorry, I assumed everyone knew
> offhand that 65001 was UTF-8

So Microsoft claims, but it is not terribly useful.  Currently, on my
Win 10 system, 'chcp 65001' results in sys.stdout.encoding = 'cp65001',
and

    for cp in 1200, 1201, 12000, 12001, 65000, 65001, 54936:
        print(chr(cp))

runs without the usual exception.  But of the above numbers
misinterpreted as codepoints, only 1200 and 1201 print anything other
than a box with ?, whereas IDLE printed 3 other chars for 3 other
assigned codepoints.  If I change the console font to Lucida Console,
which I use in IDLE, even chr(1200) gives a box.

> and would infer that the rest were for other UTF encodings.

After re-reading, I see how I might have inferred that.  Anyway, the OP
found the solution for his system.

--
Terry Jan Reedy
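
For anyone who wants to repeat the test without depending on a console font, here is a minimal sketch that reads the seven numbers both ways: as Windows codepage identifiers and as codepoints. The helper names (describe, roundtrips) and the dict CODEPAGE_CODECS are made up for illustration. The pairing of codepage numbers with Python codec names is my assumption, based on Microsoft's published code page identifiers, and is not something stated in the thread.

```python
import unicodedata

# Assumed mapping: Windows codepage identifier -> Python codec name.
# (Hypothetical table for illustration, not taken from the thread.)
CODEPAGE_CODECS = {
    1200: "utf-16-le",
    1201: "utf-16-be",
    12000: "utf-32-le",
    12001: "utf-32-be",
    65000: "utf-7",
    65001: "utf-8",
    54936: "gb18030",
}


def describe(cp):
    """Treat cp as a codepoint: return (cp, 'U+XXXX', name or None, in_bmp)."""
    name = unicodedata.name(chr(cp), None)  # None if no character is assigned
    return cp, "U+%04X" % cp, name, cp <= 0xFFFF


def roundtrips(text, encoding):
    """Return True if text survives encode followed by decode with encoding."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except (UnicodeError, LookupError):
        return False


if __name__ == "__main__":
    astral = "\U0001F600"  # a codepoint outside the BMP, beyond UCS-2
    for cp, codec in CODEPAGE_CODECS.items():
        _, label, name, in_bmp = describe(cp)
        # Print names only, so the output does not depend on font or encoding.
        print(cp, label, name or "<unassigned>", "BMP" if in_bmp else "astral",
              "|", codec, "handles non-BMP:", roundtrips(astral, codec))
```

Read as codepoints, all seven numbers fall inside the BMP, so none of them is out of range for UCS-2; whether a glyph appears is then down to the font. Read as codepages, each one names an encoding that can represent a non-BMP character.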