From: Chris Angelico
To: python-list@python.org
Date: Tue, 19 Aug 2014 09:44:43 +1000
Subject: Python 2.7 IDLE Win32 interactive, pasted characters i- wrong encoding
Newsgroups: comp.lang.python

Python 3 works fine, at least for BMP characters:

Python 3.4.0 (v3.4.0:04f714765c13, Mar 16 2014, 19:24:06) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> u"U+20AC is € is 0x80 in CP-1252"
'U+20AC is € is 0x80 in CP-1252'
>>> ascii(_)
"'U+20AC is \\u20ac is 0x80 in CP-1252'"

Python 2 doesn't:

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> # -*- coding: utf-8 -*-
>>> u"U+20AC is € is 0x80 in CP-1252"
u'U+20AC is \x80 is 0x80 in CP-1252'

The pasted-in character is encoded as CP-1252 instead of being kept as a Unicode character in the literal. Beginning the session with the coding cookie doesn't make any difference; nor does the Options|Configure IDLE, General tab, Default Source Encoding, which I have set to UTF-8.

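The byte-level relationship behind that output can be checked outside IDLE. What follows is only a sketch: the names EURO, mangled and repaired are made up for illustration, and the latin-1 step is an assumption about how the lone CP-1252 byte ends up as the code point U+0080, not a confirmed description of what IDLE does internally.

```python
# -*- coding: utf-8 -*-
# Sketch only: reproduces the byte values seen in the sessions above.
# EURO, mangled and repaired are illustrative names, not part of IDLE.

# Writing the character as an escape keeps the literal independent of
# whatever encoding the shell applies to pasted text.
EURO = u"\u20ac"

utf8_bytes = EURO.encode("utf-8")      # three bytes: E2 82 AC
cp1252_bytes = EURO.encode("cp1252")   # one byte: 80

# Assumption: the Python 2.7 shell result looks as if the CP-1252 byte
# had been promoted directly to a code point (a latin-1 style decode).
mangled = cp1252_bytes.decode("latin-1")   # u'\x80', as in the 2.7 output

# Under that assumption the original character can be recovered by
# reversing the promotion and decoding as CP-1252.
repaired = mangled.encode("latin-1").decode("cp1252")

print(repr(utf8_bytes), repr(cp1252_bytes), repr(mangled), repaired == EURO)
```

Typing u"\u20ac" in place of the pasted character sidesteps the problem in either version, though it does not answer why the paste itself is mangled.
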
My suspicion is that both the coding cookie and the Default Source Encoding option work for editing files, but not for interactive execution.

Poking around led me to this:
http://bugs.python.org/issue4454
which pointed me to
http://bugs.python.org/issue4008
but (a) that claims to have been fixed in Jan 2009 (I first noticed this issue in 2.7.4, dated 2013, and then updated to 2.7.8 in case it had been fixed), and (b) it seems to be talking about the editor, not the interactive interpreter.

How do I get IDLE to accept Unicode in literals?

ChrisA