From: Chris Angelico
Newsgroups: comp.lang.python
Subject: Re: Python 2.7 IDLE Win32 interactive, pasted characters i- wrong encoding
Date: Tue, 19 Aug 2014 19:23:10 +1000

On Tue, Aug 19, 2014 at 7:03 PM, Terry Reedy wrote:
> On 8/18/2014 7:44 PM, Chris Angelico wrote:
>> Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit
>> (Intel)] on win32
>> Type "copyright", "credits" or "license()" for more information.
>>>>>
>>>>> # -*- coding: utf-8 -*-
>
>
> I don't think this has any lasting effect in interactive mode. Each
> statement is compiled and executed separately. In Idle, this is done with
> exec().

I didn't think it would, but wanted to cover all obvious possibilities
before saying anything.

>>>>> u"U+20AC is € is 0x80 in CP-1252"
>>
>> u'U+20AC is \x80 is 0x80 in CP-1252'
>
>
> Better than what I get on my 3.4.1 Win7
>
> U+20AC is € is 0x80 in CP-1252

How's \x80 better than an actual euro sign? (As ascii() shows, it's
coming through as \u20ac, which is correct.)

> The problem is python exec, not Idle. Use the editor and submit coding and
> code together.

There are two parts to the problem. Idle is interpreting my pasted
text as CP-1252, but then the exec under the covers is defaulting to
something else (Latin-1?). If exec were told to use 1252, then I think
this would probably work (maybe?) - it'd be limited to a subset of
Unicode characters, but it'd function. In theory, the encoding used
between Idle's front-end and the back-end executor should be an
implementation detail, nothing more - as long as they agree, all's
well.

I like interactive mode, it's a great way to test stuff quickly. And
obviously, I do most of my work in 3.4, not 2.7, so this problem
doesn't hit me. But when I need to test Py2 for something, I'd rather
work interactively and iteratively than with the editor mode.

ChrisA
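
For anyone who wants to see the two-part mismatch described above in isolation, here is a minimal sketch. It does not touch Idle at all; it only assumes that one side encodes the euro sign as CP-1252 and the other side decodes the result as Latin-1 (the Latin-1 part is the guess made above, not something confirmed). The variable names are made up for illustration.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function  # lets the sketch run on 2.7 and 3.x

euro = u"\u20ac"  # EURO SIGN

# Assumption: the front-end hands the pasted text over as CP-1252 bytes.
wire_bytes = euro.encode("cp1252")

# Assumption: the back-end decodes those bytes as Latin-1 instead.
mismatched = wire_bytes.decode("latin-1")

# If both ends agreed on CP-1252, the character would survive the trip.
matched = wire_bytes.decode("cp1252")

print(repr(wire_bytes))
print(repr(mismatched))
print(repr(matched))

# Agreeing on CP-1252 would still limit input to a subset of Unicode:
# GREEK CAPITAL LETTER DELTA has no CP-1252 byte, so encoding it fails.
try:
    u"\u0394".encode("cp1252")
except UnicodeEncodeError as exc:
    print("not representable in cp1252:", exc.reason)
```

The mismatched result is U+0080, a C1 control character, which is consistent with the \x80 in the 2.7 transcript quoted above; the matched result is U+20AC again.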