Date: Tue, 24 May 2011 08:42:14 +1000
Subject: Re: Strange behaviour of input() function (Python 3.2)
From: Chris Angelico
To: python-list@python.org
Newsgroups: comp.lang.python

On Tue, May
24, 2011 at 1:44 AM, Aleksander Pietkiewicz wrote:
> Hello,
> I have googled your email address, I hope it is not a problem.
> Thank you for your help!

I figured you would get it from my post, but either way works! My email
address is fairly well known. Sorry for the delay in responding; you
caught me while I was asleep. :) I'm now responding on-list so that
other people can help.

> I agree that can be very specific bug, I suspect it is matter of coding. I'm
> emailing you a *.py file as you asked and screenshot showing script being
> run.

Unfortunately my Windows install doesn't have internationalization
support, which may be an issue here. I ran your 'couting.py' and got
errors back:

Traceback (most recent call last):
  File "foo.py", line 11, in <module>
    n=input("Naci\u015bnij Enter aby zako\u0144czy\u0107...")
  File "C:\python32\lib\encodings\cp437.py", line 12, in encode
    return codecs.charmap_encode(input,errors,encoding_map)
UnicodeEncodeError: 'charmap' codec can't encode character '\u015b' in position 4: character maps to <undefined>

So I'm guessing that codepage 437, which is what my console appears to
be using, simply can't represent the Polish characters in your prompt.
But that shouldn't affect your system.

> As you can see this problem occurs only with 3rd party software (like Komodo
> Edit).
> In, addition when I'm using Komodo or Notepad++ and input() function, Python
> miscount bytes. See attached.
> Once again thank you for your help!
> Kind regards,
> Aleksander Pietkiewicz

My suspicion here is that your editor is saving the file in one
encoding, and Python is expecting another. I recommend you put an
encoding marker at the top of your source file:

# coding=utf-8

See http://www.python.org/dev/peps/pep-0263/ for details. With this in
place (and the editor set to save as UTF-8), you should be able to
guarantee that the bytestream is parsed the same way by editor and
interpreter.

Unfortunately that's all I can offer; I was unable to duplicate the
exact problem you were seeing.
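
To illustrate the kind of mismatch I mean, here is a rough sketch. It
takes the first word of your prompt, encodes it as UTF-8, and then reads
the same bytes back as if they were in a single-byte code page. The
choice of cp1250 is purely my assumption for the example; I don't know
what your editor or console actually uses.

```python
# Illustrative only: cp1250 is an assumed "wrong" encoding, not taken from the report.
prompt = "Naci\u015bnij"               # first word of the prompt, 8 characters

utf8_bytes = prompt.encode("utf-8")    # the non-ASCII letter takes two bytes in UTF-8
misread = utf8_bytes.decode("cp1250")  # same bytes, read as a single-byte code page

# Character count, byte count, and character count after the misread.
print(len(prompt), len(utf8_bytes), len(misread))

# cp437 has no slot for U+015B at all, which matches the traceback above.
try:
    prompt.encode("cp437")
    cp437_ok = True
except UnicodeEncodeError:
    cp437_ok = False
print("encodable in cp437:", cp437_ok)
```

If the two sides disagree like this, the misread string comes out one
character longer than the original, which would look a lot like Python
"miscounting".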
The contents of 'couting.py' are simple enough, so I'll paste them here
in case anyone can spot a problem:

s = (input('Enter something : '))
z = input('Enter something : ')
print('Length of the string s is', len(s))
print('Length of the string z is', len(z))
print(s)
print(z)

Point to note: On my Windows XP, the string lengths are one higher than
expected, and they include a \r at the end. Is there any way that this
could trigger a Unicode parse failure?

Chris Angelico
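
P.S. If the trailing \r is what throws the lengths off, a stopgap would
be to strip it before counting. This is only a sketch: clean_line is a
name I made up for the example, and it works around the symptom rather
than explaining it.

```python
def clean_line(raw):
    """Drop any trailing carriage return / newline left on a line of input."""
    return raw.rstrip("\r\n")


# Simulated console input with a stray carriage return, as seen on my XP box.
simulated = "hello\r"
print(len(simulated), len(clean_line(simulated)))

# In the real script this would be used as (hypothetical usage):
#   s = clean_line(input('Enter something : '))
```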