From: Marko Rauhamaa
Newsgroups: comp.lang.python
Subject: Re: Python 3 is killing Python
Date: Tue, 15 Jul 2014 23:01:25 +0300
Message-ID: <87iomy4ciy.fsf@elektro.pacujo.net>

Steven D'Aprano:

> Unicode strings in Python 2 are second class entities.

I don't see that. They form a type just like, say, complex.

> It's not just that people will, in general, take the lazy way and
> write "foo" instead of u"foo" for their strings.

People live with their choices, and I don't see the consequences of that
lazy way as very bad. In fact, I find the lazy use of Unicode strings at
least as scary as the lazy use of byte strings, especially since Python 3
sneaks Unicode to the outer interfaces of the program (files, IPC).

> But it is that the whole Python virtual machine is based on
> byte-strings, not Unicode strings, and u"" strings are bolted on top.

The internal implementation of the VM is free to change as long as the
external semantics stay the same.
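To make the point about files concrete, here is a minimal Python 3 sketch of
what "Unicode at the outer interfaces" looks like. The helper name
read_both_ways and the sample strings are made up for illustration; the only
assumption is that the file is read back as UTF-8.

```python
import os
import tempfile


def read_both_ways(data):
    """Write bytes to a temporary file, then read them back twice:
    once in binary mode (bytes) and once in text mode (str, UTF-8)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:
            as_bytes = f.read()
        # newline="" disables newline translation so the text round-trips.
        with open(path, "r", encoding="utf-8", newline="") as f:
            as_text = f.read()
    finally:
        os.remove(path)
    return as_bytes, as_text


# Hypothetical sample: Finnish words containing ä and ö, stored as UTF-8.
raw = "päivää, öljy\n".encode("utf-8")
as_bytes, as_text = read_both_ways(raw)
print(type(as_bytes).__name__, len(as_bytes))
print(type(as_text).__name__, len(as_text))

# The same ä stored as Latin-1 is not valid UTF-8, so text mode fails
# at the file boundary while binary mode would have read it unchanged.
try:
    read_both_ways("ä\n".encode("latin-1"))
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print("text-mode read of Latin-1 data failed:", decode_failed)
```

Binary mode hands back exactly what is on disk; text mode commits to an
encoding at the boundary, which is where a wrong guess shows up.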

> [steve@ando ~]$ python3.3 -c "π = 3.14; print(π+1)"
> 4.140000000000001
> [steve@ando ~]$ python2.7 -c "π = 3.14; print(π+1)"
>   File "<string>", line 1
>     π = 3.14; print(π+1)
>     ^
> SyntaxError: invalid syntax

My native language uses ä and ö, but I don't see any pressing need to
embed those characters in identifiers.

> Python 2 "helpfully" tries to guess what you want when you work with
> bytes-pretending-to-be-strings, and when it guesses right, it's nice, but
> when it guesses wrongly, you'll left with mysterious encoding and
> decoding errors from code that don't appear to involve either. The whole
> thing is a mess.

I can't think of a matching example.


Marko